#Docling sounds pretty interesting on their website (https://docling-project.github.io/docling/), but after having played around with it for a bit, I found the JSON/Markdown/HTML results pretty disappointing.

OCR was mediocre to bad, and so was table/heading/list recognition. It didn't even add line breaks between the lines in the address part of a letter.

But I'm using the defaults. Any suggestions on, like, different models or engines and stuff?

Docling - Docling

Yeah, colors and formatting in CLI tools are usually a good thing, but if your --help looks like this, you probably need to take a step back.

#docling #CLI #Python

Check out the #Docling Meetup during the week of #RHSummit in #Boston - a great chance for the community to connect with Docling experts and dive into the latest innovations!

Tuesday, May 20, 2025
6:00 to 8:00 PM EDT

Details & RSVP: https://ibm.biz/DoclingCommunity

#opensource #redhat #ibm #genAI

Docling Community Exchange, Tue, May 20, 2025, 6:00 PM | Meetup

Join us for an in-person, interactive evening with refreshments, where you’ll have the opportunity to connect directly with the creators of **Docling** through collaborativ

Meetup

Taking part in the #Docling workshop at the #OpenSource AI conference. This is a project I heard about at #DINAconCH a few months ago, and it seems to have since exploded in popularity on PyPI and GitHub - in part thanks to the #CHopen community ⛹️‍♂️

There are strong overlaps with what I've been doing at #ProxeusApp - my notes from the Docling deep-dive have been posted here: https://log.alets.ch/105/

105 #docling

Notes from a workshop session at the Open Source AI Conference at BFH in Bern on May 9, 2025, organised by CH Open.

define:aletsdat

Check out the sessions in the AI track on #RHSummit Community Day!

https://events.experiences.redhat.com/widget/redhat/sum25/SessionCatalog2025?tab.day=20250519&search.communityday=option_1737580301897

We have topics ranging from #Docling to #TrustyAI, inferencing to feature stores, topped with your favourite #InstructLab tools and #Granite models. Register and add the sessions to your schedule!

Red Hat Summit 2025

Red Hat Summit is the premier enterprise open source event for IT professionals

feast/examples/rag-docling at master · feast-dev/feast

The Open Source Feature Store for AI/ML. Contribute to feast-dev/feast development by creating an account on GitHub.

GitHub

Simplify AI data integration with RamaLama and RAG | Red Hat Developer

Explore how RamaLama makes it easier to share data with AI models using retrieval-augmented generation (RAG), a technique for enhancing large language models

Red Hat Developer

How I Won the RAG Challenge: From Zero to SoTA in a Single Competition

When a beginner tries to build their first LLM question-answering system, they quickly learn that basic RAG is for babies and needs to be "leveled up" with trendy techniques: Hybrid Search, Parent Document Retrieval, Reranking, and dozens of other baffling terms. Your eyes dart around, choice paralysis sets in, your palms sweat. But what if you tried them all? I decided to spend 200+ hours preparing for the competition and personally test every one of these techniques. It went so well that I won the competition in every category. Now I'm sharing which techniques turned out to be useful, which didn't, and how to reproduce my result.

https://habr.com/ru/articles/893356/

#RAG #Docling #векторный_поиск #retrieval_augmented_generation #question_answering #LLM #FAISS #GPT #ChatGPT #парсинг_PDF

How I Won the RAG Challenge: From Zero to SoTA in a Single Competition

Author: DarkBones. Foreword: In this post I'll describe the approach that earned me first place in both prize categories and in the overall SotA ranking. What is the RAG Challenge about? You need to build...

Habr

Seeing the #Docling Actor (that @netmilk and I wrote) listed on the Docling Featured Integrations page is pretty cool! https://docling-project.github.io/docling/integrations/apify/

Apify - Docling

IBM contributes key open-source projects to Linux Foundation to advance AI community participation

IBM is contributing 3 open-source projects—Docling, Data Prep Kit and BeeAI—to the Linux Foundation. This move signals not only the potential growth of these projects but also IBM’s ongoing commitment to open-source AI.