Mastodawn

#Docling sounds pretty interesting on their website (https://docling-project.github.io/docling/), but after having played around with it for a bit, I found the JSON/Markdown/HTML results pretty disappointing.

OCR was mediocre to bad, table/heading/list recognition too. It didn't even add line breaks between the lines in the address part of a letter.

But I'm using the defaults. Any suggestions on, like, different models or engines and stuff?

Docling - Docling

scy May 27

Yeah, colors and formatting in CLI tools are usually a good thing, but if your --help looks like this, you probably need to take a step back.

#docling #CLI #Python

Carol Chen May 13

Check out the #Docling Meetup during the week of #RHSummit in #Boston - a great chance for the community to connect with Docling experts and dive into the latest innovations!

Tuesday, May 20, 2025
6:00 to 8:00 PM EDT

Details & RSVP: https://ibm.biz/DoclingCommunity

#opensource #redhat #ibm #genAI

Docling Community Exchange, Tue, May 20, 2025, 6:00 PM | Meetup

Join us for an in-person, interactive evening with refreshments, where you’ll have the opportunity to connect directly with the creators of **Docling** through collaborativ

Meetup

olеg lаvrоvsky May 9

Taking part in the #Docling workshop at the #OpenSource AI conference. This is a project I heard about at #DINAconCH a few months ago, and it seems to have since exploded in popularity on PyPI and GitHub - in part thanks to the #CHopen community ⛹️‍♂️

There are strong overlaps with what I've been doing at #ProxeusApp - my notes from the Docling deep-dive have been posted here: https://log.alets.ch/105/

105 #docling

Notes from a workshop session at the Open Source AI Conference at BFH in Bern on May 9, 2025, organised by CH Open.

define:aletsdat

InstructLab Apr 30

Check out the sessions in the AI track on #RHSummit Community Day!

https://events.experiences.redhat.com/widget/redhat/sum25/SessionCatalog2025?tab.day=20250519&search.communityday=option_1737580301897

We have topics ranging from #Docling to #TrustyAI, inferencing to feature stores, topped with your favourite #InstructLab tools and #Granite models. Register and add the sessions to your schedule!

Red Hat Summit 2025

Red Hat Summit is the premier enterprise open source event for IT professionals

Hacker News Apr 22

Transforming Your PDFs for RAG with Open Source Using Docling, Milvus, and Feast

https://github.com/feast-dev/feast/tree/master/examples/rag-docling

#HackerNews #OpenSource #RAG #PDFs #Docling #Milvus #Feast

feast/examples/rag-docling at master · feast-dev/feast

The Open Source Feature Store for AI/ML. Contribute to feast-dev/feast development by creating an account on GitHub.

GitHub

Markus Eisele Apr 18

Simplify AI data integration with RamaLama and RAG
https://developers.redhat.com/articles/2025/04/03/simplify-ai-data-integration-ramalama-and-rag#
#Docling #Ramalama #podman #aiml

Simplify AI data integration with RamaLama and RAG | Red Hat Developer

Explore how RamaLama makes it easier to share data with AI models using retrieval-augmented generation (RAG), a technique for enhancing large language models

Red Hat Developer

Habr Mar 22

How I Won the RAG Challenge: From Zero to SoTA in a Single Competition

When a beginner tries to build their first question-answering LLM system, they quickly learn that basic RAG is for toddlers and has to be "leveled up" with trendy techniques: Hybrid Search, Parent Document Retrieval, Reranking, and dozens of other obscure terms. Your eyes dart around, choice paralysis sets in, your palms get sweaty. But what if you tried them all? I decided to spend 200+ hours preparing for the competition and test each of these techniques myself. It went so well that I won the competition in every category. Now I'm sharing which techniques turned out to be useful, which did not, and how to reproduce my result.

https://habr.com/ru/articles/893356/

#RAG #Docling #векторный_поиск #retrieval_augmented_generation #question_answering #LLM #FAISS #GPT #ChatGPT #парсинг_PDF

How I Won the RAG Challenge: From Zero to SoTA in a Single Competition

Author: DarkBones. Foreword: In this post I describe the approach that earned me first place in both prize categories and in the overall SotA ranking. What is the RAG Challenge about? You need to build...

Habr

Václav Vančura Mar 19

Seeing the #Docling Actor (that @netmilk and I wrote) listed on the Docling Featured Integrations page is pretty cool! https://docling-project.github.io/docling/integrations/apify/

Apify - Docling

Xavier «X» Santolaria Mar 19

LOVE it 💙

https://www.ibm.com/new/announcements/ibm-adds-open-source-projects-docling-beeaI-and-data-prep-kit-added-to-the-linux-foundation

#ai #tech #opensource #ibm #beeai #docling #dataprepkit #linuxfoundation

IBM contributes key open-source projects to Linux Foundation to advance AI community participation

IBM is contributing 3 open-source projects—Docling, Data Prep Kit and BeeAI—to the Linux Foundation. This move signals not only the potential growth of these projects but also IBM’s ongoing commitment to open-source AI.