Baidu Inc. (@Baidu_Inc)

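Introducing Qianfan-OCR: a 4B-parameter end-to-end model for document intelligence. A single model, with no pipeline, handles table extraction, formula recognition, chart understanding, key information extraction, and more in a single pass. A link to the related paper (arXiv) is provided.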

https://x.com/Baidu_Inc/status/2034265136182202765

#qianfanocr #ocr #documentintelligence #arxiv

Executives say AI boosts productivity but the real gain is just 16 minutes per week

https://fed.brid.gy/r/https://nerds.xyz/2026/03/foxit-ai-productivity-study/

RAG pipelines often skip crucial data like voltage limits because they slice documents poorly. A new approach—semantic chunking with layout parsing—keeps context intact, boosting document intelligence on Azure. See how this open‑source tweak can sharpen retrieval‑augmented generation for engineers and researchers. #RAG #SemanticChunking #DocumentIntelligence #AzureAI

🔗 https://aidailypost.com/news/rag-systems-miss-data-like-voltage-limits-semantic-chunking-proposed
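
The idea of keeping context intact can be sketched in a few lines. This is only a minimal illustration, not the method from the linked article: it assumes the layout parser has already produced Markdown, and it splits on headings so that a table (for example one holding a voltage limit) stays in the same chunk as the section title that explains it. The names `Chunk`, `chunk_markdown`, and `SAMPLE` are hypothetical, and the sample document is made up.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Matches ATX-style Markdown headings such as "## Electrical Limits".
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


@dataclass
class Chunk:
    heading_path: list[str]  # titles from the top-level heading down to this section
    text: str                # body text of the section, tables included


def chunk_markdown(markdown: str) -> list[Chunk]:
    """Split Markdown into one chunk per heading-delimited section."""
    chunks: list[Chunk] = []
    path: list[tuple[int, str]] = []  # stack of (level, title)
    body: list[str] = []

    def flush() -> None:
        text = "\n".join(body).strip()
        if text:
            chunks.append(Chunk([title for _, title in path], text))
        body.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Leave any sections at the same or a deeper level.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2)))
        else:
            body.append(line)
    flush()
    return chunks


# Made-up sample document, for illustration only.
SAMPLE = """# Motor Controller
Overview text.

## Electrical Limits
| Parameter | Max |
|---|---|
| Supply voltage | 48 V |

## Mounting
Use M4 screws.
"""

chunks = chunk_markdown(SAMPLE)
for chunk in chunks:
    print(" > ".join(chunk.heading_path), "|", len(chunk.text), "chars")
```

Because each chunk carries its heading path, the path can be prepended to the text before embedding. One known limitation of this sketch: a line starting with `#` inside a fenced code block would be mistaken for a heading.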

From side project to real product! I built a document intelligence app that helps professionals save time looking up technical and regulatory information in large document sets. I'm recruiting beta testers and collaborators to develop new features. Anyone interested?
#AI #TrienKhaiPhanMem #DocumentIntelligence #ThuNghiem #VietTech #SaaS #CTU

https://www.reddit.com/r/SaaS/comments/1pyy0no/from_side_project_to_real_product_looking_for/

Azure Content Understanding is now generally available | Microsoft Foundry Blog

At Microsoft Ignite this year, we’re excited to announce that Azure Content Understanding in Foundry Tools is now generally available (GA). Over the past months, we’ve seen preview usage across industries, from large consultancies to healthcare leaders, with invaluable customer feedback shaping this release. With this GA release, we’re enabling flexibility and control with model […]

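Preprocessing for Azure AI Document Intelligence: removing ruby (furigana) from Word/Excel files - Qiita

Introduction: Azure AI Document Intelligence is a powerful service that extracts structure such as text and tables from documents in machine-readable form. It supports layout extraction (text, tables, checkboxes, etc.) and combines OCR with deep learning to pull out document structure...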

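Points to watch when using the Azure AI Document Intelligence Free plan - Qiita

Introduction: As digital transformation (DX) accelerates, automating document processing has become an important challenge for many companies. In particular, extracting information from unstructured data such as invoices and contracts is key to improving productivity. As one promising solution to this challenge, Micr...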

Using the Azure Document Intelligence layout model to convert PDFs to Markdown and trying semantic chunking for RAG
https://qiita.com/h_id32/items/1821bc62f8bb445b6b6d?utm_campaign=popular_items&utm_medium=feed&utm_source=popular_items

#qiita #Azure #rag #langchain #DocumentIntelligence

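Using the Azure Document Intelligence layout model to convert PDFs to Markdown and trying semantic chunking for RAG - Qiita

Introduction: This article uses the Azure Document Intelligence layout model to convert a PDF to Markdown, and then covers how to implement semantic chunking for RAG…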

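A quick evaluation of GraphRAG using Azure Document Intelligence and Neo4j - Qiita

Introduction / Background: Retrieval-Augmented Generation (RAG) is a technique that retrieves information based on a query and generates an answer from the results. It is a large language model (LLM…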

If anyone works with #azure #DocumentIntelligence #FormRecognizer: this seems like a trivial thing, but how do I reproduce the "Results" tab from #DocumentIntelligenceStudio? Attempting to dump/serialize (Python/C#) the poller.results throws an exception about an unserializable object.
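
One possible approach on the Python side, offered as a sketch and not a confirmed answer: the object returned by `poller.result()` is a model object, not a plain dict, so `json.dumps` needs help. As far as I know, the result exposes `to_dict()` in `azure-ai-formrecognizer` and `as_dict()` in `azure-ai-documentintelligence`; treat both method names as assumptions to check against the installed SDK version. The fallback below tries those methods first and otherwise walks public attributes. `FakePage` and `FakeResult` are hypothetical stand-ins so the example runs without Azure, and the output is not guaranteed to match the Studio "Results" tab exactly.

```python
import json
from datetime import date, datetime
from enum import Enum


def to_jsonable(obj):
    """Fallback for json.dumps(default=...): convert objects it cannot encode."""
    # Assumed SDK conversion methods; verify they exist in your installed version.
    for name in ("as_dict", "to_dict"):
        method = getattr(obj, name, None)
        if callable(method):
            return method()
    if isinstance(obj, Enum):
        return obj.value
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    if hasattr(obj, "__dict__"):
        # Generic fallback: keep public attributes only.
        return {k: v for k, v in vars(obj).items() if not k.startswith("_")}
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


class FakePage:
    """Hypothetical stand-in for a page object inside an analysis result."""

    def __init__(self, page_number):
        self.page_number = page_number
        self.lines = ["Hello", "World"]


class FakeResult:
    """Hypothetical stand-in for the object returned by poller.result()."""

    def __init__(self):
        self.model_id = "prebuilt-layout"
        self.content = "Hello World"
        self.pages = [FakePage(1)]
        self._internal = object()  # private state that should not be dumped


result = FakeResult()  # with the real SDK: result = poller.result()
dumped = json.dumps(result, default=to_jsonable, indent=2)
print(dumped)
```

`json.dumps` calls the `default` hook again for every nested object it cannot encode, so one function covers the whole result tree.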