ParseBench is here! 📊

We’ve just released ParseBench, an open benchmark + dataset for evaluating document parsing at scale.

It includes:
• 2,000+ human-reviewed enterprise documents
• 167,000 evaluation rules
• Coverage across 5 key areas: tables, charts, content faithfulness, semantic formatting, and visual grounding

Clelia Bertelli

What makes it different?
ParseBench optimizes for semantic correctness, not exact text matching.
Explore more:
📖 Blog: https://www.llamaindex.ai/blog/parsebench
💻 Code: https://github.com/run-llama/ParseBench
🤗 Dataset: https://huggingface.co/datasets/llamaindex/ParseBench
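
To make the "semantic correctness, not exact text matching" point concrete, here is a toy sketch. It is not ParseBench's actual rule format or scoring code: the `normalize`, `exact_match`, and `rule_pass_rate` helpers and the sample strings are all hypothetical, and a simple "must contain this text" check stands in for whatever the real evaluation rules look like.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, drop Markdown emphasis markers, and collapse whitespace."""
    text = re.sub(r"[*_`]", "", text)
    return re.sub(r"\s+", " ", text).strip().lower()


def exact_match(expected: str, parsed: str) -> bool:
    """Strict comparison: any formatting difference counts as a failure."""
    return expected == parsed


def rule_pass_rate(rules: list[str], parsed: str) -> float:
    """Fraction of hypothetical 'must contain' rules the parsed output satisfies."""
    haystack = normalize(parsed)
    hits = sum(normalize(rule) in haystack for rule in rules)
    return hits / len(rules)


# Hypothetical ground truth and parser output: same facts, different formatting.
expected = "Total revenue: $4.2M\nNet income: $1.1M"
parsed = "**Total revenue:**  $4.2M\n\nNet income: $1.1M"
rules = ["Total revenue: $4.2M", "Net income: $1.1M"]

print(exact_match(expected, parsed))   # strict string comparison
print(rule_pass_rate(rules, parsed))   # share of rules satisfied
```

In this sketch the parser output fails the strict comparison only because of bold markers and spacing, while every content rule still passes. That is the kind of gap a rule-based score is meant to close.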

ParseBench: The First Document Parsing Benchmark for AI Agents

Introducing ParseBench: 2,000+ human-verified pages and 167K test rules to evaluate document OCR across tables, charts, formatting, and more for AI agents. Open source.