
Simon Willison Mar 30, 2024

I built a new tool: https://tools.simonwillison.net/ocr - it runs OCR against images and PDFs entirely in your browser (no file upload needed) using Tesseract.js and PDF.js

I wrote more about the tool and how I built it (with copious amounts of Claude 3 Opus and a little bit of ChatGPT) here: https://simonwillison.net/2024/Mar/30/ocr-pdfs-images/

OCR PDFs and images directly in your browser


Kaveinthran (no longer here) Mar 31, 2024

@simon Hi, are you also looking at upgrading the OCR engine, perhaps to VikParuchuri/surya? https://github.com/VikParuchuri/surya

GitHub - VikParuchuri/surya: OCR, layout analysis, reading order, table recognition in 90+ languages



Kaveinthran (no longer here)

@simon You could also add Surya layout: https://x.com/vikparuchuri/status/1772700744673583424?s=46&t=LkACU5SURZ83u1uLZ5iBYw

Vik Paruchuri (@VikParuchuri) on X

Announcing surya layout! It detects tables, images, figures, section headers, and more. It works with any language, and a variety of document types. Find it here - https://t.co/DD2HfwI8jK . Thanks @LambdaAPI for sponsoring compute.
