I built a new tool: https://tools.simonwillison.net/ocr - it runs OCR against images and PDFs entirely in your browser (no file upload needed) using Tesseract.js and PDF.js
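
For readers curious what "OCR a PDF in the browser" looks like in outline, here is a minimal sketch of the page-by-page loop. It is not the tool's actual code: `ocrPages`, `renderPage` and `recognize` are hypothetical names, and the two callbacks stand in for whatever PDF.js rendering and Tesseract.js recognition calls a real page would wire up.

```typescript
// Hypothetical callback types: the real tool's internals are not shown in this thread.
// renderPage turns a 1-based page number into some image-like value (for example a canvas).
type RenderPage<Image> = (pageNumber: number) => Promise<Image>;
// recognize turns that image-like value into recognized text.
type Recognize<Image> = (image: Image) => Promise<string>;

// Run OCR over every page of a document and return the text per page, in page order.
async function ocrPages<Image>(
  pageCount: number,
  renderPage: RenderPage<Image>,
  recognize: Recognize<Image>,
): Promise<string[]> {
  const texts: string[] = [];
  for (let pageNumber = 1; pageNumber <= pageCount; pageNumber++) {
    // Pages are handled one after another, so a page is fully recognized
    // before the next one is rendered.
    const image = await renderPage(pageNumber);
    texts.push(await recognize(image));
  }
  return texts;
}
```

In a browser, `renderPage` could draw a PDF.js page onto a canvas and `recognize` could hand that canvas to Tesseract.js. Passing both in as parameters is only a choice made for this sketch, so the loop can be read and tested without either library loaded.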

I wrote more about the tool and how I built it (with copious amounts of Claude 3 Opus and a little bit of ChatGPT) here: https://simonwillison.net/2024/Mar/30/ocr-pdfs-images/


@simon Very cool. Though I get a Heroku error when I try to go to your site ("Application error: An error occurred in the application and your page could not be served. If you are the application owner, check your logs for details. You can do this from the Heroku CLI with the command heroku logs --tail")
@aaronjschaffer Huh... it looks like it's the Mastodon effect, where sending out a link causes thousands of Mastodon servers to all hit /.well-known/webfinger?resource=acct:[email protected] at the same time. I've survived these storms just fine in the past, though, so I'm not sure why it's hurting the site today.
@simon Ah gotcha! I love a little suspense, I'll check again later!
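
For context on the WebFinger requests mentioned above, this is a rough sketch of how a lookup URL of that shape is formed under RFC 7033. The function name `webfingerUrl` and the account `user@example.com` are placeholders chosen for illustration, not anything taken from the site or from Mastodon's code.

```typescript
// Build the WebFinger lookup URL (RFC 7033) for an account written as "user@host".
// "user@example.com" in the usage below is a placeholder, not a real account.
function webfingerUrl(account: string): string {
  const at = account.lastIndexOf("@");
  if (at <= 0 || at === account.length - 1) {
    throw new Error(`expected user@host, got: ${account}`);
  }
  // The request goes to the host named in the account itself.
  const host = account.slice(at + 1);
  const url = new URL(`https://${host}/.well-known/webfinger`);
  // The resource parameter carries the account as an "acct:" URI;
  // searchParams percent-encodes it when the URL is serialized.
  url.searchParams.set("resource", `acct:${account}`);
  return url.toString();
}

const lookup = new URL(webfingerUrl("user@example.com"));
console.log(lookup.pathname, lookup.searchParams.get("resource"));
```

Each server that sees a shared post may issue its own request of this kind, which is why one link can turn into thousands of near-simultaneous hits on the same path.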