Print Night Out with Doctr & Radio Tornado @ De Verbroederij - 23 Oct feat. Doctr, Alix Dion
I need to take a look at #Doctr sometime. For my current project I'm using #RapidOCR, which the AI suggested for my #Python project.
In the first stage, I'm using #PaddleOCR
https://github.com/PaddlePaddle/PaddleOCR
Their docs say they support Windows, macOS and Linux. For simplicity, I wrapped the Python dependency in podman/docker, so it's Linux-only for now. If there are potential users other than me, I guess it won't be too hard to make it cross-platform.
https://github.com/Endle/beanbeaver-ocr
Before PaddleOCR, I first tried #docTR
https://github.com/mindee/doctr
Some Reddit posts claimed that docTR was the best. It worked pretty well for English (Latin characters), but it doesn't support Chinese. It would try to recognize a Chinese character as a combination of Latin characters, with relatively high confidence.
PaddleOCR supports Chinese recognition, but I switched it to English-only mode. For the T&T receipt I showed, PaddleOCR assigns very low confidence to the Chinese words (https://github.com/Endle/beanbeaver/blob/master/demo/receipt_groups/tnt_20251202/receipt_20260217_200222_debug.png), so beanbeaver can parse this bilingual receipt using the English parts.
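A minimal sketch of that idea, assuming the OCR step yields (text, confidence) pairs. The `OcrLine` type, the `keep_confident_lines` helper, the 0.8 threshold and the sample values are all made up for illustration; none of them are taken from beanbeaver or from PaddleOCR's API.

```python
from typing import NamedTuple


class OcrLine(NamedTuple):
    """One recognized line of text with its confidence (hypothetical type)."""
    text: str
    confidence: float


def keep_confident_lines(lines, threshold=0.8):
    """Return only the lines whose confidence is at or above the threshold."""
    return [line for line in lines if line.confidence >= threshold]


# Made-up sample values, not real PaddleOCR output.
sample = [
    OcrLine("GREEN ONION", 0.97),
    OcrLine("#%@", 0.21),  # stands in for a misread Chinese line
    OcrLine("TOTAL 12.34", 0.95),
]

kept = keep_confident_lines(sample)
print([line.text for line in kept])
```

The threshold would need tuning on real receipts. A filter like this only helps when the engine actually reports low confidence for text it cannot read, which is the behavior described for PaddleOCR here and not for docTR.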