Mastodawn

Are there any statistics on how many engineering hours have gone into recovering machine-readable data with tools such as #pdfplumber, because digitalization in Germany stops at PDFs of Excel spreadsheets?

#datamining

Kevin Veen-Birkenbach May 31, 2025

🩺 Fix broken #PDF files in seconds? Yes, you can!
Check out pdf-healer:
https://github.com/kevinveenbirkenbach/pdf-healer

I built this tool after running into mysterious "#Ascii85 decode errors" in bank statements and official PDFs – especially when using Python tools like pdfminer, pdfplumber, or #moneymonitor.
With pdf-healer you can detect and batch-fix these corrupted files with a single command. Perfect for anyone who archives, processes, or automates PDFs!
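For illustration only, here is a rough Python sketch of the general idea: collect PDFs from a file or folder and have qpdf write a fresh copy of each. This is not pdf-healer's actual code. The helper names (`find_pdfs`, `qpdf_command`, `repair`) are made up for this example, and it is an assumption that a plain `qpdf infile outfile` rewrite is enough to clear the decode errors in a given file.

```python
import shutil
import subprocess
from pathlib import Path


def find_pdfs(target: Path) -> list[Path]:
    """Return the PDF files under `target` (a single file or a folder)."""
    if target.is_file():
        return [target] if target.suffix.lower() == ".pdf" else []
    return sorted(
        p for p in target.rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def qpdf_command(src: Path, dst: Path) -> list[str]:
    """Build the qpdf call that reads `src` and writes a new copy to `dst`."""
    return ["qpdf", str(src), str(dst)]


def repair(src: Path, dst: Path) -> bool:
    """Rewrite `src` to `dst` with qpdf; True if qpdf produced an output."""
    if shutil.which("qpdf") is None:
        raise RuntimeError("qpdf not found on PATH")
    result = subprocess.run(
        qpdf_command(src, dst), capture_output=True, text=True
    )
    # qpdf exits with 0 on success and 3 on success with warnings;
    # 2 means it hit errors.
    return result.returncode in (0, 3)
```

Writing to a separate `dst` keeps the original untouched, which roughly corresponds to a "copy" mode; overwriting in place would need an extra step and a backup.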

#opensource #python #pdfplumber #pdfminer #qpdf 🛠️📄

GitHub - kevinveenbirkenbach/pdf-healer: A command-line tool for batch-repairing PDF files with Ascii85 decode errors using qpdf. Scans folders or single files, supports preview, overwrite, and copy modes.
