I am re-running the (albeit laughably named) National Electronic Sectional Appendix (NESA) Section D route clearance tables extract.
I now also remember why I run it so infrequently. It scrapes text from PDF files and dumps it into TSV and XLSX, and it's a bin-fire from top to bottom.
Part of the problem is that it is some of the ugliest code I have ever written. Greyscale breaks the PDF text field identification? Use Ghostscript to hack the PDF file.
1/n
