I'm looking to generalise my AIP processing setup and port it to asyncio, so I took a quick look at how a few countries publish their AIP.

DE: single pages as PNG; AIXM data.

NL: page collections as PDF; no AIXM data.

DK: all pages in one PDF; no AIXM data.

So I'll design my server's AIP architecture around the apparent lowest common denominator: one PDF per page, referenced by the page designator (e.g. GEN-1.1) as URI.

Each country gets its own AIP #Python3 module; AIXM support is optional.
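A minimal sketch of how such a per-country interface might look under asyncio. Everything here is an assumption for illustration, not the actual design: the names AIPSource, DummySource and page_uri are hypothetical, and the dummy source serves pages from memory instead of fetching anything.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Dict, Optional
from urllib.parse import quote


class AIPSource(ABC):
    """Hypothetical per-country interface: one PDF per page designator."""

    country = ""  # e.g. "DE", "NL", "DK"

    @abstractmethod
    async def get_page_pdf(self, designator: str) -> bytes:
        """Return the PDF bytes for one page, e.g. 'GEN-1.1'."""

    async def get_aixm(self) -> Optional[bytes]:
        """AIXM support is optional; sources without it return None."""
        return None


class DummySource(AIPSource):
    """In-memory stand-in for a real country module (illustration only)."""

    country = "XX"

    def __init__(self, pages: Dict[str, bytes]) -> None:
        self._pages = pages

    async def get_page_pdf(self, designator: str) -> bytes:
        await asyncio.sleep(0)  # placeholder for real network or disk I/O
        try:
            return self._pages[designator]
        except KeyError:
            raise LookupError(f"no page {designator!r} for {self.country}") from None


def page_uri(country: str, designator: str) -> str:
    """Build a (hypothetical) server URI from country and page designator."""
    return f"/{country.lower()}/{quote(designator)}"
```

A caller would then only deal with designators, for instance `asyncio.run(DummySource({"GEN-1.1": b"%PDF-1.7"}).get_page_pdf("GEN-1.1"))`, regardless of how the country publishes its pages.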

#aviationgeek #aip #aixm

@frank TIL, I thought the publicly available German AIP didn’t have any AIXM.

Looks like they fooled me with their absurd decision to publish as PNG instead of a sensible PDF.

@morre Got you covered: thanks to html.parser and tesseract, I have a collection of OCRed PDF files.

No, I won't release the files due to the DFS disclaimer. But I'll provide tools and a guide later this year.

Sadly we have to geo-reference the visual approach charts ourselves, too. But when done, it's really nice to follow exact traffic circuits using Enroute. 🙂

@frank I do have OCR’d PDFs too, but it’s just annoying they don’t provide them natively.

And even then, some charts are barely usable.

Look at the ones for EDSB, for example: chart number 3 is very detailed, and at the resolution they provide, the text gets super hard to read.

@morre https://aip.dfs.de/basicVFR/print/AD/12FD8E50E4AC59D106A1414EE83A5248/EDSB%20Karlsruhe%20Baden-Baden%203

Imo at 2484x3276 pixels (printable version) it's readable.

Granted, not as good as the layered Dutch PDFs.

And the DFS decided not just to upload image files, they "hide" (?) them as data URIs in the HTML. 🙄

@frank maybe it’s my printer 😆

I didn’t look into the details so far, just found https://github.com/hamarituc/dfs-aip and that works quite well.

@morre Ah, that software was written before they released the free VFR version.

All in all there are three page image versions:

1) preview (on a folder page, linking to the document)

2) medium (on a document page)

3) large (when clicking print on a document page)

The page URIs (permalink, print link) are given in a script section of each document page. The images are embedded as data URIs.
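A rough sketch of pulling embedded images out of a document page with html.parser. It assumes the data URIs sit in the src attribute of img tags and are base64-encoded, which I haven't confirmed here for every page type; DataURIImages and extract_images are hypothetical names, not taken from any released tool.

```python
import base64
from html.parser import HTMLParser
from typing import List, Tuple


class DataURIImages(HTMLParser):
    """Collect base64 data-URI images found in <img src="..."> attributes."""

    def __init__(self) -> None:
        super().__init__()
        self.images: List[Tuple[str, bytes]] = []  # (MIME type, decoded bytes)

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        if not src.startswith("data:"):
            return  # ordinary URL, not an embedded image
        header, _, payload = src[len("data:"):].partition(",")
        if not header.endswith(";base64"):
            return  # assumption: only base64 payloads are of interest
        mime = header[: -len(";base64")]
        self.images.append((mime, base64.b64decode(payload)))


def extract_images(html: str) -> List[Tuple[str, bytes]]:
    """Return all embedded images of an HTML page as (mime, bytes) tuples."""
    parser = DataURIImages()
    parser.feed(html)
    parser.close()
    return parser.images
```

Writing each returned byte string to a .png file would give you the page image that can then be handed to an OCR tool.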

tesseract transcribes the images. Add a bit of grep, and you get your own German AIP search engine.
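Roughly what "tesseract plus grep" could look like in Python. This is a sketch under assumptions: the file layout (one .txt transcript per page in a folder) and both function names are made up for illustration, and the tesseract call relies on the standard command-line form `tesseract image outputbase -l deu pdf txt`, which writes a searchable PDF and a text transcript.

```python
import re
from pathlib import Path
from typing import List, Tuple


def tesseract_command(image, outbase, lang: str = "deu") -> List[str]:
    """Build the tesseract call that writes outbase.pdf and outbase.txt.

    Pass the result to subprocess.run(..., check=True) to actually run it.
    """
    return ["tesseract", str(image), str(outbase), "-l", lang, "pdf", "txt"]


def search_transcripts(folder, pattern: str) -> List[Tuple[str, int, str]]:
    """Case-insensitive grep over all *.txt transcripts in a folder.

    Returns (file name, line number, stripped line) for every matching line.
    """
    rx = re.compile(pattern, re.IGNORECASE)
    hits = []
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if rx.search(line):
                hits.append((path.name, lineno, line.strip()))
    return hits
```

Since the transcript file names can mirror the page designators, each hit points straight back to the page it came from.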

@frank then I’m looking forward to the tooling, thanks!

@morre Nice one. Curl gets a 404 from the DFS server when requesting HEAD for a print page.

With a different (arbitrary) User-Agent the request succeeds. There is no entity tag for print pages; they seem to be generated on the fly.
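For reference, the same probe sketched with the Python standard library: a HEAD request with an explicitly set User-Agent that reports the status code and the ETag header, if any. The function names and the User-Agent string are made up for illustration, and how the server reacts to any particular agent string is not something this sketch can promise.

```python
import urllib.error
import urllib.request
from typing import Optional, Tuple


def build_head_request(url: str, user_agent: str = "aip-probe-sketch/0.1") -> urllib.request.Request:
    """Prepare a HEAD request carrying a custom (arbitrary) User-Agent."""
    return urllib.request.Request(url, method="HEAD", headers={"User-Agent": user_agent})


def head_status(url: str, user_agent: str = "aip-probe-sketch/0.1",
                timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Send the HEAD request; return (HTTP status, ETag or None)."""
    request = build_head_request(url, user_agent)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.headers.get("ETag")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions but still carry headers
        return err.code, err.headers.get("ETag")
```

A missing ETag (None) means conditional requests via If-None-Match are not an option for that page, so a cache has to rely on something else, such as the AIP amendment cycle.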

Btw, the DFS is running a MS IIS/10.0 server with PHP/8.2.10.

@morre I turned it into a single-file standalone program two days ago.

I've been testing it since then. I guess publishing Angst is kicking in: What if I miss a critical bug?

@frank If you miss a bug, it will probably be found at some point and can be fixed.

No one’s health or safety depends on it, so 🤷

@morre My mental health and ego safety depend on it, though. 😉
Frank Abelbeck:

Initial release of my Python-based tool for German AIP page retrieval. Background: Deutsche Flugsicherung publishes the AIP as per-page HTML resources, but provides the pages as PNG images, embedded as data URIs. This program retrieves the images and optionally transcribes them with tesseract. Download: https://aac.abelbeck.info/releases/getGermanAIP.py Dependencies: Python standard library (batteries included); tesseract if you want PDFs and/or transcripts. #python3 #aviationnerd