I'm reading the Shopify article about Airflow. I know I'm slow (kids, life).
While I learned a few things, I am still wrapping my head around having 10K DAGs on the same instance: just loading the web UI must take forever.
A few suggestions from my side:
- run Airflow on Kubernetes (use the Helm chart)
- do the data processing outside Airflow
- if you generate thousands of DAGs from the same code, there is probably a better solution
- read DAGs from Git; actually, put everything in Git
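
To make the "thousands of DAGs from the same code" point a bit more concrete, here is a rough sketch of one alternative I have in mind: a single DAG that fans out with dynamic task mapping instead of one generated DAG per tenant. It assumes Airflow 2.4+ (for `schedule=` and `.expand()`). The tenant names, the `daily_export` job and the `build_job_requests` helper are all made up for illustration, and the `submit` task is only a placeholder for a call to whatever external engine does the real processing. It is not what the Shopify article describes, just one way it could look.

```python
from datetime import datetime


def build_job_requests(tenants):
    """Hypothetical helper: one small request dict per tenant.

    No data is processed here; these are just parameters to hand off.
    """
    return [{"tenant": t, "job": "daily_export"} for t in tenants]


try:
    from airflow.decorators import dag, task
except ImportError:
    # Keeps the helper above importable and testable without Airflow installed.
    dag = task = None


if dag is not None:

    @dag(schedule=None, start_date=datetime(2024, 1, 1), catchup=False)
    def process_all_tenants():
        @task
        def list_requests():
            # Assumption: in real life this list would come from a config
            # file or a service, not a hard-coded list.
            return build_job_requests(["acme", "globex", "initech"])

        @task
        def submit(request: dict) -> str:
            # Placeholder: a real task would submit the job to an external
            # system and return its id, keeping heavy work out of Airflow.
            return f"submitted {request['job']} for {request['tenant']}"

        # One mapped task instance per request, all inside a single DAG.
        submit.expand(request=list_requests())

    process_all_tenants()
```

The scheduler and the web UI then deal with one DAG and N mapped task instances rather than N DAGs. Whether that fits depends on how different the schedules and owners really are.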