Slide 20
The Experimentation Reporting Framework
We use Airflow to power the Experimentation Reporting
Framework (ERF).
In 2014, the ERF was a large Ruby script that generated SQL
dynamically from a single YAML file defining metrics, sources,
and experiments, rendering several thousand lines of SQL in a
single script. The Hive job would finish, once in a while.
A large refactoring effort led to a rewrite in Python, split into
several dynamic Airflow DAGs. It is our largest DAG by compute time
and possibly by number of tasks (approximately 8,000 per
DAG run). Airflow is worth having just for the ability to run
our experiments reliably and to debug them when things go wrong.
I would show it to you, but Airflow cannot currently render
the full graph :)
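The slide does not show any ERF code, so the following is only a rough sketch, in plain Python, of how a single definition of metrics, sources, and experiments can fan out into a large task graph. The config layout, the task naming scheme, and the `build_task_graph` helper are all hypothetical; in a real Airflow DAG file each entry would presumably become an operator, with the listed upstream tasks wired as dependencies.

```python
# Hypothetical stand-in for the parsed YAML definition file.
# The keys and names below are illustrative, not the real ERF schema.
CONFIG = {
    "sources": ["bookings", "searches"],
    # metric name -> source table it is computed from
    "metrics": {
        "bookings_per_user": "bookings",
        "searches_per_user": "searches",
    },
    "experiments": ["new_checkout", "map_redesign"],
}


def build_task_graph(config):
    """Expand the config into {task_id: [upstream task_ids]}.

    One load task per source, one compute task per
    (experiment, metric) pair, and one report task per experiment.
    """
    graph = {}

    # Root tasks: load each source once, shared by all experiments.
    for source in config["sources"]:
        graph[f"load__{source}"] = []

    for experiment in config["experiments"]:
        metric_tasks = []
        for metric, source in config["metrics"].items():
            task_id = f"compute__{experiment}__{metric}"
            # Each metric depends only on the source it reads from.
            graph[task_id] = [f"load__{source}"]
            metric_tasks.append(task_id)
        # The report for an experiment waits on all of its metrics.
        graph[f"report__{experiment}"] = metric_tasks

    return graph


graph = build_task_graph(CONFIG)
print(len(graph), "tasks generated")
```

The task count grows as sources + experiments x (metrics + 1), which is how a modest definition file can plausibly reach thousands of tasks per run.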