Keerthi
October 31, 2018
End-to-end automated data science process using Airflow.
Transcript
End-to-end automated data science process using Airflow
Evive
About Evive
• Data-driven benefit navigator
• Founded in 2006
• 400+ employees
Evive Data
• 15 data team members
• 400 Evive employees
• 2.5M total active members
Data Usage
• 500+ GB total data per day
• 50+ data channels
• 30+ models running daily
Why Airflow
The workflow:
• Ingestion
• Merge data from multiple sources
• Standardise
• Verify
• Publish
Airflow Architecture
[Diagram: data sources, scheduler, Airflow workers, database]
Functionalities
• Scheduling
• Dependency management
• Error recovery
• Monitoring
• Versioning
• Mailing and alerting
Creating a DAG and an operator
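The slide's own code is not in the transcript, so the following is only a plain-Python model of the two ideas, not Airflow's API. It assumes an operator is a named unit of work and a DAG is a set of operators plus dependency edges, run in topological order. The `Operator` and `Dag` classes and the `make_step` helper are hypothetical; the `>>` syntax mirrors the way Airflow lets you chain tasks, and the five task names are taken from the workflow slide.

```python
from graphlib import TopologicalSorter


class Operator:
    """Hypothetical stand-in for an Airflow operator: one named unit of work."""

    def __init__(self, task_id, fn):
        self.task_id = task_id
        self.fn = fn
        self.upstream = set()  # task_ids that must finish first

    def __rshift__(self, other):
        # a >> b means "b runs after a", mirroring Airflow's dependency syntax.
        other.upstream.add(self.task_id)
        return other

    def execute(self, context):
        return self.fn(context)


class Dag:
    """Hypothetical stand-in for a DAG: a set of operators plus their edges."""

    def __init__(self, dag_id, operators):
        self.dag_id = dag_id
        self.operators = {op.task_id: op for op in operators}

    def run(self):
        graph = {tid: op.upstream for tid, op in self.operators.items()}
        context = {"log": []}
        # static_order() yields each task only after all of its upstream tasks.
        for task_id in TopologicalSorter(graph).static_order():
            self.operators[task_id].execute(context)
        return context["log"]


def make_step(name):
    # Each step only records its name; a real step would do the actual work.
    return Operator(name, lambda context: context["log"].append(name))


ingest, merge, standardise, verify, publish = (
    make_step(n) for n in ["ingest", "merge", "standardise", "verify", "publish"]
)
ingest >> merge >> standardise >> verify >> publish

# Operators are registered out of order on purpose; the edges decide the order.
dag = Dag("data_pipeline", [publish, verify, standardise, merge, ingest])
order = dag.run()
```

Registering the operators in reverse shows that execution order comes from the declared dependencies rather than from the order in which tasks are listed.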
Scheduling tasks
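The transcript gives only the slide title here. As a rough illustration of interval-based scheduling, the sketch below lists the run times that fall due between a start date and the current time. The `due_runs` function, the daily interval, and the dates are all assumptions for the example; this is not how Airflow's scheduler is implemented.

```python
from datetime import datetime, timedelta


def due_runs(start_date, interval, now):
    """Return every scheduled run time from start_date up to and including now."""
    runs = []
    current = start_date
    while current <= now:
        runs.append(current)
        current += interval
    return runs


runs = due_runs(
    start_date=datetime(2018, 10, 29),
    interval=timedelta(days=1),
    now=datetime(2018, 10, 31, 12, 0),
)
for run in runs:
    print(run.isoformat())
```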
File sensor
• Operator that watches a particular directory and triggers the downstream task once the file lands in that directory.
• Pynotify as operator.
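The sensor behaviour described above can be sketched with plain polling: check the directory at a fixed interval, give up after a timeout, and run the downstream task only once the file exists. The names `wait_for_file` and `downstream_task`, the file name `claims.csv`, and the interval and timeout values are assumptions for illustration. An event-based watcher, as the second bullet suggests, would replace the polling loop.

```python
import os
import tempfile
import time


def wait_for_file(directory, filename, poke_interval=0.1, timeout=2.0):
    """Poll `directory` until `filename` exists; return True, or False on timeout."""
    path = os.path.join(directory, filename)
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poke_interval)


def downstream_task(path):
    # Placeholder for the task that should run once the file has landed.
    with open(path) as handle:
        return handle.read()


with tempfile.TemporaryDirectory() as landing_dir:
    # Nothing has landed yet, so the sensor times out.
    missing = wait_for_file(landing_dir, "claims.csv", timeout=0.3)

    # Simulate the upstream system dropping the file.
    with open(os.path.join(landing_dir, "claims.csv"), "w") as handle:
        handle.write("id,amount\n1,10\n")

    result = None
    if wait_for_file(landing_dir, "claims.csv"):
        result = downstream_task(os.path.join(landing_dir, "claims.csv"))
```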
Monitoring using the Airflow dashboard
Versioning
• Versioning is easy to incorporate in Airflow because the entire DAG execution happens as one instance.
• You can version your data as well as model outputs.
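One way to read the versioning point is that every output of a run can be stored under an identifier for that run, so data and model outputs from different runs never overwrite each other. The sketch below assumes the run is identified by its date. The directory layout, the helper names `versioned_path` and `save_output`, and the payload values are made up for the example.

```python
import json
import os
import tempfile
from datetime import datetime


def versioned_path(base_dir, dag_id, run_date, name):
    """Build an output path that is unique to one DAG run."""
    run_id = run_date.strftime("%Y%m%dT%H%M%S")
    return os.path.join(base_dir, dag_id, run_id, name)


def save_output(base_dir, dag_id, run_date, name, payload):
    """Write payload as JSON under the run's own directory and return the path."""
    path = versioned_path(base_dir, dag_id, run_date, name)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as handle:
        json.dump(payload, handle)
    return path


with tempfile.TemporaryDirectory() as base:
    # Two runs of the same (hypothetical) DAG write the same file name.
    first = save_output(base, "scoring", datetime(2018, 10, 30), "scores.json", {"rows": 3})
    second = save_output(base, "scoring", datetime(2018, 10, 31), "scores.json", {"rows": 5})
    versions = sorted(os.listdir(os.path.join(base, "scoring")))
```

Because the run identifier is part of the path, an earlier run's output stays available for comparison or rollback.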
Mailing and alerting system
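The transcript does not show how alerts are configured. A minimal sketch of the idea, under the assumption that a failed task should produce an email to the data team, is to wrap the task call and compose a message when it raises. The function names, the subject format, and the `data-team@example.com` address are hypothetical, and the message is only queued in a list; actually sending it would need an SMTP server.

```python
from email.message import EmailMessage


def build_failure_alert(dag_id, task_id, error, recipients):
    """Compose (but do not send) an alert email for a failed task."""
    message = EmailMessage()
    message["Subject"] = f"[pipeline] {dag_id}.{task_id} failed"
    message["To"] = ", ".join(recipients)
    message.set_content(f"Task {task_id} in DAG {dag_id} failed with: {error!r}")
    return message


def run_with_alert(dag_id, task_id, fn, recipients, outbox):
    """Run fn; on any exception, queue an alert in outbox and re-raise."""
    try:
        return fn()
    except Exception as error:
        outbox.append(build_failure_alert(dag_id, task_id, error, recipients))
        raise


def failing_task():
    # Stand-in for a verification step that rejects its input.
    raise ValueError("schema mismatch")


outbox = []
try:
    run_with_alert("data_pipeline", "verify", failing_task, ["data-team@example.com"], outbox)
except ValueError:
    pass  # The failure still propagates; the alert has already been queued.
```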
Future work
• Integrating with the existing database architecture and ETL pipeline
• Airflow Kubernetes executors