Keerthi
October 31, 2018
End-to-end automated data science process using Airflow.
Transcript
End-to-end automated data science process using Airflow
Evive
About Evive
• Data-driven benefit navigator
• Founded in 2006
• 400+ employees
Evive Data
• Data team: 15
• Evive employees: 400
• Total active members: 2.5M
Data Usage
• Total data per day: 500+ GB
• Number of data channels: 50+
• Number of models running daily: 30+
Why Airflow
The workflow:
• Ingestion
• Merge data from multiple sources
• Standardise
• Verify
• Publish
Airflow Architecture
[Diagram: data sources, scheduler, Airflow workers, database]
Functionalities
• Scheduling
• Dependency management
• Error recovery
• Monitoring
• Versioning
• Mailing and alerting
Creating a DAG and an operator
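The slide's own code is not in the transcript. As a rough illustration of the idea only, the sketch below models a DAG in plain Python: each task is a callable, and a mapping records which tasks must finish first. It deliberately does not use Airflow's API; the task names mirror the workflow stages from the earlier slide, and the functions, `TASKS`, `UPSTREAM`, and `run_dag` are all hypothetical. In Airflow itself, the equivalent structure is a DAG object whose operators are chained with `>>`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical stage functions standing in for real operators.
def ingest():
    return "ingested"

def merge():
    return "merged"

def standardise():
    return "standardised"

def verify():
    return "verified"

def publish():
    return "published"

TASKS = {
    "ingest": ingest,
    "merge": merge,
    "standardise": standardise,
    "verify": verify,
    "publish": publish,
}

# Each task maps to the set of upstream tasks that must finish before it runs.
UPSTREAM = {
    "ingest": set(),
    "merge": {"ingest"},
    "standardise": {"merge"},
    "verify": {"standardise"},
    "publish": {"verify"},
}

def run_dag(tasks, upstream):
    """Run every task once, upstream tasks first; return the order and results."""
    order = list(TopologicalSorter(upstream).static_order())
    results = {name: tasks[name]() for name in order}
    return order, results

if __name__ == "__main__":
    order, results = run_dag(TASKS, UPSTREAM)
    print(order)
```

Keeping the dependency map separate from the task bodies is the point of the exercise: the scheduler only needs the graph, not the code inside each task.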
Scheduling tasks
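The scheduling slide has no text in the transcript. One detail that often surprises newcomers is that Airflow triggers a run only after the interval it covers has ended. The sketch below reproduces that arithmetic in plain Python; `run_times` and its parameters are hypothetical names, not part of Airflow.

```python
from datetime import datetime, timedelta

def run_times(start_date, interval, now):
    """List (interval_start, trigger_time) for every interval fully elapsed by `now`.

    A run covering [interval_start, interval_start + interval) is triggered
    at the end of that interval, not at its start.
    """
    runs = []
    interval_start = start_date
    while interval_start + interval <= now:
        runs.append((interval_start, interval_start + interval))
        interval_start += interval
    return runs

if __name__ == "__main__":
    for start, trigger in run_times(
        start_date=datetime(2018, 10, 29),
        interval=timedelta(days=1),
        now=datetime(2018, 10, 31, 12, 0),
    ):
        print(f"run for {start:%Y-%m-%d} triggers at {trigger:%Y-%m-%d %H:%M}")
```

With a daily interval starting on 29 October and the clock at midday on 31 October, two runs are due: the ones covering 29 and 30 October. The run covering 31 October has to wait until that day is over.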
File sensor
• Operator that listens to a particular directory and triggers the downstream task once the file lands in the corresponding directory.
• Pynotify as operator.
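The behaviour described above can be sketched in plain Python. This version polls the filesystem at a fixed interval, which is a simpler stand-in for the event-driven notification approach the slide mentions; it is not Airflow's sensor API. The names `wait_for_file`, `run_when_file_lands`, `poke_interval`, and `timeout` are assumptions chosen for illustration.

```python
import os
import time

def wait_for_file(path, poke_interval=0.1, timeout=5.0):
    """Poll until `path` exists; return True if it appeared, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poke_interval)

def run_when_file_lands(path, downstream, poke_interval=0.1, timeout=5.0):
    """Call `downstream(path)` once the file is present; raise if it never arrives."""
    if wait_for_file(path, poke_interval=poke_interval, timeout=timeout):
        return downstream(path)
    raise TimeoutError(f"{path} did not appear within {timeout} seconds")

if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as landing_dir:
        target = os.path.join(landing_dir, "claims.csv")  # hypothetical file name
        with open(target, "w") as handle:
            handle.write("id,amount\n")
        print(run_when_file_lands(target, lambda p: f"processing {os.path.basename(p)}"))
```

Polling is easy to reason about but adds up to one `poke_interval` of latency; an inotify-based watcher reacts immediately at the cost of being Linux-specific.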
Monitoring using the Airflow dashboard
Versioning
• Versioning can be easily incorporated in Airflow, as the entire DAG execution happens as one instance.
• You can version your data as well as model outputs.
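One way to act on this, sketched under assumptions: since every DAG run is a single instance with its own run date, that date can serve as the version key in output paths. The directory layout and the helper `versioned_path` below are hypothetical, not something the talk specifies.

```python
from datetime import datetime
from pathlib import PurePosixPath

def versioned_path(base_dir, dag_id, run_date, filename):
    """Build an output path that is unique per DAG run.

    Assumed layout: <base_dir>/<dag_id>/<run timestamp>/<filename>
    """
    version = run_date.strftime("%Y-%m-%dT%H%M%S")
    return PurePosixPath(base_dir) / dag_id / version / filename

if __name__ == "__main__":
    run_date = datetime(2018, 10, 31)
    # Data and model outputs from the same run share one version directory.
    print(versioned_path("/data/outputs", "daily_models", run_date, "features.parquet"))
    print(versioned_path("/data/outputs", "daily_models", run_date, "scores.csv"))
```

Because the path depends only on the DAG and the run date, re-running the same DAG instance overwrites its own outputs rather than creating a second, conflicting version.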
Mailing and alerting system
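The transcript gives no detail for this slide, so the following is only a plain-Python sketch of how error recovery and alerting typically fit together: retry a task a fixed number of times, and send one alert email if it still fails. It does not use Airflow's API. The helper names, the sender address, and the injected `send` callable (which would wrap an SMTP client in practice) are assumptions.

```python
from email.message import EmailMessage

def build_failure_alert(task_name, error, recipients, sender="airflow@example.com"):
    """Compose (but do not send) an alert email describing a failed task."""
    message = EmailMessage()
    message["Subject"] = f"Task failed: {task_name}"
    message["From"] = sender
    message["To"] = ", ".join(recipients)
    message.set_content(
        f"Task {task_name} failed with {type(error).__name__}: {error}"
    )
    return message

def run_with_retries(task, task_name, retries, recipients, send):
    """Run `task`, retrying up to `retries` extra times; alert once if all attempts fail."""
    last_error = None
    for _ in range(retries + 1):
        try:
            return task()
        except Exception as exc:  # broad on purpose: any failure counts as a failed attempt
            last_error = exc
    send(build_failure_alert(task_name, last_error, recipients))
    raise last_error

if __name__ == "__main__":
    outbox = []  # stands in for a real mail transport

    def always_fails():
        raise ValueError("schema mismatch")

    try:
        run_with_retries(always_fails, "verify", 2, ["data-team@example.com"], outbox.append)
    except ValueError:
        print(outbox[0]["Subject"])
```

Passing `send` in as a parameter keeps the retry logic testable without a mail server.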
Future work
• Integrating with the existing database architecture and ETL pipeline
• Airflow Kubernetes executors