Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
End-to-end automated data science process using...
Search
Keerthi
October 31, 2018
Education
2
260
End-to-end automated data science process using Airflow.
End-to-end automated data science process using Airflow.
Keerthi
October 31, 2018
Tweet
Share
Other Decks in Education
See All in Education
1202
cbtlibrary
0
190
1008
cbtlibrary
0
120
Semantic Web and Web 3.0 - Lecture 9 - Web Technologies (1019888BNR)
signer
PRO
2
3.1k
Introdución ás redes
irocho
0
510
いわゆる「ふつう」のキャリアを歩んだ人の割合(若者向け)
hysmrk
0
290
Design Guidelines and Models - Lecture 5 - Human-Computer Interaction (1023841ANR)
signer
PRO
0
1.2k
卒論の書き方 / Happy Writing
kaityo256
PRO
54
28k
RGBでも蛍光を!? / RayTracingCamp11
kugimasa
2
310
Web Search and SEO - Lecture 10 - Web Technologies (1019888BNR)
signer
PRO
2
3k
Human Perception and Cognition - Lecture 4 - Human-Computer Interaction (1023841ANR)
signer
PRO
0
1.3k
授業レポート:共感と協調のリーダーシップ(2025年上期)
jibunal
1
190
外国籍エンジニアの挑戦・新卒半年後、気づきと成長の物語
hypebeans
0
690
Featured
See All Featured
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
333
22k
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
35
2.3k
Believing is Seeing
oripsolob
0
19
Unlocking the hidden potential of vector embeddings in international SEO
frankvandijk
0
140
Discover your Explorer Soul
emna__ayadi
2
1k
Building Better People: How to give real-time feedback that sticks.
wjessup
370
20k
Neural Spatial Audio Processing for Sound Field Analysis and Control
skoyamalab
0
140
Into the Great Unknown - MozCon
thekraken
40
2.2k
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
9
1.1k
Navigating Algorithm Shifts & AI Overviews - #SMXNext
aleyda
0
1.1k
Learning to Love Humans: Emotional Interface Design
aarron
274
41k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
287
14k
Transcript
End-to-end automated data science process using Airflow. Evive
About Evive • Data Driven benefit navigator • Founded in
2006 • 400 + employees
Evive Data 15 2.5M 400 Data team Evive Employee Total
Active members
Data Usage 500+GB 50+ 30+ Total data per day Number
of data channels Number of models running daily
Why Airflow THE WORKFLOW Ingestion Merge data from multiple sources
Standardise Verify Publish
Airflow workers Data Sources Scheduler Database
Airflow Architecture
Functionalities • Scheduling • Dependency management • Error recovery •
Monitoring • Versioning • Mailing and alerting
Creating a dag and an operator
Scheduling tasks
File sensor • Operator that listens to a particular directory
and triggers the downstream task once the file lands on the corresponding directory. • Pynotify as operator.
Monitoring using airflow dashboard
Versioning • Versioning can be easily incorporated in airflow as
the entire dag execution happens as one instance. • You can version your data as well as model outputs.
Mailing and alerting system
Future work • Integrating with the existing database architecture and
ETL pipeline • Airflow Kubernetes executors