Keerthi
October 31, 2018
End-to-end automated data science process using Airflow.
Transcript
End-to-end automated data science process using Airflow
Evive
About Evive
• Data-driven benefit navigator
• Founded in 2006
• 400+ employees
Evive Data
• Data team: 15
• Evive employees: 400
• Total active members: 2.5M
Data Usage
• Total data per day: 500+ GB
• Number of data channels: 50+
• Number of models running daily: 30+
Why Airflow: the workflow
• Ingestion
• Merge data from multiple sources
• Standardise
• Verify
• Publish
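The workflow steps above can be sketched as plain Python functions chained together. This is only an illustration of the stages, not Evive's code: the record shape, the `member_id` field, and every function name here are assumptions made for the example, and ingestion and merging are folded into one step.

```python
from typing import Dict, Iterable, List

Record = Dict[str, str]


def ingest(sources: Dict[str, List[Record]]) -> List[Record]:
    """Merge records from several sources, tagging each with its origin."""
    merged = []
    for name, records in sources.items():
        for record in records:
            merged.append({**record, "source": name})
    return merged


def standardise(records: Iterable[Record]) -> List[Record]:
    """Normalise keys to lower case and strip whitespace from keys and values."""
    return [
        {key.strip().lower(): value.strip() for key, value in record.items()}
        for record in records
    ]


def verify(records: Iterable[Record], required: Iterable[str]) -> List[Record]:
    """Keep only records in which every required field is non-empty."""
    required = list(required)
    return [r for r in records if all(r.get(field) for field in required)]


def publish(records: List[Record]) -> int:
    """Stand-in for the publish step: report how many records would go out."""
    return len(records)


def run_workflow(sources: Dict[str, List[Record]], required=("member_id",)) -> int:
    """Chain the stages in the order listed on the slide."""
    return publish(verify(standardise(ingest(sources)), required))
```

In a scheduler such as Airflow, each of these functions would typically become its own task so that a failure in one stage can be retried without rerunning the others.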
Airflow Architecture
[Diagram: data sources, scheduler, Airflow workers, database]
Functionalities
• Scheduling
• Dependency management
• Error recovery
• Monitoring
• Versioning
• Mailing and alerting
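Two of the items above, dependency management and error recovery, can be illustrated with a toy runner built on the standard library. This is a minimal sketch of the idea, not how Airflow is implemented: `run_dag`, its parameters, and the task names in the usage note are hypothetical, and it needs Python 3.9+ for `graphlib`.

```python
import time
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


def run_dag(
    tasks: Dict[str, Callable[[], None]],
    upstream: Dict[str, Set[str]],
    retries: int = 2,
    retry_delay: float = 0.0,
) -> List[str]:
    """Run tasks in dependency order, retrying each failed task.

    `upstream` maps a task name to the set of tasks that must finish first.
    Every name in `upstream` is assumed to be a key of `tasks`.
    Returns the order in which tasks completed.
    """
    graph = {name: upstream.get(name, set()) for name in tasks}
    completed = []
    # static_order() yields each task only after all of its upstream tasks.
    for name in TopologicalSorter(graph).static_order():
        for attempt in range(retries + 1):
            try:
                tasks[name]()
                break
            except Exception:
                if attempt == retries:
                    raise  # out of retries: fail the whole run
                time.sleep(retry_delay)
        completed.append(name)
    return completed
```

For example, with tasks `ingest`, `standardise`, and `publish` and `upstream = {"standardise": {"ingest"}, "publish": {"standardise"}}`, the runner executes them in that order regardless of how the `tasks` dictionary is ordered, and a task that fails once is simply run again.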
Creating a DAG and an operator
Scheduling tasks
File sensor
• Operator that listens to a particular directory and triggers the downstream task once the file lands in that directory.
• Pynotify as operator.
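The file-sensor behaviour described above can be approximated with a simple polling loop. Note that this sketch polls the directory at a fixed interval, whereas the slide mentions a notification-based approach, so treat it as a stand-in technique. The function names and the `pattern`, `poke_interval`, and `timeout` parameters are assumptions for the example.

```python
import time
from pathlib import Path
from typing import Callable, Optional


def wait_for_file(
    directory: str,
    pattern: str = "*",
    poke_interval: float = 1.0,
    timeout: float = 60.0,
) -> Optional[Path]:
    """Poll `directory` until a file matching `pattern` exists or `timeout` passes."""
    deadline = time.monotonic() + timeout
    while True:
        matches = sorted(p for p in Path(directory).glob(pattern) if p.is_file())
        if matches:
            return matches[0]
        if time.monotonic() >= deadline:
            return None
        time.sleep(poke_interval)


def sense_then_run(
    directory: str, downstream: Callable[[Path], None], **kwargs
) -> bool:
    """Trigger `downstream` with the landed file; return False on timeout."""
    landed = wait_for_file(directory, **kwargs)
    if landed is None:
        return False
    downstream(landed)
    return True
```

Polling is easy to reason about and works on any filesystem, at the cost of reacting up to one `poke_interval` late. An event-based watcher reacts sooner but depends on operating system support.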
Monitoring using the Airflow dashboard
Versioning
• Versioning can be easily incorporated in Airflow, as the entire DAG execution happens as one instance.
• You can version your data as well as model outputs.
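One way to read the versioning point above is that every run writes its data and model outputs under a directory keyed by that run, so outputs from different runs never overwrite each other. The sketch below assumes the run is identified by its execution date. The directory layout, function names, and the `daily_models` DAG id in the usage note are hypothetical.

```python
import json
from datetime import datetime
from pathlib import Path


def versioned_path(
    base_dir: str, dag_id: str, execution_date: datetime, name: str
) -> Path:
    """Build an output path that is unique per DAG run."""
    run_version = execution_date.strftime("%Y%m%dT%H%M%S")
    return Path(base_dir) / dag_id / run_version / name


def save_versioned(
    base_dir: str, dag_id: str, execution_date: datetime, name: str, payload
) -> Path:
    """Write `payload` as JSON under the run's version directory."""
    path = versioned_path(base_dir, dag_id, execution_date, name)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(payload, sort_keys=True))
    return path
```

For instance, `versioned_path("/data", "daily_models", datetime(2018, 10, 31), "scores.json")` resolves to `/data/daily_models/20181031T000000/scores.json`, and a rerun for a different date lands in a sibling directory.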
Mailing and alerting system
Future work
• Integrating with the existing database architecture and ETL pipeline
• Airflow Kubernetes executors