
Unlocking the Future of Data Pipeline

Lee Wei
September 04, 2025


Transcript

  1. wei-lee.me What to expect? • Intro to Airflow • What's new in Airflow 3 • Migrate to Airflow 3
  2. What are Operators? Examples • EmailOperator • HttpOperator • SQLExecuteQueryOperator • DockerOperator • HiveOperator • S3FileTransformOperator • PrestoToMySqlOperator • SlackAPIOperator • and more
  3. What are Operators? Lots and lots of operators: Airbyte, Alibaba, Amazon, Apprise, Asana, ArangoDB, Apache Spark, Apache Pinot, Apache Pig, Apache Livy, Apache Kylin, Apache Kafka, Apache Hive, Apache HDFS, Apache Flink, Apache Druid, Apache Drill, Apache Cassandra, Apache Beam, Docker, Discord, Dingding, dbt, Datadog, Databricks, Common SQL, Cohere, Kubernetes, Celery, Jira, IBM Cloudant, HTTP, Hashicorp, gRPC, Google, Facebook, FTP, Exasol, Elasticsearch, OpenLineage, OpenAI, OpenFaaS, ODBC, Neo4j, MySQL, MongoDB, WinRM, MSSQL, PSRP, Microsoft PowerShell, Microsoft Azure, Jenkins, JDBC, IMAP, InfluxDB, Papermill, Pagerduty, Oracle, Opsgenie, OpenSearch, Segment, Samba, Salesforce, Redis, Presto, PostgreSQL, Pinecone, PgVector, Tableau, Tabular, SSH, SQLite, Snowflake, SMTP, Slack, Singularity, SFTP, Sendgrid, Vertica, Trino, Telegram, Zendesk, Yandex, Weaviate, X
  4. Recap • Airflow → programmatically orchestrating workflows • Dag == Airflow workflow → a bunch of tasks • Operator → template for a task
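The recap's vocabulary can be illustrated with a toy model in plain Python. This is only a sketch of the concepts, not Airflow's API: `ToyOperator`, `ToyDag`, and the task names are hypothetical, and ordering is delegated to the standard library's `graphlib`.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Callable


@dataclass
class ToyOperator:
    """Hypothetical stand-in for an operator: a template that becomes one task."""
    task_id: str
    action: Callable[[], str]


@dataclass
class ToyDag:
    """Hypothetical stand-in for a Dag: tasks plus the dependencies between them."""
    dag_id: str
    tasks: dict = field(default_factory=dict)
    upstream: dict = field(default_factory=dict)

    def add(self, task: ToyOperator, after: tuple = ()) -> None:
        # Register the task and remember which task_ids must finish first.
        self.tasks[task.task_id] = task
        self.upstream[task.task_id] = set(after)

    def run(self) -> list:
        # static_order() yields each task only after all of its upstream tasks.
        order = TopologicalSorter(self.upstream).static_order()
        return [self.tasks[task_id].action() for task_id in order]


dag = ToyDag("toy_etl")
dag.add(ToyOperator("extract", lambda: "extracted"))
dag.add(ToyOperator("transform", lambda: "transformed"), after=("extract",))
dag.add(ToyOperator("load", lambda: "loaded"), after=("transform",))

results = dag.run()
print(results)
```

The same operator class is instantiated three times with different arguments, which is the sense in which an operator is a template for a task.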
  5. Airflow Improvement Proposals • UI (AIP-38, AIP-79, AIP-84) • Data Awareness (AIP-73, AIP-74, AIP-75) • External Event Driven (AIP-82) • Dag Versioning (AIP-63, AIP-64, AIP-65, AIP-66) • Task SDK (AIP-72) • Out of scope today (AIP-69, AIP-78, AIP-81, AIP-83)
  6. User Interface • AIP-38: Modern Web Application • AIP-79: Remove Flask AppBuilder as Core dependency • AIP-84: UI REST API
  7. Under the hood • Rewrite the frontend using React • Remove Flask AppBuilder from core • Redesign API endpoints using FastAPI (get rid of the 6,000-line views.py)
  8. Data Awareness • AIP-73: Expanded Data Awareness • AIP-74: Introducing Data Assets • AIP-75: New Asset-Centric Syntax • AIP-76: Asset Partitions (TODO)
  9. What if we have... • Team A: focuses only on training the model • Team B: uses the models to generate a report • These 2 teams don't share code with each other
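One way to read this scenario is that a named asset becomes the only contract between the two teams: the producer reports that the asset was updated, and any consumer registered against that name runs. The sketch below is a plain-Python toy of that idea, not Airflow's asset API; `ToyAssetBus` and the asset names are hypothetical.

```python
from typing import Callable


class ToyAssetBus:
    """Hypothetical registry mapping asset names to the consumers that wait on them."""

    def __init__(self) -> None:
        self._consumers: dict = {}

    def on_update(self, asset_name: str, consumer: Callable[[], str]) -> None:
        # Team B registers interest in an asset by name only.
        self._consumers.setdefault(asset_name, []).append(consumer)

    def mark_updated(self, asset_name: str) -> list:
        # Team A reports an update; every consumer of that asset runs.
        return [consumer() for consumer in self._consumers.get(asset_name, [])]


bus = ToyAssetBus()

# Team B: depends on the asset name, not on Team A's code.
bus.on_update("trained_model", lambda: "report generated")

# Team A: finishes training and announces the new asset version.
triggered = bus.mark_updated("trained_model")

# An asset nobody consumes triggers nothing.
untouched = bus.mark_updated("raw_logs")
print(triggered, untouched)
```

Neither side imports the other's code; the shared asset name is the only coupling in this toy.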
  10. What if the 2 teams don't have access to the same metadata database (can't access the same asset)?
  11. Dag Versioning • AIP-63: Dag Versioning • AIP-64: Keep TaskInstance try history • AIP-65: Improve Dag history in UI • AIP-66: Dag Bundles & Parsing
  12. Ruff AIR Rules: Best practices • AIR001: Task variable name should match task_id • AIR002: A Dag should have an explicit schedule argument
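To make AIR001 concrete, here is a simplified imitation of the check written with Python's `ast` module. It is not Ruff's implementation; `find_task_id_mismatches` and the sample source are hypothetical, and the sample is only parsed, never executed.

```python
import ast


def find_task_id_mismatches(source: str) -> list:
    """Return (variable_name, task_id) pairs where the two differ."""
    mismatches = []
    for node in ast.walk(ast.parse(source)):
        # Only consider simple assignments such as `name = SomeOperator(...)`.
        if not (isinstance(node, ast.Assign) and isinstance(node.value, ast.Call)):
            continue
        if len(node.targets) != 1 or not isinstance(node.targets[0], ast.Name):
            continue
        variable_name = node.targets[0].id
        for keyword in node.value.keywords:
            # Compare a literal task_id keyword argument with the variable name.
            if (
                keyword.arg == "task_id"
                and isinstance(keyword.value, ast.Constant)
                and keyword.value.value != variable_name
            ):
                mismatches.append((variable_name, keyword.value.value))
    return mismatches


example = '''
extract = PythonOperator(task_id="extract", python_callable=print)
load_data = PythonOperator(task_id="load", python_callable=print)
'''

mismatches = find_task_id_mismatches(example)
print(mismatches)
```

Only the second assignment is reported, because `load_data` differs from its `task_id` of `load`.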
  13. Ruff AIR Rules: Required changes for migrating to Airflow 3 • AIR301: Something removed in Airflow 3 • AIR302: Something moved from Airflow core to an Airflow provider in Airflow 3
  14. Ruff AIR Rules: Suggested changes for migrating to Airflow 3 • AIR311: Similar to AIR301, but it won't break your code for the time being (we added a backward-compatible layer) • AIR312: Similar to AIR302, but it won't break your code for the time being
  15. What's new in Airflow 3.1! • i18n (Taiwanese Mandarin support!) • Human-in-the-loop (AIP-90) • go-sdk (part of AIP-72) • UI-plugins (AIP-68) • etc.
  16. Internationalization (i18n) 1. English 2. Arabic 3. Catalan 4. German 5. Spanish 6. French 7. Hebrew 8. Hindi 9. Hungarian 10. Korean 11. Dutch 12. Polish 13. Taiwanese Mandarin 14. Turkish
  17. $ cat weilee.py
      __name__ = 李唯 / Wei Lee
      __what_i_am_doing__ = [
          Everywhere @ PyCon Taiwan,
          Member @ Python Asia Organization,
          Meme Bot @ OpenSource4You,
          Mentor @ OpenSource4You,
          Maintainer @ commitizen-tools,
          Committer @ Apache Airflow,
          Software Engineer @ Astronomer,
      ]
      __github__ = Lee-W
      __linkedin__ = clleew
      __site__ = https://wei-lee.me
  18. $ 𝜋thon weilee.py
      File "weilee.py", line 1
        __name__ = 李唯 / Wei Lee
        ^^^
      SyntaxError: invalid syntax
  19. References • MIB • Fate/Zero • BanG Dream! It's MyGO!!!!! • JoJo's Bizarre Adventure • JoJo's Bizarre Adventure: Golden Wind • Orb: On the Movements of the Earth