
Easy workflow deployment to Cloud Composer with trinity

2019.10.25 Fukuoka.go#14+Umeda.go
https://fukuokago.connpass.com/event/146447/

Hiroka Zaitsu

October 25, 2019
Transcript

  1. Easy workflow deployment to Cloud Composer with trinity
    Hiroka Zaitsu / Pepabo R&D Institute, GMO Pepabo, Inc.
    2019.10.25 Fukuoka.go#14+Umeda.go

  2. Hiroka Zaitsu / @zaimy
    Data Scientist
    Researcher, Pepabo R&D Institute

  3. Contents
    1. What is Cloud Composer?
    2. Pain points when deploying to Cloud Composer
    3. An attempted solution with trinity
    4. Future work

  4. 1. What is Cloud Composer?

  5. Overview of Cloud Composer
    • GCP's "fully managed workflow orchestration service"
    • It runs Apache Airflow on GCP
    • Pepabo is migrating its log platform (DWH) from Treasure Data to GCP
    • The workflow service is likewise being migrated from Treasure Workflow (managed Digdag) to Cloud Composer

  6. Workflow codebase
    repository
    └ dags
       ├ workflowA
       │ ├ main.py
       │ └ hoge.sql
       └ workflowB
          ├ main.py
          └ piyo.sql
    • Create one subdirectory per workflow under the dags directory, containing:
      • Python code for the workflow itself (the DAG)
      • Queries used by the workflow
      • Configuration files, etc.
    Note: this applies when the directory structure is kept aligned with Cloud Storage.
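
As a rough illustration of the layout above, the sketch below treats every immediate subdirectory of dags as one workflow. The function names `listWorkflows` and `demoLayout` are hypothetical and are not taken from trinity; the demo simply recreates the example tree in a temporary directory.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// listWorkflows returns the names of the immediate subdirectories of dagsDir,
// treating each subdirectory as one workflow. os.ReadDir already returns
// entries sorted by filename.
func listWorkflows(dagsDir string) ([]string, error) {
	entries, err := os.ReadDir(dagsDir)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, e := range entries {
		if e.IsDir() {
			names = append(names, e.Name())
		}
	}
	return names, nil
}

// demoLayout recreates the example tree from the slide in a temporary
// directory (empty files) and lists the workflows found in it.
func demoLayout() ([]string, error) {
	root, err := os.MkdirTemp("", "repository")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)

	files := []string{
		"dags/workflowA/main.py",
		"dags/workflowA/hoge.sql",
		"dags/workflowB/main.py",
		"dags/workflowB/piyo.sql",
	}
	for _, f := range files {
		p := filepath.Join(root, filepath.FromSlash(f))
		if err := os.MkdirAll(filepath.Dir(p), 0o755); err != nil {
			return nil, err
		}
		if err := os.WriteFile(p, nil, 0o644); err != nil {
			return nil, err
		}
	}
	return listWorkflows(filepath.Join(root, "dags"))
}

func main() {
	names, err := demoLayout()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(names)
}
```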

  7. Deploying a workflow (adding and updating)
    $ gcloud composer environments storage dags import \
    --environment ENVIRONMENT_NAME \
    --location LOCATION \
    --source LOCAL_FILE_TO_UPLOAD
    [Diagram: Codebase, Cloud Storage, Airflow; labeled "gcloud composer import"]

  8. Deleting a workflow, part 1: delete from Cloud Storage
    $ gcloud composer environments storage dags delete \
    --environment ENVIRONMENT_NAME \
    --location LOCATION \
    DAG_NAME.py
    [Diagram: Codebase, Cloud Storage, Airflow; labeled "gcloud composer delete"]

  9. Deleting a workflow, part 2: delete from Airflow
    $ gcloud composer environments run --location LOCATION \
    ENVIRONMENT_NAME delete_dag -- DAG_NAME
    [Diagram: Codebase, Cloud Storage, Airflow; labeled "gcloud composer delete_dag"]

  10. 2. Pain points when deploying to Cloud Composer

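  11. Using the gcloud commands as-is for operations is painful
    • Adding and updating workflows
      • import runs one workflow at a time
        • It must be run separately for every workflow that has changes
      • import overwrites files in Cloud Storage
        • Files deleted from the codebase stay in Cloud Storage unless they are deleted individually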
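
To make the "one import per changed workflow" point concrete, here is a minimal sketch that only builds and prints the gcloud invocations. It uses the flags shown on the deploy slide; the list of changed workflows and the helper name `importArgs` are assumptions for illustration, and nothing is executed.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// importArgs builds the argument list for one
// "gcloud composer environments storage dags import" call.
func importArgs(environment, location, source string) []string {
	return []string{
		"composer", "environments", "storage", "dags", "import",
		"--environment", environment,
		"--location", location,
		"--source", source,
	}
}

func main() {
	// Hypothetical set of workflows that changed in the codebase.
	changed := []string{"workflowA", "workflowB"}

	// One invocation is needed per changed workflow.
	for _, wf := range changed {
		args := importArgs("ENVIRONMENT_NAME", "LOCATION", filepath.Join("dags", wf))
		fmt.Println("gcloud " + strings.Join(args, " "))
	}
}
```

Note that none of these calls removes a file that was deleted from the codebase, which is the second pain point on this slide.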

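  12. Using the gcloud commands as-is for operations is painful
    • Deleting workflows
      • Two commands are needed: delete, and Airflow's delete_dag
      • delete runs per file; delete_dag runs per workflow
        • Each must be run separately for every changed file or workflow
    • Development produces changes in dozens of workflows every day
      • We want to detect the changes mechanically and sync them to Cloud Composer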
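
The two-step deletion can be sketched as a pair of argument lists, mirroring the two commands shown on the deletion slides. The helper name `deleteCommands` and the sample values are hypothetical; the sketch prints the commands rather than running them.

```go
package main

import (
	"fmt"
	"strings"
)

// deleteCommands returns the two gcloud invocations that together remove one
// workflow: first a file in Cloud Storage (per file), then the DAG's
// metadata in Airflow (per workflow).
func deleteCommands(environment, location, target, dagName string) [][]string {
	return [][]string{
		{
			"composer", "environments", "storage", "dags", "delete",
			"--environment", environment,
			"--location", location,
			target,
		},
		{
			"composer", "environments", "run",
			"--location", location,
			environment, "delete_dag", "--", dagName,
		},
	}
}

func main() {
	for _, args := range deleteCommands("ENVIRONMENT_NAME", "LOCATION", "DAG_NAME.py", "DAG_NAME") {
		fmt.Println("gcloud " + strings.Join(args, " "))
	}
}
```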

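  13. What about gsutil rsync?
    • A Cloud Storage command that syncs files between buckets and directories
    • A file is treated as a sync target whenever its modification time differs
      • It gets processed even when its content has not changed
    • It depends on Cloud Storage
      • Airflow can also be set up outside GCP, so we want to support other storage backends too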
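
The difference between comparing modification times and comparing content can be shown with a small local experiment. This is a sketch under assumptions: it does not call gsutil, the helper names are hypothetical, and SHA-256 is just one reasonable choice of content hash.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// compare reports whether two files differ by modification time and whether
// they differ by content (SHA-256 of their bytes).
func compare(a, b string) (mtimeDiffers, contentDiffers bool, err error) {
	infoA, err := os.Stat(a)
	if err != nil {
		return false, false, err
	}
	infoB, err := os.Stat(b)
	if err != nil {
		return false, false, err
	}
	dataA, err := os.ReadFile(a)
	if err != nil {
		return false, false, err
	}
	dataB, err := os.ReadFile(b)
	if err != nil {
		return false, false, err
	}
	mtimeDiffers = !infoA.ModTime().Equal(infoB.ModTime())
	contentDiffers = sha256.Sum256(dataA) != sha256.Sum256(dataB)
	return mtimeDiffers, contentDiffers, nil
}

// demoMtime writes the same query to two files, backdates one of them by a
// day, and compares the pair.
func demoMtime() (mtimeDiffers, contentDiffers bool, err error) {
	dir, err := os.MkdirTemp("", "mtime-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	a := filepath.Join(dir, "a.sql")
	b := filepath.Join(dir, "b.sql")
	for _, p := range []string{a, b} {
		if err := os.WriteFile(p, []byte("SELECT 1;\n"), 0o644); err != nil {
			return false, false, err
		}
	}
	old := time.Now().Add(-24 * time.Hour)
	if err := os.Chtimes(b, old, old); err != nil {
		return false, false, err
	}
	return compare(a, b)
}

func main() {
	m, c, err := demoMtime()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("mtime differs: %v, content differs: %v\n", m, c)
}
```

A timestamp-based check would flag this pair for syncing even though the bytes are identical, which is the motivation for hashing content instead.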

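  14. What about Airflow sync?
    • An Airflow feature that syncs with a specific git repository
    • Only a single branch can be specified
      • Looks good for syncing master to the production environment
      • In test environments and CI, we want to deploy code from feature branches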

  15. 3. An attempted solution with trinity

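  16. trinity's approach
    • Keep three things in sync: the codebase, Cloud Storage, and Airflow
    • For each workflow, compute a hash value from its directory structure and file contents
      • The hash value represents the workflow definition at a given point in time
    • A workflow becomes a target of sync operations when the hash computed from the codebase differs from the hash stored in Cloud Storage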
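
One way to compute such a per-workflow hash is sketched below. It is not trinity's actual code: the choice of SHA-256, the length-prefixed encoding, and the names `hashWorkflow` and `demoHash` are all assumptions. The demo uses an in-memory file system from `testing/fstest`; with real directories, `os.DirFS` could be passed instead.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io/fs"
	"os"
	"testing/fstest"
)

// hashWorkflow folds the relative path and content of every file under dir
// into a single SHA-256 digest. fs.WalkDir visits entries in lexical order,
// so the result is deterministic for a given tree.
func hashWorkflow(fsys fs.FS, dir string) (string, error) {
	sub, err := fs.Sub(fsys, dir)
	if err != nil {
		return "", err
	}
	h := sha256.New()
	err = fs.WalkDir(sub, ".", func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil
		}
		data, err := fs.ReadFile(sub, p)
		if err != nil {
			return err
		}
		// Length-prefix the path and the content so boundaries are unambiguous.
		fmt.Fprintf(h, "%d:%s%d:", len(p), p, len(data))
		h.Write(data)
		return nil
	})
	if err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// demoHash hashes one workflow three times: as-is, as an identical copy, and
// with one query edited. It reports which hashes match the original.
func demoHash() (unchangedEqual, editedEqual bool, err error) {
	codebase := fstest.MapFS{
		"dags/workflowA/main.py":  {Data: []byte("# DAG definition\n")},
		"dags/workflowA/hoge.sql": {Data: []byte("SELECT 1;\n")},
	}
	identical := fstest.MapFS{
		"dags/workflowA/main.py":  {Data: []byte("# DAG definition\n")},
		"dags/workflowA/hoge.sql": {Data: []byte("SELECT 1;\n")},
	}
	edited := fstest.MapFS{
		"dags/workflowA/main.py":  {Data: []byte("# DAG definition\n")},
		"dags/workflowA/hoge.sql": {Data: []byte("SELECT 2;\n")},
	}
	base, err := hashWorkflow(codebase, "dags/workflowA")
	if err != nil {
		return false, false, err
	}
	same, err := hashWorkflow(identical, "dags/workflowA")
	if err != nil {
		return false, false, err
	}
	changed, err := hashWorkflow(edited, "dags/workflowA")
	if err != nil {
		return false, false, err
	}
	return base == same, base == changed, nil
}

func main() {
	u, e, err := demoHash()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("identical copy matches: %v, edited copy matches: %v\n", u, e)
}
```

Because file paths are part of the digest, renaming or moving a file changes the hash even when no content changes, which matches the "directory structure and file contents" wording above.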

  17. trinity's implementation
    • https://github.com/zaimy/trinity
    • A tool to synchronize workflows between Codebase, Cloud Storage and Airflow metadata.
    • Why Go?
      • Cross-compilation makes it possible to support Mac, Linux, and Windows
      • Work can be done per workflow, so we want to parallelize it
    $ trinity --bucket=BUCKET_NAME \
    --composer-env=COMPOSER_ENV_NAME
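
Since each workflow can be processed independently, the parallelization mentioned above could look roughly like the following. This is a hypothetical sketch, not trinity's code: `syncAll` and the per-workflow callback are invented names, and the callback stands in for whatever upload or delete work one workflow needs.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// syncAll runs syncOne for every workflow in its own goroutine and returns
// the errors keyed by workflow name. Workflows that succeed are not listed.
func syncAll(workflows []string, syncOne func(name string) error) map[string]error {
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		errs = make(map[string]error)
	)
	for _, wf := range workflows {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			if err := syncOne(name); err != nil {
				mu.Lock()
				errs[name] = err
				mu.Unlock()
			}
		}(wf)
	}
	wg.Wait()
	return errs
}

// demoParallel syncs three workflows with a stub that fails for one of them
// and returns the number of failures.
func demoParallel() int {
	errs := syncAll([]string{"workflowA", "workflowB", "workflowC"}, func(name string) error {
		if name == "workflowB" {
			return errors.New("simulated upload failure")
		}
		return nil
	})
	return len(errs)
}

func main() {
	fmt.Println("failed workflows:", demoParallel())
}
```

With dozens of workflows, a real tool would probably also cap the number of concurrent goroutines; that is left out here to keep the sketch short.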

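  18. Processing flow
    1. Compute the hash value in the codebase and save it for each workflow
    2. List the workflows in the codebase and in Cloud Storage and compare them
      i. Only in the codebase: upload to Cloud Storage (add)
      ii. Only in Cloud Storage: delete from Cloud Storage and Airflow
      iii. In both: compare the hash values from the codebase and Cloud Storage
        a. If they differ, replace the workflow in Cloud Storage (update)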
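
The comparison in step 2 can be sketched as a pure function over two maps from workflow name to hash. The `action` type, the operation names, and `plan` are hypothetical names for illustration; the sketch only decides what to do and performs no uploads or deletions.

```go
package main

import (
	"fmt"
	"sort"
)

// action describes one sync operation for one workflow.
type action struct {
	Workflow string
	Op       string // "add", "update", or "delete"
}

// plan compares workflow hashes from the codebase (local) with those stored
// in Cloud Storage (remote) and returns the operations needed to make the
// remote side match, sorted by workflow name.
func plan(local, remote map[string]string) []action {
	var actions []action
	for name, localHash := range local {
		remoteHash, ok := remote[name]
		switch {
		case !ok:
			// Only in the codebase: upload to Cloud Storage.
			actions = append(actions, action{name, "add"})
		case remoteHash != localHash:
			// In both but different: replace the workflow in Cloud Storage.
			actions = append(actions, action{name, "update"})
		}
	}
	for name := range remote {
		if _, ok := local[name]; !ok {
			// Only in Cloud Storage: delete from Cloud Storage and Airflow.
			actions = append(actions, action{name, "delete"})
		}
	}
	sort.Slice(actions, func(i, j int) bool {
		return actions[i].Workflow < actions[j].Workflow
	})
	return actions
}

func main() {
	local := map[string]string{"workflowA": "h1", "workflowB": "h2", "workflowC": "h3"}
	remote := map[string]string{"workflowB": "h2", "workflowC": "h9", "workflowD": "h4"}
	for _, a := range plan(local, remote) {
		fmt.Println(a.Op, a.Workflow)
	}
}
```

Workflows whose hashes match on both sides produce no action, so an unchanged codebase yields an empty plan.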

  19. Sync-style deployments can now be done easily!

  20. Future work
    • Add tests and refactor
      • We want to follow Go's conventions and way of thinking
    • Add features
      • Support plugins, which Airflow has in addition to dags
      • dry-run
