Upgrade to Pro — share decks privately, control downloads, hide ads and more …

MySQLからBigQueryの同期を差分更新にしたら4倍高速になった話 / Sync from MySQL to BigQuery become 4x faster by incremental updating

MySQLからBigQueryの同期を差分更新にしたら4倍高速になった話 / Sync from MySQL to BigQuery become 4x faster by incremental updating

Embulk Meetup Tokyo #3のLTです

Takehiro Shiozaki

May 16, 2017
Tweet

More Decks by Takehiro Shiozaki

Other Decks in Technology

Transcript

  1. .Z42-͔Β#JH2VFSZͷಉظΛ
    ࠩ෼ߋ৽ʹͨ͠Βഒߴ଎ʹͳͬͨ࿩
    &NCVML.FFUVQ5PLZP
    VASILY,Inc. Ԙ㟒݈߂

    View Slide

  2. Ԙ㟒݈߂
    w ೥ʹ7"4*-:ʹ৽ଔೖࣾ
    w ීஈͷ͓࢓ࣄ͸3BJMTͰXFC"1*࡞ͬͨΓɺ4PMSͷϝϯςͨ͠Γɺ

    5BCMFBV#JH2VFSZͰμογϡϘʔυ࡞ͬͨΓ
    w &NCVML͸ࠓ೥ͷ݄ࠒʹ৮Γ࢝Ίͨ͹͔Γ
    7"4*-: *OD
    TPGUXBSFFOHJOFFS

    View Slide

  3. *20/঺հ
    Ҏ্ͷϑΝογϣϯ&$αΠτ͔Β
    ྦྷܭ ສ఺Λ௒͑Δ঎඼Λܝࡌ
    ݄ؒສਓҎ্͕ར༻͢Δ೔ຊ࠷େڃͷϑΝογϣϯαΠτ

    View Slide

  4. σʔλऩूج൫ͷ঺հ
    Πϕϯτϩά

    &-#
    "1*4FSWFS
    -PH
    "HHSFHBUPS
    8FC4FSWFS
    #JH2VFSZ
    3FDPNNFOE

    View Slide

  5. σʔλ෼ੳج൫ͷ঺հ
    Ϛελʔσʔλ

    3%4
    .Z42-

    %BUBUSBOTGFS
    TFSWFS
    ($4 #JH2VFSZ
    ෗Γ͔͚͍ͯΔൿ఻ͷͨΕঢ়ଶ
    ɾNZTRMEVNQ US TFE FUDΛෳࡶʹ૊Έ߹Θ͍ͤͯΔ
    ɾςʔϒϧΛ෼ׂͯ͠సૹ͢ΔઃఆΛϋʔυίʔσΟϯά

    View Slide

  6. σʔλ෼ੳج൫ͷ঺հ
    Ϛελʔσʔλ

    3%4
    .Z42-

    %BUBUSBOTGFS
    TFSWFS
    ($4 #JH2VFSZ
    ʹ&NCVMLΛಋೖ
    ɾಉظόον͕མͪΔ͜ͱ͕΄΅ͳ͘ͳΔ

    View Slide

  7. ͔͜͜ΒϝΠϯςʔϚ

    View Slide

  8. ϚελʔσʔλͷಉظΛ
    ߴ଎Խ͍ͨ͠
    w Ϣʔβʔ΁ͷՁ஋ͷఏڙΛߴ଎Խ͢ΔͨΊ
    w ݱঢ়঎඼৘ใͷಉظ͚ͩͰ໿࣌ؒ෼
    ໿ ສߦ
    ໿(#

    View Slide

  9. ࣌ؒܭଌ
    3%4
    .Z42-

    %BUBUSBOTGFS
    TFSWFS
    ($4 #JH2VFSZ
    NJO
    .Z42-ͷςʔϒϧΛμϯϓ
    NJO
    μϯϓΛ($4ʹΞοϓϩʔυ
    NJO
    ($4͔Β#JH2VFSZʹϩʔυ

    View Slide

  10. .Z42-ͷμϯϓ͕
    ϘτϧωοΫ
    w Ϛελʔσʔλͷಉظ͸ຖ೔ߦ͍ͬͯΔͨΊɺ

    ࡢ೔ͱಉ͡σʔλ͸ಉظ͢Δඞཁͳ͠
    w 3VCZPO3BJMTΛ࢖༻͍ͯ͠ΔͷͰɺ

    [email protected]ʹͦͷߦͷߋ৽࣌ࠁ͕ೖ͍ͬͯΔ

    View Slide

  11. ࠩ෼ߋ৽ͯ͠ΈΔ
    ςʔϒϧ
    ࠩ෼ͷΈ

    ςʔϒϧ
    લ೔෼

    ςʔϒϧͷϚʔδ
    0-"1ؔ਺Λ׆༻
    લճͷಉظҎ߱ͷσʔλͷΈ
    where: "updated_at > ..."

    View Slide

  12. લճͷಉظҎ߱ͷσʔλΛऔಘ
    w JOQVUQMVHJONZTRMͷઃఆʹXIFSFΛ௥Ճ͢Δ
    w औಘ͢Δඞཁͷ͋Δߦ͕ҎԼʹͳΔ
    in:
    type: mysql
    host: example.com
    user: user_name
    password: ********
    database: db_name
    table: items
    select: "*"
    where: "updated_at > '2017-05-15 00:00:00'" # この行を追加

    View Slide

  13. 42-ͰςʔϒϧΛϚʔδ
    w ͭͷςʔϒϧΛVOJPOBMMͰ݁߹͠ɺओΩʔͰ
    QBSUJUJPOCZͯ͠৽͍͠ߦͷΈΛબ୒
    w ͜ͷ42-ͷ݁ՌΛݩͷςʔϒϧʹॻ͖໭͢
    select * from (
    select
    *,
    row_number() over (partition by id order by updated_at desc) as rn
    from (
    select * from tmp.items union all select * from mysql.items
    )
    )
    where rn = 1

    View Slide

  14. ݁Ռ
    3%4
    .Z42-

    %BUBUSBOTGFS
    TFSWFS
    ($4 #JH2VFSZ
    NJO
    ˠNJO
    NJO
    ˠNJO
    NJO
    ˠNJO
    શମͰഒͷߴ଎Խ
    ໿INJOˠNJO
    ςʔϒϧͷϚʔδ
    NJOˠNJO

    View Slide

  15. ৽ͨͳ՝୊
    w Ϛελʔσʔλͷಉظ͕ෳࡶʹͳΓͦ͏
    w ֤ॲཧͷʮྲྀΕʯΛ؅ཧ͢Δπʔϧ͕ཉ͍͠
    %JHEBH "JSqPX -VJHJ

    View Slide

  16. ·ͱΊ
    w &NCVML͸όϧΫసૹͷҋΛরΒͯ͘͠ΕΔπʔϧ
    w σʔλͷࠩ෼ߋ৽ʹΑͬͯഒߴ଎Խ
    w ӡ༻ϑΣʔζͰ͸ଞͷπʔϧͱͷ࿈ܞ͕ඞཁ

    View Slide