A Talk About Google BigQuery (Google BigQuery の話) #yapcasia

Naoya Ito
August 30, 2014

Transcript

  1. A Talk About Google BigQuery
    Naoya Ito
    KAIZEN platform Inc.

    YAPC::Asia Tokyo 2014
    Note: I am not a shill for Google.

  2. Agenda
    •  BigQuery overview
    •  BigQuery internals
    •  How we use it at KAIZEN platform Inc.

  3. BigQuery overview

  4. Google BigQuery

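  6. What is BigQuery?
    •  A cloud service that runs SQL (and the like) over huge datasets in seconds
    –  … hundred million records in … seconds (*)
    –  Web interface and REST API
    •  A productized version of Dremel, which Google has long used internally
    –  …: closed release
    –  …: general availability
    –  Continuously upgraded since
    –  …: BigQuery Streaming
    (*) "Googleの虎の子「BigQuery」をFluentdユーザーが使わない理由がなくなった理由" (Why Fluentd users no longer have a reason not to use BigQuery, Google's crown jewel)
    http://qiita.com/kazunori…/items/…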

  7. What is it used for?
    •  Use cases
    –  Log analysis
    –  Data warehouse
    •  Unsuitable uses
    –  Operational (business) databases
    In other words, it is not a fast RDBMS.

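  8. Why is BigQuery fast?
    •  Basically, it brute-forces everything with full scans
    –  No B-Tree indexes as in an RDBMS
    •  SQL is processed in a distributed fashion
    –  MPP (Massively Parallel Processing) query engine = Dremel
    •  Scales out across thousands of disks and a high-speed network
    –  I/O capable of reading … TB of data in … seconds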

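  9. However
    •  It is not a fast RDBMS
    •  It is not meant for many people to use at once
    –  Mainly used for batch processing
    •  It is not schemaless
    It scales sub-linearly even on TB-scale data, but conversely even small data incurs a few seconds of overhead.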

  10. Quoted from @harukasan's slides for the BigQuery reading group

  11. Positioning against similar implementations
    •  Large Batch
    –  Runs huge batches reliably
    –  Large per-query overhead (tens of seconds to tens of minutes)
    –  MapReduce, Hadoop Hive
    •  Short Batch
    –  Per-query overhead of a few seconds
    –  Well suited to ad hoc queries
    –  MPP query engines: Presto, Impala, BigQuery (Dremel)
    •  Stream Processing
    –  Cannot run batches, but processes streams in real time
    –  Norikra, Apache Kafka, Twitter Storm, etc.
    Amazon Redshift is also Short Batch (omitted because I am not familiar with it)

    cf. Batch processing and Stream processing by SQL
    http://www.slideshare.net/tagomoris/hcj2014-sql

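  12. Pricing
    •  Fees
    –  Data storage: … per GB per month
    –  Queries: … per TB (the size of the data scanned)
    Actually cheaper than Amazon S3
    If you go to GCP-related meetups, you may be handed a free coupon worth …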

  13. BigQuery internals (just a little)

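  14. Google BigData Stack
    •  "Googleを支える技術" (the book on the technology that supports Google)
    –  BigData Stack, first generation
    –  GFS, BigTable, MapReduce, etc.
    •  BigData Stack, second generation
    –  Implementations built on top of the first-generation stack to resolve its problems
    –  Colossus, Megastore, Spanner, FlumeJava, Dremel
    Rumor has it that Google is already on a newer BigData Stack internally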

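  15. BigQuery technology stack
    •  Google File System (GFS): distributed file system
    •  Colossus File System (CFS): improved successor to GFS; details are not public
    •  ColumnIO: column-oriented file format for BigQuery
    •  Dremel: parallel SQL execution engine
    Reportedly, data distributed across datacenters can be fetched in parallel and at high speed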

  16. ColumnIO
    Dremel: Interactive Analysis of Web-Scale Datasets
    http://research.google.com/pubs/archive/36632.pdf
    Data is stored by column rather than by row. A given column can be read sequentially, and Colossus allows it to be read in parallel.
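
The column-versus-row point on the ColumnIO slide can be illustrated with a minimal sketch. This is not ColumnIO itself; the toy table and its field names (`path`, `status`, `bytes`) are assumptions made up for illustration.

```python
# Hypothetical toy table; the field names are illustrative only.
rows = [
    {"path": "/", "status": 200, "bytes": 512},
    {"path": "/login", "status": 302, "bytes": 128},
    {"path": "/missing", "status": 404, "bytes": 64},
]

# Column-oriented layout: one contiguous list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def sum_bytes_row_store(rows):
    # A row layout walks every record to reach a single field.
    return sum(row["bytes"] for row in rows)


def sum_bytes_column_store(columns):
    # A column layout reads only the one column the query names.
    return sum(columns["bytes"])


total_row = sum_bytes_row_store(rows)
total_col = sum_bytes_column_store(columns)
```

Both functions return the same total; the difference is how much unrelated data each layout has to pass over to get there.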

  17. Dremel
    Dremel: Interactive Analysis of Web-Scale Datasets
    http://research.google.com/pubs/archive/36632.pdf

  18. Dremel serving tree (diagram)
    Root Mixer
    Mixer 1 (Shard 0-8) | Mixer 1 (Shard 9-16) | Mixer 1 (Shard 17-24)
    Shard 0 | Shard 10 | Shard 12 | Shard 20 | Shard 24
    Distributed Storage (e.g., CFS)
    Source: Google BigQuery Analytics, p. 284, Chapter 9 "Understanding Query Execution"
    Annotation: distribute

  19. Dremel serving tree (same diagram as slide 18)
    Annotations: distribute, aggregate
    New annotation: CFS + ColumnIO return part of the data for the requested columns

  20. Dremel serving tree (same diagram as slide 18)
    New annotation: columns are read in order to obtain rows; based on the WHERE clause and the like, only the required rows are kept, in memory

  21. Dremel serving tree (same diagram as slide 18)
    New annotation: data from each shard is aggregated, for example by sorting and applying LIMIT

  22. Dremel serving tree (same diagram as slide 18)
    New annotation: the aggregated result is returned to the caller
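
The distribute-then-aggregate flow annotated on the serving tree slides can be sketched roughly as follows. This is a toy model under stated assumptions, not Dremel's implementation: the function names (`shard_scan`, `mixer_merge`, `run_query`) are hypothetical, shards are plain Python lists, and the query is the equivalent of a filter plus ORDER BY plus LIMIT.

```python
import heapq


def shard_scan(shard_rows, predicate):
    """Leaf level: scan local rows and keep only those passing the WHERE filter."""
    return [row for row in shard_rows if predicate(row)]


def mixer_merge(partials, key, limit):
    """Mixer level: combine partial results, order them, and apply LIMIT."""
    merged = [row for part in partials for row in part]
    return heapq.nsmallest(limit, merged, key=key)


def run_query(shard_groups, predicate, key, limit):
    """Root level: fan out to one mixer per shard group, then aggregate again."""
    mixer_results = [
        mixer_merge([shard_scan(shard, predicate) for shard in group], key, limit)
        for group in shard_groups
    ]
    return mixer_merge(mixer_results, key, limit)


# Two mixers, each responsible for two shards (made-up data).
shard_groups = [
    [[5, 12, 7], [3, 20]],
    [[9, 1], [14, 2, 8]],
]

# Roughly: SELECT v FROM t WHERE v is odd ORDER BY v LIMIT 3
result = run_query(
    shard_groups,
    predicate=lambda v: v % 2 == 1,
    key=lambda v: v,
    limit=3,
)
```

Because each mixer already trims its partial result to the limit, the root only ever merges a few small lists, which is the reason the tree shape keeps the aggregation step cheap.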

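  23. What is impressive about BigQuery
    •  Columnar I/O and divide-and-conquer execution of SQL
    –  But none of this is unusual for an MPP system
    •  So what is actually impressive about BigQuery?
    –  Google's enormous infrastructure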

  24. Which is, frankly, rather blunt...

  25. Even a terrible query like this: … seconds. 4 seconds.

  26. How we use it at KAIZEN platform Inc.

  27. Use cases
    •  Storing and investigating access logs
    •  Analyzing application logs (data warehouse)
    •  Statistical significance testing for A/B tests

  28. Access logs

  29. Access logs to BigQuery
    •  Nginx logs are shipped continuously with fluent-plugin-bigquery
    –  Encrypted end to end (E2E)
    •  When something comes up, analyze with SQL
    –  Daily / Weekly / Monthly PV
    –  Debugging production

  30. fluent-plugin-bigquery
    •  By tagomoris, yugui, and others
    •  KAIZEN platform Inc. recently became the maintainer
    –  In practice, that means me
    Patches welcome.

  31. Application log analysis

  32. Shipping logs
    •  From Rails to fluentd with td-logger-ruby
    •  On to BigQuery with fluent-plugin-bigquery

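  33. When logs are shipped
    •  On every request
    –  ApplicationController
    –  Ship the logged-in user's attributes → used to calculate DAU and MAU
    •  When a model's state changes
    –  ActiveRecord::Observer
    –  Pick suitable attributes for each model and ship them
    –  BigQuery answers even complex SQL without trouble ⇒ product managers casually write their own SQL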
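
As a rough illustration of how per-request log rows carrying user attributes turn into DAU, here is a minimal sketch. The row shape (`date`, `user_id`) and the function name are assumptions; in BigQuery itself this would be a distinct count grouped by day.

```python
from collections import defaultdict


def daily_active_users(log_rows):
    """Count distinct users per day from per-request log rows."""
    users_by_day = defaultdict(set)
    for row in log_rows:
        users_by_day[row["date"]].add(row["user_id"])
    return {day: len(users) for day, users in users_by_day.items()}


# Hypothetical rows, one per request.
log_rows = [
    {"date": "2014-08-29", "user_id": 1},
    {"date": "2014-08-29", "user_id": 1},
    {"date": "2014-08-29", "user_id": 2},
    {"date": "2014-08-30", "user_id": 2},
]

dau = daily_active_users(log_rows)
```

MAU follows the same pattern with the month as the grouping key instead of the day.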

  34. We do not normalize much
    •  Star schema
    –  The standard modeling approach for a DWH
    •  Fact table: logs
    •  Dimension table: master data (customer names and so on)
    –  Not normalizing is the accepted practice
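
The fact/dimension split of a star schema can be sketched in a few lines. The table contents and names below (`customers`, `events`, `events_per_customer`) are made up for illustration and are not the schema used in the talk.

```python
from collections import Counter

# Dimension table: master data keyed by id (hypothetical values).
customers = {1: "Acme", 2: "Globex"}

# Fact table: append-only log events that reference the dimension by id.
events = [
    {"customer_id": 1, "action": "signup"},
    {"customer_id": 2, "action": "login"},
    {"customer_id": 1, "action": "login"},
]


def events_per_customer(events, customers):
    """Join the fact rows to the dimension table and count events per name."""
    counts = Counter(event["customer_id"] for event in events)
    return {customers[cid]: n for cid, n in counts.items()}


summary = events_per_customer(events, customers)
```

The fact table only ever grows, while the small dimension table is the part that has to be replaced when master data changes.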

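  35. A/B test significance testing
    •  We are an A/B testing service, after all
    •  Details are secret
    •  We send … req/sec through fluentd, and it does not break a sweat
    –  Note: … req/sec of HTTP requests; because fluentd buffers them, the number of BigQuery API calls is far smaller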
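
Why buffering keeps the API call count far below the request rate can be shown with a simplified sketch. This is not fluentd's buffer implementation (which also flushes on time intervals); `BufferedSender`, `send_batch`, and the chunk size of 500 are all hypothetical.

```python
class BufferedSender:
    """Collect events and hand them to send_batch in chunks (hypothetical helper)."""

    def __init__(self, send_batch, chunk_size):
        self.send_batch = send_batch
        self.chunk_size = chunk_size
        self.buffer = []

    def emit(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.chunk_size:
            self.flush()

    def flush(self):
        # Send whatever is buffered as a single call, then start over.
        if self.buffer:
            self.send_batch(list(self.buffer))
            self.buffer.clear()


# Each element appended to api_calls stands for one call to the remote API.
api_calls = []
sender = BufferedSender(send_batch=api_calls.append, chunk_size=500)

for i in range(1200):
    sender.emit({"request_id": i})
sender.flush()  # send the final partial chunk
```

Here 1,200 incoming events result in three outgoing calls, so the call count scales with the chunk size rather than with the request rate.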

  36. Connecting external tools
    •  Excel
    –  BigQuery Connector for Excel (by Google)
    –  For pivot analysis
    •  DOMO (BI)
    –  Had an experimental BigQuery interface
    –  Major players such as Tableau are starting to support it too

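  37. Pain points
    •  fluent-plugin-bigquery requires a schema file
    –  However, hakobera wrote a patch
    –  The fetch_schema feature is available from v…
    •  Updating dimension tables
    –  There is no UPDATE
    –  So we drop and recreate them once a night or so
    –  It would be nice to JOIN across different data sources as Presto does...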

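  38. Miscellany
    •  Google Analytics + BigQuery looks handy
    –  An option that lets you analyze raw GA logs in BigQuery
    –  Only available with the paid GA service, though
    •  Importing large data
    –  Importing is fast if you place the data in Google DataStore first
    •  Table Decorators
    –  Query a specified time range of the data. Less data is scanned, so query cost goes down
    •  The … MB JOIN limit is a thing of the past
    –  With JOIN EACH, huge JOINs (… hundred million × … hundred million rows, say) run through a MapReduce-shuffle-like process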
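
Since queries are billed by the amount of data scanned, restricting a query to a time range lowers its cost. The arithmetic can be sketched as below; the price of 1.0 per TB, the decimal terabyte, and the hourly data volumes are placeholder assumptions, not actual BigQuery figures.

```python
TB = 10**12  # assumes decimal terabytes


def query_cost(bytes_scanned, price_per_tb):
    """Cost of one query when billing is proportional to bytes scanned."""
    return bytes_scanned / TB * price_per_tb


# Hypothetical table: bytes of log data ingested in each of the last 24 hours.
hourly_bytes = [50 * 10**9] * 24

# Scanning the whole table versus only the most recent hour.
full_scan = query_cost(sum(hourly_bytes), price_per_tb=1.0)
last_hour = query_cost(sum(hourly_bytes[-1:]), price_per_tb=1.0)
```

With evenly sized hours, the one-hour query costs one twenty-fourth of the full scan, whatever the actual unit price is.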

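  39. Summary
    •  BigQuery is a service that answers SQL over large data in seconds using full scans
    •  It muscles through even terrible queries, which is cool
    •  Divide and conquer + Google's datacenter scale = an unapologetically brute-force parallel processing system
    •  Useful for batch jobs, log analysis, and the like
    •  Again, I am not a shill for Google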

  40. Thanks
    Illustrations by あわゆき