GAEを用いたBQ Load戦略/gae_bq_load_strategy

Ca4df28501e4c9cfbceb91f367afa784?s=47 fuzyco
April 24, 2018

GAEを用いたBQ Load戦略/gae_bq_load_strategy

bq_sushi#7での発表資料

Ca4df28501e4c9cfbceb91f367afa784?s=128

fuzyco

April 24, 2018
Tweet

Transcript

  1. copyright Fringe81 Co.,Ltd. GAEBQ Load Fringe81    

      1
  2. copyright Fringe81 Co.,Ltd.  l  (Hiroki Fujino) l @Fringe81

    l   l  2
  3. copyright Fringe81 Co.,Ltd.  /  Ø  Ø 

    l GAE"   l BQ Load l   #! l GAETaskQueue 3
  4. copyright Fringe81 Co.,Ltd. 4  

  5. copyright Fringe81 Co.,Ltd. Columva l  (2017/11~) l SaaS l

    ETL 5
  6. copyright Fringe81 Co.,Ltd. Columva 6 l  (2017/11~) l SaaS

    l ETL Columva GAEBQ GCP 
  7. copyright Fringe81 Co.,Ltd.    App Engine Big Query

    App Engine  7 /
  8. copyright Fringe81 Co.,Ltd.  lStandard Environment lPaaS l  

    lJava8/Scala l 8   App Engine Big Query App Engine /
  9. copyright Fringe81 Co.,Ltd.  l l&"# l&!% lAPI  l

    $  9  App Engine Big Query App Engine /
  10. copyright Fringe81 Co.,Ltd. BigQuery Big Query In Out 10

  11. copyright Fringe81 Co.,Ltd. BigQuery lSQL l Big Query In Out

    11
  12. copyright Fringe81 Co.,Ltd. BigQuery l   Big Query In

    Out 12
  13. copyright Fringe81 Co.,Ltd.  13

  14. copyright Fringe81 Co.,Ltd. BQ      14

  15. copyright Fringe81 Co.,Ltd.     15

  16. copyright Fringe81 Co.,Ltd. BigQuery    16 

  17. copyright Fringe81 Co.,Ltd.   17    BigQuery

       
  18. copyright Fringe81 Co.,Ltd. 18 %#"  TaskQueue   GAE

    SE& !$60   GAE
  19. copyright Fringe81 Co.,Ltd. App Engine Push Queue Push Queue App

    Engine App Engine 1 3 2 2 Enqueue    Push Queue  GAE 2 Push 3 Enqueue 3 Push 19
  20. copyright Fringe81 Co.,Ltd. App Engine Push Queue "! Enqueue GAE

    "! #$ Push Queue       20
  21. copyright Fringe81 Co.,Ltd. App Engine Push Queue  *)'#Enqueue GAE

      *)'#&,.! Push Queue  #  "!    )(.-$%+  21
  22. copyright Fringe81 Co.,Ltd.    BQ   

     22 BigQuery  
  23. copyright Fringe81 Co.,Ltd.   !"  #  

     23
  24. copyright Fringe81 Co.,Ltd.    l Ver.1 (Streaming Insert)

    l Ver.2 (BQ Load) l Ver.3 (Pull Queue  BQ Load) 24
  25. copyright Fringe81 Co.,Ltd. l Ver.1 (Streaming Insert) l Ver.2 (BQ

    Load) l Ver.3 (Pull Queue  BQ Load)    25
  26. copyright Fringe81 Co.,Ltd. Streaming Insert BQ    

    Streaming Insert   Ver.1(Streaming Insert) 26 BigQuery 
  27. copyright Fringe81 Co.,Ltd. Streaming Insert l   BigQuery 

      l $0.01/200MB BigQuery Streaming Insert Table User id INTEGER name STRING 27   User(id: Long, name: String)
  28. copyright Fringe81 Co.,Ltd.  28

  29. copyright Fringe81 Co.,Ltd.  BQ     

    BQ   29
  30. copyright Fringe81 Co.,Ltd.  BQ!"  !"   

    !"  !"  BQ!"    !"  ## 30
  31. copyright Fringe81 Co.,Ltd. Streaming Insert   SELECT DELETE 

     BigQuery SELECT DELETE Streaming Insert 31
  32. copyright Fringe81 Co.,Ltd. Streaming Insert streaming insert   BigTable

      select BigTable  [https://cloud.google.com/blog/big-data/2017/06/life-of-a-bigquery-streaming-insert] 32
  33. copyright Fringe81 Co.,Ltd. Streaming Insert streaming insert  # BigTable!"

    select BigTable!"  33 deleteupdate  90  [https://cloud.google.com/blog/big-data/2017/06/life-of-a-bigquery-streaming-insert]
  34. copyright Fringe81 Co.,Ltd. Streaming Insert 34 l%*$(1("!#& 1MB) l()$ '

      %*$ %*$ 
  35. copyright Fringe81 Co.,Ltd. Streaming Insert   90  

    35
  36. copyright Fringe81 Co.,Ltd. Streaming Insert 90   BQ 

        36
  37. copyright Fringe81 Co.,Ltd. l Ver.1 (Streaming Insert) l Ver.2 (BQ

    Load) l Ver.3 (Pull Queue  BQ Load)    37
  38. copyright Fringe81 Co.,Ltd. GCS BQ Load BQ! !  !

    Cloud Storage   Ver.2 (BQ Load)   BQ Load BigQuery 38  
  39. copyright Fringe81 Co.,Ltd. GCS BQ Load BQ! s  

    Ver.2 (BQ Load) 39 !  ! Cloud Storage   BQ Load BigQuery  
  40. copyright Fringe81 Co.,Ltd. BQ Load 40 lGoogle   

    lSDK  (Java, Python, Go…)
  41. copyright Fringe81 Co.,Ltd. BQ Load CSV JSON BigQuery Load 

    1 BQ     Load 41 Table User id INTEGER name STRING id, name 1, “hoge” 2, “fuga” {“id”: 1, “name”: “hoge”} {“id”: 2, “name”: “fuga”}
  42. copyright Fringe81 Co.,Ltd. BQ Load GCP   Cloud Storage

    Cloud Dataflow Load Load BigQuery 42
  43. copyright Fringe81 Co.,Ltd. GCS BQ Load BQ!   Ver.2

    (BQ Load) 43 !  ! Cloud Storage   BQ Load BigQuery  
  44. copyright Fringe81 Co.,Ltd. Google Cloud Storage !, )$(./# GCS <=>

    BQ'/%  "$&+'/%*-   GCSJSON*- 44
  45. copyright Fringe81 Co.,Ltd. GCS => BQ Cloud Storage BQ Load

    hoge1.json hoge_table hoge2.json hoge3.json hoge_table hoge_table BigQuery 45
  46. copyright Fringe81 Co.,Ltd. BQ Load l Streaming Insert% BQ Load%

    l -.,+* & ) l $-.,+ *)! ) l ACID +  u *-.,% #)'#")(Atomicity) u -.,%)("(Consistency) 46
  47. copyright Fringe81 Co.,Ltd.     47

  48. copyright Fringe81 Co.,Ltd. BQ Load error=BigQueryError{reason=rateLimitExceeded, location=table.write, message=Exceeded rate limits:

    too many table update operations for this table… 48
  49. copyright Fringe81 Co.,Ltd. rateLimitExceeded 49

  50. copyright Fringe81 Co.,Ltd. GCS BQ Load BQ %  

    Ver.2 (BQ Load) Load  50 % "$%! Cloud Storage  BQ Load BigQuery  #
  51. copyright Fringe81 Co.,Ltd.   BQ Load   BQ

    Load   51
  52. copyright Fringe81 Co.,Ltd. BQ Load l 15'"% 10 $" l

    * '"% 1,000  $"( ) l * #& (!'$" #& ) 10,000  $"( )  52
  53. copyright Fringe81 Co.,Ltd. BQ Load l 15)0836!10"%%/53$ l ; )0836!

    1,000 "%%/53$9 +&) l ; )47/,-19.28/53+"47/,-1:! 10,000 "%%/53$9 +&:  53   "#" +(Load*'! 
  54. copyright Fringe81 Co.,Ltd. BQ Load l 15$-614 10 ""+31! l

    9 $-614 1,000  ""+31!7 '#) l 9 $25+()/7*06+31'  25+()/8 10,000  ""+31!7 '#8  54 .6,& %
  55. copyright Fringe81 Co.,Ltd. BQ Load l 15"*3.1 10 (0. l

    6 "*3.1 1,000  (0.4 $ ) l 6 "/2(%&,4'-3(0.$ /2(%&,5 10,000  (0.4 $ 5  55 +3)$! BQLoad#7
  56. copyright Fringe81 Co.,Ltd. BQ Load    Cloud Storage

    gs://bucket/dir/* [gs://bucket/dir/test1.txt, gs://bucket/dir/test2.txt] [gs://bucket/dir1/*, gs://bucket/dir2/*]           Load Load Load BigQuery 56
  57. copyright Fringe81 Co.,Ltd. BQ Load    Cloud Storage

    gs://bucket/dir/* [gs://bucket/dir/test1.txt, gs://bucket/dir/test2.txt] [gs://bucket/dir1/*, gs://bucket/dir2/*]     1   Load Load Load BigQuery 57    
  58. copyright Fringe81 Co.,Ltd. BQ Load    Load 

     l  10,000URI l  10,000,000 l  15TB  58
  59. copyright Fringe81 Co.,Ltd.  *1&%BQ Load1"0#!0 Load *62391!0 .&' Load%0

    59 l <!&/5:79$() 1,000 *,,487+'; 1 -) 487* l 487' 10,000URI+' l 487' 10,000,0006239+' l 487' 15TB+' 1487&/*
  60. copyright Fringe81 Co.,Ltd. l Ver.1 (Streaming Insert) l Ver.2 (BQ

    Load) l Ver.3 (Pull Queue  BQ Load)    60
  61. copyright Fringe81 Co.,Ltd. GCS%),-($#!  Pull Queue BQLoad  BQ

    Load Pull Queue   +- %),-( '-$    Ver.3 (Pull Queue × BQ Load) Cloud Storage BigQuery 61 Pull Queue %),-( '-$  +- '-$ %),-( Cloud Storage  "#&*
  62. copyright Fringe81 Co.,Ltd. GCS%),-($#!  Pull Queue BQLoad  BQ

    Load   Ver.3 (Pull Queue × BQ Load) Cloud Storage BigQuery 62 Pull Queue %),-( '-$  +- '-$ %),-( Cloud Storage  "#&* Pull Queue   +- %),-( '-$ 
  63. copyright Fringe81 Co.,Ltd. Pull Queue &%$#   Queue 

    ! "   Pull Queue &%$3 &%$1 &%$2 Enqueue &%$#  63
  64. copyright Fringe81 Co.,Ltd. Pull Queue    Enqueue Pull

    Queue id: 3 id: 2 id: 1  3(payload: 3)  1(payload: 1)  2(payload: 2) Enqueue 64
  65. copyright Fringe81 Co.,Ltd. Pull Queue lease payload   

      Pull Queue Lease 3(payload: 3) 1(payload: 1) 2(payload: 2) id: 3 id: 2 id: 1 65
  66. copyright Fringe81 Co.,Ltd. Pull Queue   Load *&'-%$ "Load$

    66 l 0#).+- 1,000   (,+/ %!) (,+  l (,+10,000URI l (,+10,000,000*&'- l (,+15TB 1(,+ #
  67. copyright Fringe81 Co.,Ltd. Pull Queue   Load (4017/. ,#%Load".

    67 l :#-3857 &' 1,000 (**265)%9 /+) 265( l 265% 10,000URI)% l 265% 10,000,0004017)% l 265% 15TB)% 1265#-( Pull Queue'' Load (4017 /!$.
  68. copyright Fringe81 Co.,Ltd. GCS%),-($#!  Pull Queue BQLoad Pull Queue

    %),-( '-$  +-   Ver.3 (Pull Queue × BQ Load) Cloud Storage  BQ Load BigQuery 68 '-$ %),-( Cloud Storage  "#&* Pull Queue   +- %),-( '-$ 
  69. copyright Fringe81 Co.,Ltd. Pull Queue × BQ Load  BQ

    Load Cloud Storage Pull Queue Lease id: 3 id: 2 id: 1 Lease   BigQuery 69
  70. copyright Fringe81 Co.,Ltd.  70 l  1 100 5

     l    100
  71. copyright Fringe81 Co.,Ltd.  71 l (+* 1 100!5 

    l -(+* #!)&', "  100)&', GAE% BQ Load  %$
  72. copyright Fringe81 Co.,Ltd.  72  %  *0 0

     Streaming Insert  1MB/1, % ,-!) /0% 90  BQ Load  15TB/1+' 1000+'/1$0'- 10000+'/1(.% &#"
  73. copyright Fringe81 Co.,Ltd.  l BQ Load *$%-# l GAE+',.-#"!

    BQ Load#! l GAE&()  73
  74. copyright Fringe81 Co.,Ltd.  lGAE+TQ  , Dataflow  

      74
  75. copyright Fringe81 Co.,Ltd.    75