Upgrade to Pro — share decks privately, control downloads, hide ads and more …

GAEを用いたBQ Load戦略/gae_bq_load_strategy

fuzyco
April 24, 2018

GAEを用いたBQ Load戦略/gae_bq_load_strategy

bq_sushi#7での発表資料

fuzyco

April 24, 2018
Tweet

More Decks by fuzyco

Other Decks in Technology

Transcript

  1. copyright Fringe81 Co.,Ltd.
    GAEBQ Load
    Fringe81


    1

    View Slide

  2. copyright Fringe81 Co.,Ltd.

    l (Hiroki Fujino)
    l @Fringe81
    l

    l
    2

    View Slide

  3. copyright Fringe81 Co.,Ltd.
    /
    Ø
    Ø
    l GAE"
    l BQ Load
    l
    #!
    l GAETaskQueue
    3

    View Slide

  4. copyright Fringe81 Co.,Ltd. 4

    View Slide

  5. copyright Fringe81 Co.,Ltd.
    Columva
    l (2017/11~)
    l SaaS
    l ETL
    5

    View Slide

  6. copyright Fringe81 Co.,Ltd.
    Columva
    6
    l (2017/11~)
    l SaaS
    l ETL
    Columva GAEBQ
    GCP

    View Slide

  7. copyright Fringe81 Co.,Ltd.


    App Engine Big Query App Engine


    7
    /

    View Slide

  8. copyright Fringe81 Co.,Ltd.

    lStandard Environment
    lPaaS
    l
    lJava8/Scala
    l

    8

    App Engine Big Query App Engine
    /

    View Slide

  9. copyright Fringe81 Co.,Ltd.

    l
    l&"#
    l&!%
    lAPI
    l $
    9


    App Engine Big Query App Engine
    /

    View Slide

  10. copyright Fringe81 Co.,Ltd.
    BigQuery
    Big Query
    In Out
    10

    View Slide

  11. copyright Fringe81 Co.,Ltd.
    BigQuery
    lSQL
    l
    Big Query
    In Out
    11

    View Slide

  12. copyright Fringe81 Co.,Ltd.
    BigQuery
    l


    Big Query
    In Out
    12

    View Slide

  13. copyright Fringe81 Co.,Ltd.

    13

    View Slide

  14. copyright Fringe81 Co.,Ltd.
    BQ


    14

    View Slide

  15. copyright Fringe81 Co.,Ltd.


    15

    View Slide

  16. copyright Fringe81 Co.,Ltd.
    BigQuery


    16


    View Slide

  17. copyright Fringe81 Co.,Ltd.

    17



    BigQuery


    View Slide

  18. copyright Fringe81 Co.,Ltd. 18
    %#" TaskQueue
    GAE SE& !$60

    GAE

    View Slide

  19. copyright Fringe81 Co.,Ltd.
    App Engine Push Queue Push Queue
    App Engine App Engine
    1 3
    2
    2
    Enqueue


    Push Queue

    GAE
    2
    Push
    3
    Enqueue
    3
    Push
    19

    View Slide

  20. copyright Fringe81 Co.,Ltd.
    App Engine Push Queue
    "! Enqueue
    GAE
    "! #$
    Push Queue



    20

    View Slide

  21. copyright Fringe81 Co.,Ltd.
    App Engine Push Queue

    *)'#Enqueue
    GAE


    *)'#&,.!
    Push Queue # "!


    )(.-$%+

    21

    View Slide

  22. copyright Fringe81 Co.,Ltd.

    BQ


    22
    BigQuery


    View Slide

  23. copyright Fringe81 Co.,Ltd.
    !"

    #
    23

    View Slide

  24. copyright Fringe81 Co.,Ltd.


    l Ver.1 (Streaming Insert)
    l Ver.2 (BQ Load)
    l Ver.3 (Pull Queue BQ Load)
    24

    View Slide

  25. copyright Fringe81 Co.,Ltd.
    l Ver.1 (Streaming Insert)
    l Ver.2 (BQ Load)
    l Ver.3 (Pull Queue BQ Load)


    25

    View Slide

  26. copyright Fringe81 Co.,Ltd.
    Streaming InsertBQ


    Streaming Insert
    Ver.1(Streaming Insert)
    26
    BigQuery

    View Slide

  27. copyright Fringe81 Co.,Ltd.
    Streaming Insert
    l BigQuery

    l $0.01/200MB
    BigQuery
    Streaming Insert Table User
    id INTEGER
    name STRING
    27

    User(id: Long, name: String)

    View Slide

  28. copyright Fringe81 Co.,Ltd.

    28

    View Slide

  29. copyright Fringe81 Co.,Ltd.

    BQ


    BQ
    29

    View Slide

  30. copyright Fringe81 Co.,Ltd.

    BQ!" !"

    !"
    !" BQ!"
    !"
    ##
    30

    View Slide

  31. copyright Fringe81 Co.,Ltd.
    Streaming Insert

    SELECT DELETE
    BigQuery
    SELECT
    DELETE
    Streaming Insert
    31

    View Slide

  32. copyright Fringe81 Co.,Ltd.
    Streaming Insert
    streaming insert
    BigTable
    select BigTable
    [https://cloud.google.com/blog/big-data/2017/06/life-of-a-bigquery-streaming-insert]
    32

    View Slide

  33. copyright Fringe81 Co.,Ltd.
    Streaming Insert
    streaming insert # BigTable!"
    select
    BigTable!"
    33
    deleteupdate 90
    [https://cloud.google.com/blog/big-data/2017/06/life-of-a-bigquery-streaming-insert]

    View Slide

  34. copyright Fringe81 Co.,Ltd.
    Streaming Insert
    34
    l%*$(1("!#& 1MB)
    l()$ '

    %*$ %*$

    View Slide

  35. copyright Fringe81 Co.,Ltd.
    Streaming Insert
    90

    35

    View Slide

  36. copyright Fringe81 Co.,Ltd.
    Streaming Insert
    90
    BQ


    36

    View Slide

  37. copyright Fringe81 Co.,Ltd.
    l Ver.1 (Streaming Insert)
    l Ver.2 (BQ Load)
    l Ver.3 (Pull Queue BQ Load)


    37

    View Slide

  38. copyright Fringe81 Co.,Ltd.
    GCS
    BQ LoadBQ!
    !
    !
    Cloud Storage
    Ver.2 (BQ Load)

    BQ Load
    BigQuery
    38

    View Slide

  39. copyright Fringe81 Co.,Ltd.
    GCS
    BQ LoadBQ!
    s
    Ver.2 (BQ Load)
    39
    !
    !
    Cloud Storage

    BQ Load
    BigQuery

    View Slide

  40. copyright Fringe81 Co.,Ltd.
    BQ Load
    40
    lGoogle
    lSDK
    (Java, Python, Go…)

    View Slide

  41. copyright Fringe81 Co.,Ltd.
    BQ Load
    CSV
    JSON
    BigQuery
    Load
    1

    BQ

    Load
    41
    Table User
    id INTEGER
    name STRING
    id, name
    1, “hoge”
    2, “fuga”
    {“id”: 1, “name”: “hoge”}
    {“id”: 2, “name”: “fuga”}

    View Slide

  42. copyright Fringe81 Co.,Ltd.
    BQ Load
    GCP

    Cloud Storage
    Cloud Dataflow
    Load
    Load
    BigQuery
    42

    View Slide

  43. copyright Fringe81 Co.,Ltd.
    GCS
    BQ LoadBQ!
    Ver.2 (BQ Load)
    43
    !
    !
    Cloud Storage

    BQ Load
    BigQuery

    View Slide

  44. copyright Fringe81 Co.,Ltd.
    Google Cloud Storage
    !, )$(./#
    GCS <=> BQ'/%

    "$&+'/%*-
    GCSJSON*-
    44

    View Slide

  45. copyright Fringe81 Co.,Ltd.
    GCS => BQ
    Cloud Storage
    BQ Load
    hoge1.json hoge_table
    hoge2.json
    hoge3.json
    hoge_table
    hoge_table
    BigQuery
    45

    View Slide

  46. copyright Fringe81 Co.,Ltd.
    BQ Load
    l Streaming Insert%BQ Load%
    l -.,+* & )
    l $-.,+
    *)! )
    l ACID +
    u *-.,% #)'#")(Atomicity)
    u -.,%)("(Consistency)
    46

    View Slide

  47. copyright Fringe81 Co.,Ltd.


    47

    View Slide

  48. copyright Fringe81 Co.,Ltd.
    BQ Load
    error=BigQueryError{reason=rateLimitExceeded,
    location=table.write, message=Exceeded rate limits:
    too many table update operations for this table…
    48

    View Slide

  49. copyright Fringe81 Co.,Ltd.
    rateLimitExceeded
    49

    View Slide

  50. copyright Fringe81 Co.,Ltd.
    GCS
    BQ LoadBQ %
    Ver.2 (BQ Load)
    Load
    50
    %
    "$%!
    Cloud Storage

    BQ Load
    BigQuery
    #

    View Slide

  51. copyright Fringe81 Co.,Ltd.

    BQ Load

    BQ Load

    51

    View Slide

  52. copyright Fringe81 Co.,Ltd.
    BQ Load
    l 15'"%10 $"
    l *
    '"% 1,000 $"( )
    l *
    #& (!'$"#& )
    10,000 $"( )

    52

    View Slide

  53. copyright Fringe81 Co.,Ltd.
    BQ Load
    l 15)0836!10"%%/53$
    l ;)0836! 1,000 "%%/53$9
    +&)
    l ;)47/,-19.28/53+"47/,-1:!
    10,000 "%%/53$9
    +&:

    53
    "#"
    +(Load*'!

    View Slide

  54. copyright Fringe81 Co.,Ltd.
    BQ Load
    l 15$-614 10 ""+31!
    l 9$-614 1,000 ""+31!7
    '#)
    l 9$25+()/7*06+31' 25+()/8
    10,000 ""+31!7
    '#8

    54
    .6,&%

    View Slide

  55. copyright Fringe81 Co.,Ltd.
    BQ Load
    l 15"*3.110 (0.
    l 6
    "*3.1 1,000 (0.4 $ )
    l 6
    "/2(%&,4'-3(0.$/2(%&,5
    10,000 (0.4 $ 5

    55
    +3)$!
    BQLoad#7

    View Slide

  56. copyright Fringe81 Co.,Ltd.
    BQ Load

    Cloud Storage
    gs://bucket/dir/*
    [gs://bucket/dir/test1.txt,
    gs://bucket/dir/test2.txt]
    [gs://bucket/dir1/*,
    gs://bucket/dir2/*]








    Load
    Load
    Load
    BigQuery
    56

    View Slide

  57. copyright Fringe81 Co.,Ltd.
    BQ Load

    Cloud Storage
    gs://bucket/dir/*
    [gs://bucket/dir/test1.txt,
    gs://bucket/dir/test2.txt]
    [gs://bucket/dir1/*,
    gs://bucket/dir2/*]


    1


    Load
    Load
    Load
    BigQuery
    57


    View Slide

  58. copyright Fringe81 Co.,Ltd.
    BQ Load

    Load


    l 10,000URI
    l 10,000,000
    l 15TB

    58

    View Slide

  59. copyright Fringe81 Co.,Ltd.

    *1&%BQ Load1"0#!0
    Load*62391!0 .&' Load%0
    59
    l !&/5:79$() 1,000 *,,487+';1 -)
    487*
    l 487'
    10,000URI+'
    l 487'
    10,000,0006239+'
    l 487'
    15TB+'
    1487&/*

    View Slide

  60. copyright Fringe81 Co.,Ltd.
    l Ver.1 (Streaming Insert)
    l Ver.2 (BQ Load)
    l Ver.3 (Pull Queue BQ Load)


    60

    View Slide

  61. copyright Fringe81 Co.,Ltd.
    GCS%),-($#!
    Pull QueueBQLoad

    BQ Load
    Pull Queue

    +-
    %),-(
    '-$
    Ver.3
    (Pull Queue × BQ Load)
    Cloud Storage BigQuery
    61
    Pull Queue
    %),-(
    '-$ +-
    '-$
    %),-(
    Cloud Storage
    "#&*

    View Slide

  62. copyright Fringe81 Co.,Ltd.
    GCS%),-($#!
    Pull QueueBQLoad

    BQ Load
    Ver.3
    (Pull Queue × BQ Load)
    Cloud Storage BigQuery
    62
    Pull Queue
    %),-(
    '-$ +-
    '-$
    %),-(
    Cloud Storage
    "#&*
    Pull Queue

    +-
    %),-(
    '-$

    View Slide

  63. copyright Fringe81 Co.,Ltd.
    Pull Queue
    &%$# Queue !
    "

    Pull Queue
    &%$3
    &%$1
    &%$2
    Enqueue
    &%$#

    63

    View Slide

  64. copyright Fringe81 Co.,Ltd.
    Pull Queue


    Enqueue
    Pull Queue
    id: 3
    id: 2
    id: 1
    3(payload: 3)
    1(payload: 1)
    2(payload: 2)
    Enqueue
    64

    View Slide

  65. copyright Fringe81 Co.,Ltd.
    Pull Queue
    leasepayload

    Pull Queue
    Lease
    3(payload: 3)
    1(payload: 1)
    2(payload: 2)
    id: 3
    id: 2
    id: 1
    65

    View Slide

  66. copyright Fringe81 Co.,Ltd.
    Pull Queue

    Load *&'-%$
    "Load$
    66
    l 0#).+- 1,000 (,+/%!)
    (,+
    l (,+10,000URI
    l (,+10,000,000*&'-
    l (,+15TB
    1(,+#

    View Slide

  67. copyright Fringe81 Co.,Ltd.
    Pull Queue

    Load(4017/. ,#%Load".
    67
    l :#-3857 &' 1,000 (**265)%9/+)
    265(
    l 265%
    10,000URI)%
    l 265%
    10,000,0004017)%
    l 265%
    15TB)%
    1265#-(
    Pull Queue''
    Load(4017
    /!$.

    View Slide

  68. copyright Fringe81 Co.,Ltd.
    GCS%),-($#!
    Pull QueueBQLoad
    Pull Queue
    %),-(
    '-$ +-
    Ver.3
    (Pull Queue × BQ Load)
    Cloud Storage

    BQ Load
    BigQuery
    68
    '-$
    %),-(
    Cloud Storage
    "#&*
    Pull Queue

    +-
    %),-(
    '-$

    View Slide

  69. copyright Fringe81 Co.,Ltd.
    Pull Queue × BQ Load

    BQ Load
    Cloud Storage
    Pull Queue
    Lease
    id: 3
    id: 2
    id: 1 Lease


    BigQuery
    69

    View Slide

  70. copyright Fringe81 Co.,Ltd.

    70
    l 1
    100 5
    l 100

    View Slide

  71. copyright Fringe81 Co.,Ltd.

    71
    l (+*1 100!5
    l -(+* #!)&'," 100)&',
    GAE% BQ Load

    %$

    View Slide

  72. copyright Fringe81 Co.,Ltd.

    72
    % *0 0
    Streaming
    Insert

    1MB/1, % ,-!) /0%
    90

    BQ Load 15TB/1+'
    1000+'/1$0'-
    10000+'/1(.%
    "

    View Slide

  73. copyright Fringe81 Co.,Ltd.

    l BQ Load *$%-#


    l GAE+',.-#"!
    BQ Load#!
    l GAE&()

    73

    View Slide

  74. copyright Fringe81 Co.,Ltd.

    lGAE+TQ
    ,
    Dataflow
    74

    View Slide

  75. copyright Fringe81 Co.,Ltd.


    75

    View Slide