Slide 1

Slide 1 text

      

Slide 2

Slide 2 text

@ojima-h

Slide 3

Slide 3 text

   4,500 

Slide 4

Slide 4 text

XFLAG STUDIO • FC • XFLAG PARK • XFLAG STORE SHIBUYA • etc… • Coming soon…

Slide 5

Slide 5 text

    200+

Slide 6

Slide 6 text

1 PB @ S3 • 2 TB / day • DB

Slide 7

Slide 7 text

EMR • m4.2xlarge x 20 core • Master / Core • Redshift • ds2.8xlarge x 3 (48 TB)

Slide 8

Slide 8 text

KPI

Slide 9

Slide 9 text



Slide 10

Slide 10 text

~ 2014/11 • DAU, KPI

Slide 11

Slide 11 text

  

Slide 12

Slide 12 text

EMR • Hive

Slide 13

Slide 13 text

EMR • Hive

Slide 14

Slide 14 text

EMR • Hive

Slide 15

Slide 15 text

 EMR  16 • EMR  • Hive Metastore  

Slide 16

Slide 16 text

EMR → Hive Metastore

Slide 17

Slide 17 text

Hive Metastore 18 • Hive Metastore: RDS • Metastore server

Slide 18

Slide 18 text

Hive Metastore • EMR … • Spark SQL, Redshift Spectrum

Slide 19

Slide 19 text

Glue Data Catalog • Hive Metastore • Glue Data Catalog (DB …)

Slide 20

Slide 20 text

EMR • Hive

Slide 21

Slide 21 text

Hive • Why Hive? • SQL • Hive Metastore

Slide 22

Slide 22 text

Hive • STEP1. ORC • STEP2. • STEP3.

Slide 23

Slide 23 text

Hive • STEP1. ORC • STEP2. • STEP3.

Slide 24

Slide 24 text

ORC • Hive • ACID transaction, Complex Data Type

Slide 25

Slide 25 text

ORC •  ORC   •   

Slide 26

Slide 26 text

TEZ Engine • Cost Based Optimization • Vectorization

Slide 27

Slide 27 text

Hive • STEP1. ORC • STEP2. • STEP3.

Slide 28

Slide 28 text

Application Log • API • Application Log (1 TB/day)

Slide 29

Slide 29 text

Dynamic Partition • API Log / Error Log / Custom Log
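
A rough illustration of the dynamic-partition idea on this slide, assuming it means that one application log is split into API / Error / Custom partitions according to values found in each row. The column names `log_type` and `dt` and the sample records are hypothetical. This is a plain-Python model of the routing, not Hive itself.

```python
from collections import defaultdict

def partition_path(row, partition_cols):
    """Build a Hive-style partition directory, e.g. 'log_type=api/dt=2018-01-01'."""
    return "/".join(f"{col}={row[col]}" for col in partition_cols)

def write_dynamic_partitions(rows, partition_cols):
    """Group rows by partition values taken from the data itself (not fixed up front)."""
    partitions = defaultdict(list)
    for row in rows:
        # Partition columns live in the directory name, not in the stored payload.
        payload = {k: v for k, v in row.items() if k not in partition_cols}
        partitions[partition_path(row, partition_cols)].append(payload)
    return dict(partitions)

# Hypothetical mixed application log.
application_log = [
    {"log_type": "api", "dt": "2018-01-01", "message": "GET /login"},
    {"log_type": "error", "dt": "2018-01-01", "message": "timeout"},
    {"log_type": "custom", "dt": "2018-01-01", "message": "stage_clear"},
    {"log_type": "api", "dt": "2018-01-01", "message": "GET /stage"},
]

partitions = write_dynamic_partitions(application_log, ["log_type", "dt"])
for path in sorted(partitions):
    print(path, len(partitions[path]))
```

In Hive the same effect is usually obtained by listing the partition columns last in the SELECT of an INSERT with a PARTITION clause that leaves their values unspecified.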

Slide 30

Slide 30 text

API Log • API

Slide 31

Slide 31 text

API Log • API

Slide 32

Slide 32 text

Sort • API Log: URI • ORC index • API

Slide 33

Slide 33 text

Sort • ORC • API Log: URI • INSERT OVERWRITE TABLE api_log SELECT … FROM … DISTRIBUTE BY RAND() SORT BY uri • PPD • hive.optimize.index.filter: true • hive.optimize.ppd: true • hive.optimize.ppd.storage: true
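
Why sorting by `uri` helps predicate pushdown can be shown with a toy model. The sketch below assumes a reader that keeps only min/max statistics per fixed-size block (called a "stripe" here) and skips blocks whose range cannot contain the requested URI. The block size, URIs, and row counts are made up, and real ORC indexes are more detailed than this.

```python
import random

def build_stripes(rows, stripe_size):
    """Split rows into fixed-size stripes and record the min/max uri of each."""
    stripes = []
    for start in range(0, len(rows), stripe_size):
        chunk = rows[start:start + stripe_size]
        uris = [row["uri"] for row in chunk]
        stripes.append({"min": min(uris), "max": max(uris), "rows": chunk})
    return stripes

def stripes_to_read(stripes, uri):
    """Count stripes whose [min, max] range could contain the requested uri."""
    return sum(1 for s in stripes if s["min"] <= uri <= s["max"])

random.seed(0)
# Hypothetical API log: 10,000 requests spread over 20 endpoints.
endpoints = [f"/api/v1/endpoint_{i:02d}" for i in range(20)]
rows = [{"uri": random.choice(endpoints), "status": 200} for _ in range(10_000)]

unsorted_stripes = build_stripes(rows, stripe_size=500)
sorted_stripes = build_stripes(sorted(rows, key=lambda r: r["uri"]), stripe_size=500)

target = "/api/v1/endpoint_07"
print("unsorted:", stripes_to_read(unsorted_stripes, target), "of", len(unsorted_stripes))
print("sorted:  ", stripes_to_read(sorted_stripes, target), "of", len(sorted_stripes))
```

With unsorted input nearly every block spans the whole URI range, so almost nothing can be skipped. After sorting, rows for one URI sit in a few adjacent blocks.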

Slide 34

Slide 34 text

Hive • STEP1. ORC • STEP2. • STEP3.

Slide 35

Slide 35 text

 •   EMR    Task 1 Task 2 Task 3 Task 4

Slide 36

Slide 36 text

 •    Task 1 Task 2 Task 3 Task 4  •   • 

Slide 37

Slide 37 text

Hive  15 → 

Slide 38

Slide 38 text

EMR • Hive

Slide 39

Slide 39 text

 %$#   Hive & •  • "!  • "!  

Slide 40

Slide 40 text

    → 

Slide 41

Slide 41 text

Luigi, Airflow, Digdag
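
The tools named on this slide all run tasks in dependency order. A minimal sketch of that idea, using only the Python standard library, is shown below. It does not use the API of Luigi, Airflow, or Digdag, and the task names and dependencies are hypothetical.

```python
from graphlib import TopologicalSorter

def run_workflow(tasks, depends_on):
    """Run each task after all of its dependencies; return (name, result) pairs."""
    order = list(TopologicalSorter(depends_on).static_order())
    return [(name, tasks[name]()) for name in order]

# Hypothetical three-step pipeline.
tasks = {
    "load_logs": lambda: "loaded",
    "aggregate_kpi": lambda: "aggregated",
    "export_to_bi": lambda: "exported",
}
# Each key maps to the set of tasks that must finish first.
depends_on = {
    "load_logs": set(),
    "aggregate_kpi": {"load_logs"},
    "export_to_bi": {"aggregate_kpi"},
}

results = run_workflow(tasks, depends_on)
for name, result in results:
    print(name, result)
```

A real workflow engine adds scheduling, retries, and parallel execution of independent tasks on top of this ordering.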

Slide 42

Slide 42 text

  

Slide 43

Slide 43 text

 ,54389)  • ',54389197) • .-+' &%"!$#20/& • 20/'6*26 &%( !&%"!

Slide 44

Slide 44 text

EMR • Hive

Slide 45

Slide 45 text

  !  !

Slide 46

Slide 46 text

       

Slide 47

Slide 47 text

SSH

Slide 48

Slide 48 text

→ BI

Slide 49

Slide 49 text

BI • Zeppelin • Metabase • re:dash

Slide 50

Slide 50 text

BI • ECS • Docker image • Task & Service, CloudFormation • ALB, CloudWatch Logs

Slide 51

Slide 51 text

  !    !

Slide 52

Slide 52 text

SELECT …
FROM (
  SELECT *
  FROM (
    SELECT user_id, game_id, stage_id
    FROM game_log
    WHERE CONCAT_WS('-',y,m,d) = '{{DATE}}'
  ) AS a
  JOIN (
    SELECT NVL(host_game_id, game_id) AS host_game_id, COUNT(*) AS players_num
    FROM game_log
    WHERE CONCAT_WS('-',y,m,d) = '{{DATE}}'
    GROUP BY NVL(host_game_id, game_id)
  ) AS b ON a.game_id = b.host_game_id
  WHERE players_num > 1
) …

Slide 53

Slide 53 text

user_id • game_id • host_game_id

Slide 54

Slide 54 text

   → 

Slide 55

Slide 55 text

user_id • game_id • host_game_id • is_multi: TRUE

Slide 56

Slide 56 text

SELECT …
FROM (
  SELECT *
  FROM (
    SELECT user_id, game_id, stage_id
    FROM game_log
    WHERE CONCAT_WS('-',y,m,d) = '{{DATE}}'
  ) AS a
  JOIN (
    SELECT NVL(host_game_id, game_id) AS host_game_id, COUNT(*) AS players_num
    FROM game_log
    WHERE CONCAT_WS('-',y,m,d) = '{{DATE}}'
    GROUP BY NVL(host_game_id, game_id)
  ) AS b ON a.game_id = b.host_game_id
  WHERE players_num > 1
) …

SELECT …
FROM game_log
WHERE CONCAT_WS('-',y,m,d) = '{{DATE}}' AND is_multi …
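
The rewrite on this slide can be checked on a small scale. The sketch below assumes `is_multi` is precomputed once with the same rule as the self-join: a row is flagged when its `game_id` is the host id of a game with more than one player. It runs on SQLite rather than Hive, so `NVL` becomes `COALESCE` and the `y`, `m`, `d` partition columns become a single `dt` column. The table `game_log_enriched` and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE game_log (
        dt TEXT, user_id INTEGER, game_id TEXT,
        host_game_id TEXT, stage_id INTEGER
    )
""")
conn.executemany(
    "INSERT INTO game_log VALUES (?, ?, ?, ?, ?)",
    [
        ("2018-01-01", 1, "g1", None, 10),  # host of a two-player game
        ("2018-01-01", 2, "g2", "g1", 10),  # guest who joined g1
        ("2018-01-01", 3, "g3", None, 11),  # solo play
    ],
)

# Original shape: every analyst repeats the self-join.
JOIN_QUERY = """
    SELECT a.user_id, a.game_id, a.stage_id
    FROM (SELECT user_id, game_id, stage_id FROM game_log WHERE dt = :dt) AS a
    JOIN (SELECT COALESCE(host_game_id, game_id) AS host_game_id,
                 COUNT(*) AS players_num
          FROM game_log
          WHERE dt = :dt
          GROUP BY COALESCE(host_game_id, game_id)) AS b
      ON a.game_id = b.host_game_id
    WHERE b.players_num > 1
    ORDER BY a.user_id
"""

# One-off preprocessing step that stores the join result as a flag column.
conn.execute("""
    CREATE TABLE game_log_enriched AS
    SELECT g.*, COALESCE(m.players_num > 1, 0) AS is_multi
    FROM game_log AS g
    LEFT JOIN (SELECT dt, COALESCE(host_game_id, game_id) AS host_game_id,
                      COUNT(*) AS players_num
               FROM game_log
               GROUP BY dt, COALESCE(host_game_id, game_id)) AS m
      ON g.dt = m.dt AND g.game_id = m.host_game_id
""")

# Simplified shape: a plain filter on the precomputed flag.
FLAG_QUERY = """
    SELECT user_id, game_id, stage_id
    FROM game_log_enriched
    WHERE dt = :dt AND is_multi
    ORDER BY user_id
"""

params = {"dt": "2018-01-01"}
join_rows = conn.execute(JOIN_QUERY, params).fetchall()
flag_rows = conn.execute(FLAG_QUERY, params).fetchall()
print(join_rows == flag_rows, flag_rows)
```

Both queries return the same rows, but the second one no longer requires the reader to know how `host_game_id` encodes multiplayer sessions.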

Slide 57

Slide 57 text

⇒ Dimensional Modeling

Slide 58

Slide 58 text

Dimensional Modeling • Fact and Dimension (= Star Schema) • Fact … • Dimension … • (Diagram: one Fact table linked to four Dimension tables)
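
A small star schema makes the Fact / Dimension split concrete. In the sketch below the fact table holds one row per play with keys and a measurement, and the dimension tables hold descriptive attributes used for grouping. All table names, column names, and values are hypothetical, and SQLite stands in for Hive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes, one row per entity.
    CREATE TABLE dim_user  (user_key INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE dim_stage (stage_key INTEGER PRIMARY KEY, difficulty TEXT);
    -- Fact table: one row per event, with dimension keys and a measure.
    CREATE TABLE fact_play (user_key INTEGER, stage_key INTEGER, clear_time_sec INTEGER);

    INSERT INTO dim_user  VALUES (1, 'JP'), (2, 'US');
    INSERT INTO dim_stage VALUES (10, 'easy'), (11, 'hard');
    INSERT INTO fact_play VALUES (1, 10, 60), (1, 11, 180), (2, 11, 240);
""")

# Typical analysis: join the fact to a dimension and aggregate by an attribute.
rows = conn.execute("""
    SELECT s.difficulty,
           COUNT(*) AS plays,
           AVG(f.clear_time_sec) AS avg_clear_time_sec
    FROM fact_play AS f
    JOIN dim_stage AS s ON f.stage_key = s.stage_key
    GROUP BY s.difficulty
    ORDER BY s.difficulty
""").fetchall()

for row in rows:
    print(row)
```

Every analysis query then has the same shape: start from the fact table, join the dimensions it needs, and group by their attributes.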

Slide 59

Slide 59 text

Dimensional Modeling

Slide 60

Slide 60 text

    !  !

Slide 61

Slide 61 text

SQL • Hive

Slide 62

Slide 62 text

    →  

Slide 63

Slide 63 text

Data-QA • Login API vs. users.last_login_time • NULL • COUNT(*) vs. COUNT(col) • UnitTest
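
The two checks named on this slide can be sketched as small SQL assertions. The example assumes that "COUNT(*) vs. COUNT(col)" is used to count NULLs (COUNT(col) skips NULL values) and that "Login API vs. users.last_login_time" compares the latest login request per user with the stored timestamp. The table `login_api_log`, its columns, and the sample rows are hypothetical.

```python
import sqlite3

def null_count(conn, table, column):
    """Number of NULLs in a column, derived as COUNT(*) - COUNT(column)."""
    total, non_null = conn.execute(
        f"SELECT COUNT(*), COUNT({column}) FROM {table}"
    ).fetchone()
    return total - non_null

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, last_login_time TEXT)")
conn.execute("CREATE TABLE login_api_log (user_id INTEGER, requested_at TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [
    (1, "2018-01-01 10:00:00"),
    (2, None),  # deliberately broken row
    (3, "2018-01-02 09:30:00"),
])
conn.executemany("INSERT INTO login_api_log VALUES (?, ?)", [
    (1, "2018-01-01 10:00:00"),
    (2, "2018-01-01 12:00:00"),
    (3, "2018-01-02 09:30:00"),
])

# Check 1: NULL detection via COUNT(*) vs. COUNT(col).
missing = null_count(conn, "users", "last_login_time")

# Check 2: two sources that should agree on each user's last login.
mismatched = conn.execute("""
    SELECT u.user_id
    FROM users AS u
    JOIN (SELECT user_id, MAX(requested_at) AS last_seen
          FROM login_api_log
          GROUP BY user_id) AS l
      ON u.user_id = l.user_id
    WHERE u.last_login_time IS NULL OR u.last_login_time <> l.last_seen
    ORDER BY u.user_id
""").fetchall()

print("NULL last_login_time:", missing)
print("mismatched users:", mismatched)
```

Run as unit tests after each load, checks like these flag a broken row before it reaches a dashboard.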

Slide 64

Slide 64 text

EMR → Hive Metastore • Hive • BI

Slide 65

Slide 65 text



Slide 66

Slide 66 text

  KPI    

Slide 67

Slide 67 text

  KPI    

Slide 68

Slide 68 text

  

Slide 69

Slide 69 text

;A01B04=*-! ,(?1*: & " 1.   2. /B3)@AGET,B. 7=A5+28 

Slide 70

Slide 70 text

No content

Slide 71

Slide 71 text

https://www.monster-strike.com/promotion/12shi/#yosou • YouTube

Slide 72

Slide 72 text

25,000 records / s • Kinesis: 1 shard • EMR: c3.2xlarge x 2

Slide 73

Slide 73 text

No content

Slide 74

Slide 74 text

GET • https://www.monster-strike.com/promotion/kiwami2017/unkyoku_ice.html

Slide 75

Slide 75 text

GET • 25,000 records / s • DynamoDB: 1 RCU / 1 WCU

Slide 76

Slide 76 text

No content

Slide 77

Slide 77 text

BINGO • https://www.monster-strike.com/promotion/winter2017/bingo.html

Slide 78

Slide 78 text

BINGO • Redshift • SQL • 1~10

Slide 79

Slide 79 text

No content

Slide 80

Slide 80 text

JOIN • Lambda, Flink

Slide 81

Slide 81 text

Kinesis Firehose → Redshift • Redshift: JOIN

Slide 82

Slide 82 text

>1?72=(#$ • (-"$ *  • AWS )4@;60+/ .&%(%. • S3 , Kinesis 0 .&% 569=(0.&' >1?72=01:<83(.&%!

Slide 83

Slide 83 text



Slide 84

Slide 84 text

ML / AI

Slide 85

Slide 85 text

!