Slide 1

Slide 1 text

Slide 2

Slide 2 text

@ojima-h

Slide 3

Slide 3 text

4,500

Slide 4

Slide 4 text

XFLAG STUDIO • FC • XFLAG PARK • XFLAG STORE SHIBUYA • etc… • Coming soon…

Slide 5

Slide 5 text

200+

Slide 6

Slide 6 text

1 PB @ S3 • 2 TB / day • DB

Slide 7

Slide 7 text

• EMR • m4.2xlarge x 20 core • Master / Core • Redshift • ds2.8xlarge x 3 (48 TB)

Slide 8

Slide 8 text

KPI

Slide 9

Slide 9 text

Slide 10

Slide 10 text

~ 2014/11 • DAU, KPI

Slide 11

Slide 11 text

Slide 12

Slide 12 text

EMR Hive

Slide 13

Slide 13 text

EMR, Hive

Slide 14

Slide 14 text

EMR Hive

Slide 15

Slide 15 text

EMR • EMR • Hive Metastore

Slide 16

Slide 16 text

EMR → Hive Metastore

Slide 17

Slide 17 text

Hive Metastore • Hive Metastore RDS • Metastore server

Slide 18

Slide 18 text

Hive Metastore • EMR … • Spark SQL, Redshift Spectrum

Slide 19

Slide 19 text

Glue Data Catalog • Hive Metastore • Glue Data Catalog (DB …)

Slide 20

Slide 20 text

EMR Hive

Slide 21

Slide 21 text

Hive Why Hive? • SQL • Hive Metastore • Hive

Slide 22

Slide 22 text

Hive STEP1. ORC STEP2. STEP3.

Slide 23

Slide 23 text

Hive STEP1. ORC STEP2. STEP3.

Slide 24

Slide 24 text

ORC • Hive • ACID transaction, Complex Data Type

Slide 25

Slide 25 text

ORC • ORC •

Slide 26

Slide 26 text

• TEZ Engine • Cost Based Optimization • Vectorization

Slide 27

Slide 27 text

Hive STEP1. ORC STEP2. STEP3.

Slide 28

Slide 28 text

• Application Log • API Application Log (1TB/day)

Slide 29

Slide 29 text

Dynamic Partition • Dynamic Partition • API Log / Error Log / Custom Log
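
To make the idea concrete, here is a small Python sketch of what dynamic partitioning does conceptually: the target partition is chosen per row from a column value rather than named up front. This is a simulation, not Hive code. The table name `application_log`, the column `log_type`, and the sample records are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical application log records; "log_type" is an assumed column name.
records = [
    {"log_type": "api", "message": "GET /api/v1/login"},
    {"log_type": "error", "message": "timeout"},
    {"log_type": "custom", "message": "stage_clear"},
    {"log_type": "api", "message": "POST /api/v1/gacha"},
]

def partition_dir(table, record):
    """Build a Hive-style partition directory (column=value) from the row itself."""
    return f"{table}/log_type={record['log_type']}"

def route_by_partition(table, rows):
    """Group rows by their target partition, decided per row at write time."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_dir(table, row)].append(row["message"])
    return dict(partitions)

layout = route_by_partition("application_log", records)
for path in sorted(layout):
    print(path, len(layout[path]))
```

A query that filters on the partition column then only has to read the matching directory.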

Slide 30

Slide 30 text

• API Log API

Slide 31

Slide 31 text

• API Log, API • API Log

Slide 32

Slide 32 text

Sort • API Log URI • ORC index • API

Slide 33

Slide 33 text

Sort • ORC • API Log URI • INSERT OVERWRITE TABLE api_log SELECT … FROM … DISTRIBUTE BY RAND() SORT BY uri • PPD • hive.optimize.index.filter: true • hive.optimize.ppd: true • hive.optimize.ppd.storage: true
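
In the query on this slide, DISTRIBUTE BY RAND() spreads rows evenly over reducers and SORT BY uri orders the rows inside each reducer's output. The Python sketch below simulates why that ordering helps predicate pushdown. It is not ORC or Hive code: rows are cut into fixed-size groups, each group keeps the min and max `uri` (similar in spirit to ORC's min/max index), and a reader skips groups whose range cannot contain the filter value. The URIs, the group size, and the helper names are hypothetical.

```python
import random

def build_index(uris, group_size):
    """Split rows into groups and record (min, max) per group, like a min/max index."""
    groups = [uris[i:i + group_size] for i in range(0, len(uris), group_size)]
    return [(min(g), max(g)) for g in groups]

def groups_to_read(index, target):
    """Count groups whose [min, max] range may contain the target value."""
    return sum(1 for lo, hi in index if lo <= target <= hi)

# Hypothetical API log: 10 endpoints, 1,000 requests each, in shuffled arrival order.
rng = random.Random(0)
uris = [f"/api/v1/endpoint{i}" for i in range(10) for _ in range(1000)]
rng.shuffle(uris)

unsorted_index = build_index(uris, group_size=1000)
sorted_index = build_index(sorted(uris), group_size=1000)

target = "/api/v1/endpoint3"
print("groups read without sort:", groups_to_read(unsorted_index, target))
print("groups read with sort:", groups_to_read(sorted_index, target))
```

With sorted data each group covers a narrow range of URIs, so a filter on one URI touches few groups. With unsorted data nearly every group's range contains the value and has to be read.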

Slide 34

Slide 34 text

Hive STEP1. ORC STEP2. STEP3.

Slide 35

Slide 35 text

• EMR Task 1 Task 2 Task 3 Task 4

Slide 36

Slide 36 text

• Task 1 Task 2 Task 3 Task 4

Slide 37

Slide 37 text

Hive 15 →

Slide 38

Slide 38 text

EMR Hive

Slide 39

Slide 39 text

Hive

Slide 40

Slide 40 text

Slide 41

Slide 41 text

• Luigi, Airflow, Digdag
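
Workflow engines such as Luigi, Airflow, and Digdag all run tasks in dependency order. The sketch below shows that core idea using only Python's standard `graphlib` module. It does not use any of those tools' APIs, and the task names and dependencies are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: each task maps to the set of tasks it waits for.
dependencies = {
    "task1": set(),
    "task2": {"task1"},
    "task3": {"task1"},
    "task4": {"task2", "task3"},
}

def run_workflow(deps, run):
    """Run every task once, only after all of its upstream tasks have finished."""
    finished = []
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    while sorter.is_active():
        # get_ready() returns tasks whose dependencies are all done;
        # a real scheduler could run these in parallel.
        for task in sorter.get_ready():
            run(task)
            finished.append(task)
            sorter.done(task)
    return finished

order = run_workflow(dependencies, run=lambda name: print(f"running {name}"))
```

Here `task2` and `task3` become ready together once `task1` is done, which is where a workflow engine gains parallelism.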

Slide 42

Slide 42 text

Slide 43

Slide 43 text


Slide 44

Slide 44 text

EMR Hive

Slide 45

Slide 45 text


Slide 46

Slide 46 text

Slide 47

Slide 47 text

• SSH

Slide 48

Slide 48 text

→ BI

Slide 49

Slide 49 text

BI • Zeppelin • Metabase • re:dash

Slide 50

Slide 50 text

BI • ECS • Docker image • Task & Service, CloudFormation • ALB, CloudWatch Logs

Slide 51

Slide 51 text


Slide 52

Slide 52 text

SELECT … FROM (SELECT * FROM (SELECT user_id, game_id, stage_id FROM game_log WHERE CONCAT_WS('-',y,m,d) = '{{DATE}}') AS a JOIN (SELECT NVL(host_game_id, game_id) AS host_game_id, COUNT(*) AS players_num FROM game_log WHERE CONCAT_WS('-',y,m,d) = '{{DATE}}' GROUP BY NVL(host_game_id, game_id)) AS b ON a.game_id = b.host_game_id WHERE players_num > 1) …

Slide 53

Slide 53 text

user_id • game_id • host_game_id • …

Slide 54

Slide 54 text

Slide 55

Slide 55 text

user_id • game_id • host_game_id • is_multi: TRUE

Slide 56

Slide 56 text

SELECT … FROM (SELECT * FROM (SELECT user_id, game_id, stage_id FROM game_log WHERE CONCAT_WS('-',y,m,d) = '{{DATE}}') AS a JOIN (SELECT NVL(host_game_id, game_id) AS host_game_id, COUNT(*) AS players_num FROM game_log WHERE CONCAT_WS('-',y,m,d) = '{{DATE}}' GROUP BY NVL(host_game_id, game_id)) AS b ON a.game_id = b.host_game_id WHERE players_num > 1) …
SELECT … FROM game_log WHERE CONCAT_WS('-',y,m,d) = '{{DATE}}' AND is_multi …
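
As a sanity check on this kind of rewrite, the sketch below runs both query shapes on toy data with Python's `sqlite3` and compares the results. Several things are assumptions: the rows are invented, the y/m/d date filter is left out, `COALESCE` stands in for Hive's `NVL`, and `is_multi` is filled in by a one-off ETL step that applies the same `players_num > 1` rule as the join. How the real pipeline computes `is_multi` is not shown on the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE game_log (
        user_id INTEGER, game_id INTEGER, host_game_id INTEGER,
        stage_id INTEGER, is_multi INTEGER DEFAULT 0
    )
""")
# Invented rows: game 10 is hosted by user 1 and joined by users 2 and 3;
# game 20 is a solo play by user 4.
conn.executemany(
    "INSERT INTO game_log (user_id, game_id, host_game_id, stage_id) VALUES (?, ?, ?, ?)",
    [(1, 10, None, 7), (2, 11, 10, 7), (3, 12, 10, 7), (4, 20, None, 8)],
)

# Shape of the original query: self-join against a per-game player count.
JOIN_QUERY = """
    SELECT a.user_id, a.game_id, a.stage_id
    FROM (SELECT user_id, game_id, stage_id FROM game_log) AS a
    JOIN (SELECT COALESCE(host_game_id, game_id) AS host_game_id,
                 COUNT(*) AS players_num
          FROM game_log
          GROUP BY COALESCE(host_game_id, game_id)) AS b
      ON a.game_id = b.host_game_id
    WHERE b.players_num > 1
    ORDER BY a.user_id
"""
before = conn.execute(JOIN_QUERY).fetchall()

# ETL step (assumption): store the outcome of that rule once, as a flag per row.
conn.execute("""
    UPDATE game_log SET is_multi = 1
    WHERE game_id IN (
        SELECT COALESCE(host_game_id, game_id) FROM game_log
        GROUP BY COALESCE(host_game_id, game_id) HAVING COUNT(*) > 1
    )
""")

# Shape of the rewritten query: a plain filter on the precomputed flag.
after = conn.execute(
    "SELECT user_id, game_id, stage_id FROM game_log WHERE is_multi ORDER BY user_id"
).fetchall()

print(before == after, after)
```

With these rows only the host's row matches, because the join compares `a.game_id` with the host's game ID. The point of the rewrite is that the join logic runs once in the pipeline instead of in every analyst's query.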

Slide 57

Slide 57 text

⇒ Dimensional Modeling

Slide 58

Slide 58 text

Dimensional Modeling • Fact + Dimension (= Star Schema) • Fact … • Dimension … • [Diagram: one Fact table linked to four Dimension tables]
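
A star schema can be sketched in a few lines with Python's `sqlite3`: a fact table holds events with keys and numeric measures, and a dimension table holds the descriptive attributes used for grouping. The table names, columns, and rows below are invented for illustration and are not taken from the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension: descriptive attributes, one row per stage (hypothetical).
    CREATE TABLE dim_stage (stage_id INTEGER PRIMARY KEY, stage_name TEXT, difficulty TEXT);
    -- Fact: one row per play event, holding keys and numeric measures.
    CREATE TABLE fact_play (user_id INTEGER, stage_id INTEGER, play_seconds INTEGER);

    INSERT INTO dim_stage VALUES (1, 'forest', 'easy'), (2, 'volcano', 'hard');
    INSERT INTO fact_play VALUES (100, 1, 60), (101, 1, 90), (100, 2, 300);
""")

# Typical star-schema query: aggregate the fact, group by a dimension attribute.
rows = conn.execute("""
    SELECT d.difficulty, COUNT(*) AS plays, SUM(f.play_seconds) AS total_seconds
    FROM fact_play AS f
    JOIN dim_stage AS d ON d.stage_id = f.stage_id
    GROUP BY d.difficulty
    ORDER BY d.difficulty
""").fetchall()
print(rows)
```

Every analysis query then has the same shape: one fact table joined to whichever dimensions the question needs.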

Slide 59

Slide 59 text

Dimensional Modeling

Slide 60

Slide 60 text


Slide 61

Slide 61 text

• SQL • Hive

Slide 62

Slide 62 text

Slide 63

Slide 63 text

Data-QA • Login API vs. users.last_login_time • NULL • COUNT(*) vs. COUNT(col) • UnitTest
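
The two checks named on this slide can be written as small unit-test-style functions. The sketch below uses Python's `sqlite3` on toy data. The table `login_api_log`, its columns, and all rows are hypothetical. Only `users.last_login_time` and the COUNT(*) vs. COUNT(col) comparison come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER, last_login_time TEXT);
    CREATE TABLE login_api_log (user_id INTEGER, requested_at TEXT);  -- hypothetical
    INSERT INTO users VALUES
        (1, '2018-01-01 10:00:00'), (2, '2018-01-01 11:00:00'), (3, NULL);
    INSERT INTO login_api_log VALUES
        (1, '2018-01-01 09:00:00'), (1, '2018-01-01 10:00:00'),
        (2, '2018-01-01 11:00:00');
""")

def null_count(conn, table, column):
    """COUNT(*) counts every row, COUNT(col) skips NULLs; the difference is the NULL count.

    Table and column names are trusted constants here, not user input.
    """
    total, non_null = conn.execute(
        f"SELECT COUNT(*), COUNT({column}) FROM {table}"
    ).fetchone()
    return total - non_null

def login_mismatches(conn):
    """Users whose latest Login API request differs from users.last_login_time."""
    return conn.execute("""
        SELECT u.user_id
        FROM users AS u
        JOIN (SELECT user_id, MAX(requested_at) AS latest
              FROM login_api_log GROUP BY user_id) AS l
          ON l.user_id = u.user_id
        WHERE u.last_login_time IS NOT l.latest
        ORDER BY u.user_id
    """).fetchall()

print(null_count(conn, "users", "last_login_time"))
print(login_mismatches(conn))
```

A scheduled job could run checks like these after each load and fail the pipeline when a count or comparison is off.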

Slide 64

Slide 64 text

EMR → Hive metastore • Hive • BI

Slide 65

Slide 65 text

Slide 66

Slide 66 text

KPI

Slide 67

Slide 67 text

KPI

Slide 68

Slide 68 text

Slide 69

Slide 69 text


Slide 70

Slide 70 text

No content

Slide 71

Slide 71 text

• https://www.monster-strike.com/promotion/12shi/#yosou • YouTube

Slide 72

Slide 72 text

25,000 records / s • Kinesis: 1 shard • EMR: c3.2xlarge x 2

Slide 73

Slide 73 text

No content

Slide 74

Slide 74 text

GET • https://www.monster-strike.com/promotion/kiwami2017/unkyoku_ice.html

Slide 75

Slide 75 text

GET: 25,000 records / s • DynamoDB: 1 RCU / 1 WCU

Slide 76

Slide 76 text

No content

Slide 77

Slide 77 text

BINGO • https://www.monster-strike.com/promotion/winter2017/bingo.html

Slide 78

Slide 78 text

BINGO • Redshift • SQL • BINGO • 1~10

Slide 79

Slide 79 text

No content

Slide 80

Slide 80 text

• JOIN • Lambda, Flink

Slide 81

Slide 81 text

• Kinesis Firehose Redshift • Redshift JOIN

Slide 82

Slide 82 text

• AWS • S3, Kinesis

Slide 83

Slide 83 text

Slide 84

Slide 84 text

• ML / AI

Slide 85

Slide 85 text
