Slide 1

Slide 1 text

Python + AWS EMR: Poor Man's Log Aggregation
PyCon JP 2014
Akira Chiku

Slide 2

Slide 2 text

achiku
Name: Akira Chiku
Twitter: @_achiku
GitHub: @achiku
[…]: search for "Akira Chiku Fire"
Occupation: Engineer @kanmu

Slide 3

Slide 3 text

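Goal
Ø Share a case study that focuses on "why this architecture?"
Ø Share ways of using Python + EMR that aim beyond the introductory level
I want to hear everyone's stories (that is what I am after personally)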

Slide 4

Slide 4 text

Kanmu Business
Ø Payment data analysis in partnership with card companies
Ø Delivery of coupons linked to cards
Ø Card Linked Offer (CLO)

Slide 5

Slide 5 text

$BSE-JOLFE0FS

Slide 6

Slide 6 text

$BSE-JOLFE0FS չ"䏄ךؙ٦هٝؒٝزٔ٦׃ ׋ؕ٦سד顠ְ暟ׅ׸לه ؎ٝز؜حزկպ չְֲֲֶֿ㹏ׁ׿ח ְֲֲֿؙ٦هٝ⳿׃׋ְպ չְֲֲֿ飑顠⫘ぢךֶ㹏ׁ׿ך倯 ְְָךדכպ չֿ׿ז穠卓׌׏׋ךדծ如㔐כֿ ְֲֲإًؚٝزⴖ׶ת׃׳ֲպ ؕ٦س ⠓爡 ,BONV ؕ٦س ⠓㆞ ֶ䏄

Slide 7

Slide 7 text

Quick Survey
Ø Who here is involved in log analysis?
Ø Who here uses Hadoop?
Ø Who here uses Hive?
Ø Who here uses EMR?

Slide 8

Slide 8 text

Sharing a case study that focuses on "why this architecture?"

Slide 9

Slide 9 text

Our company's premises

Slide 10

Slide 10 text

The title of this talk

Slide 11

Slide 11 text

Poor man's log aggregation

Slide 12

Slide 12 text

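Poorman's
Ø Start from the situation we have right now
Ø An attitude of achieving the goal with few resources (people, time, experience)
Ø An attitude of raising the speed of reaching the goal as far as possible
Ø An attitude of avoiding needless spending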

Slide 13

Slide 13 text

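Kanmu Engineer Team
maki (CEO/Engineer): running the company, research and development
_ideyuta (Designer): design, frontend
moqada (Engineer): smartphone apps, log analysis, frontend, backend, infrastructure
_achiku (Engineer): smartphone apps, backend, infrastructure, analysis platform, log analysis, partner contact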

Slide 14

Slide 14 text

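Requirements
Ø Aggregate a fairly large volume of data without stress
  • Ave … G/day, Max … G/day (uncompressed)
  • … GB/month (uncompressed)
  • Expected to grow along with the service
  • Also want to run ad hoc queries over a month of data
Ø Build up in-house know-how on processing large-scale data
  • Build it up "while pacing ourselves"
  • Some data is sensitive and hard to send outside the company
Ø Keep operating costs and initial investment low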

Slide 15

Slide 15 text

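Not Requirements
Ø Analysis does not currently need to be real time
Ø The service level of the ad hoc analysis platform is by no means high
  • The regular batch jobs must not fail, but
  • the platform does not have to be available at all times
  • Usage is limited to inside the company
Ø That said, the items above could well become Requirements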

Slide 16

Slide 16 text

Amazon Elastic MapReduce

Slide 17

Slide 17 text

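AWS EMR
Ø One of the many AWS services
Ø Hadoop and software from the Hadoop ecosystem are available by default
Ø Startup, job execution, and shutdown can all be driven through the API
Ø Monitoring and the like are taken care of for you
Ø S3 can be used in place of HDFS
Ø Changing the number of nodes in a cluster is easy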

Slide 18

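Slide 18 text

Architecture
[Diagram: management server, coupon delivery server, coupon delivery server]
• Fluentd on the delivery servers collects the logs
• The fluentd s3 plugin stores the collected logs on S3
• Hive on EMR processes and aggregates the logs
• The aggregated values are stored in RDS and visualized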

Slide 19

Slide 19 text

Data Analysis Flow (by tagomoris)
Process, Collect, Parse, Cleanup, Store, Process, Visualize
Source: http://www.slideshare.net/tagomoris/handling-not-so-big-data

Slide 20

Slide 20 text

Poorman's Data Analysis Flow
Process, Collect, Parse, Cleanup, Store, Process, Visualize

Slide 21

Slide 21 text

Collect
Process, Collect, Parse, Cleanup, Store, Process, Visualize

Slide 22

Slide 22 text

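Collect
Ø Logs are shipped from the coupon delivery servers using Fluentd (fluentd s3 plugin)
Ø Each shipped log line carries the user's action in its QueryString
  • Complex JSON is not shipped; the information rides in the QueryString
  • Everything is converted to JSON at aggregation time in Hive
  • https://example.com/beacon?sub=…&obj=coupon&action=click&cid=…
Ø No Fluentd aggregation server is used
  • Real-time aggregation is not currently a strong need
  • A redundant setup would also have to be designed, which adds complexity
  • We would rather lean on the stability of S3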
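The QueryString-to-JSON conversion described above happens inside a Hive UDF in the talk. As a rough illustration of the transformation only (not the UDF itself), here is a minimal Python sketch; the function name and the ids in the sample URL are hypothetical.

```python
import json
from urllib.parse import urlsplit, parse_qsl


def beacon_url_to_json(url):
    """Turn the query string of a beacon URL into a JSON object string."""
    query = urlsplit(url).query
    # keep_blank_values=True keeps parameters that were sent without a value
    params = dict(parse_qsl(query, keep_blank_values=True))
    return json.dumps(params, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical beacon hit; the numeric ids are made up for this example
    print(beacon_url_to_json(
        "https://example.com/beacon?sub=100&obj=coupon&action=click&cid=200"))
```

Keeping the beacon payload flat like this means the servers never have to emit nested JSON, and the structured form can still be rebuilt at aggregation time.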

Slide 23

Slide 23 text

Store
Process, Collect, Parse, Cleanup, Store, Process, Visualize

Slide 24

Slide 24 text

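Store
Ø For now, just ship everything to S3
Ø Keep separate S3 buckets for production and staging
  • Access can be controlled per bucket
  • example.com.production.log
Ø Split keys by server role
  • No worries when another kind of server is added
  • example.com.production.log/api
Ø Split keys by day
  • So that Hive partitions can be used
  • example.com.production.log/api/dt=…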
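The bucket / server role / day layout above maps directly onto Hive partitions. The sketch below shows one way such prefixes and the matching partition DDL could be generated; the function names, the table name, and the bucket name are assumptions for illustration, and the `dt=YYYYMMDD` form follows the keys used in the later s3distcp example.

```python
from datetime import date


def log_prefix(bucket, role, day):
    """Return the S3 prefix holding one server role's logs for one day."""
    return "s3://{}/{}/dt={}/".format(bucket, role, day.strftime("%Y%m%d"))


def add_partition_ddl(table, bucket, role, day):
    """Build a HiveQL statement that registers one day as a partition."""
    return (
        "ALTER TABLE {} ADD IF NOT EXISTS PARTITION (dt='{}') "
        "LOCATION '{}'"
    ).format(table, day.strftime("%Y%m%d"), log_prefix(bucket, role, day))


if __name__ == "__main__":
    # "access_log" and "yourbucket" are placeholder names
    print(add_partition_ddl("access_log", "yourbucket", "api", date(2014, 8, 18)))
```

Because the day is part of the key, a query restricted to one `dt` value only needs to read that day's objects.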

Slide 25

Slide 25 text

Process
Process, Collect, Parse, Cleanup, Store, Process, Visualize

Slide 26

Slide 26 text

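Process
Ø The trusted and proven nightly batch
  • The management server launches an EMR cluster with Hadoop and Hive
  • Because no Fluentd aggregation server is used, the log files arrive fragmented, so they are compressed and merged (Hadoop is poor at processing many small files)
  • The QueryString recorded in each log line is converted to JSON with a UDF
  • Data is aggregated along the axes we need to watch, then saved
  • Every Process step above runs against S3 without dropping data onto HDFS
  • The final aggregated values are stored in RDS
Ø Flexible, fast daytime queries
  • The management server launches an EMR cluster with Hadoop, Hive, and Presto
  • Presto reads the Hive metastore (table definitions)
  • All data lives on S3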

Slide 27

Slide 27 text

Visualize
Process, Collect, Parse, Cleanup, Store, Process, Visualize

Slide 28

Slide 28 text

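Visualize
Ø Data aggregated on EMR is loaded into MySQL
Ø A service running on the management server visualizes the values
  • Every team member checks the numbers against the same values
Ø Elasticsearch + Kibana were introduced on a trial basis on a spare in-house server
  • Handy when you want to play with the data while deciding on analysis axes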

Slide 29

Slide 29 text

Poorman's Data Analysis Flow
Process, Collect, Parse, Cleanup, Store, Process, Visualize

Slide 30

Slide 30 text

YAGNI

Slide 31

Slide 31 text

But it can be added once it becomes necessary

Slide 32

Slide 32 text

Set things up so that we do not paint ourselves into a corner

Slide 33

Slide 33 text

Poorman's Data Analysis Flow
Process, Collect, Parse, Cleanup, Store, Process, Visualize

Slide 34

Slide 34 text

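References
Ø AWS Amazon EMR Best Practices
  • Read this and you will see which EMR setup fits your own context. It may also work as an introduction to Hadoop.
Ø Introduction to mixi's analysis platform and its use of a JSON parser in Apache Hive
  • This gave me the idea of storing data as JSON and treating it like a table through a View. It also gave me the concept of communication cost among the people involved in log aggregation.
Ø Batch Processing and Stream Processing by SQL
  • Hearing this talk made me decide to use an MPP-style engine for the analysis platform. I compared Impala and Presto and adopted Presto, which can also query S3 directly. (Impala is reportedly going to query S3 directly in its next version, so I plan to evaluate it again then.)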

Slide 35

Slide 35 text

Sharing ways of using Python + EMR that aim beyond the introductory level
Log aggregation

Slide 36

Slide 36 text

use EMR from awscli or a Python Script to do the following
awscli: Create Cluster, Execute HiveQL, Execute s3distcp, Config Your EMR, Bootstrap Presto, Metastore Config
Python Script: Create Cluster, Execute HiveQL, JobFlow Mgmt

Slide 37

Slide 37 text

use EMR from awscli or a Python Script to do the following
awscli: Create Cluster, Execute HiveQL, Execute s3distcp, Config Your EMR, Bootstrap Presto, Metastore Config
Python Script: Create Cluster, Execute HiveQL, JobFlow Mgmt

Slide 38

Slide 38 text

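awscli
Ø As of the Ver. … release, the EMR commands are out of Preview status and can finally be used as a stable API
Ø Switching over from the Ruby ElasticMapReduce script, which had been the de facto standard until now
  • Easy to install with pip
  • We were already using awscli, so this unifies our tooling
  • Development on GitHub is active, and you can send PRs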

Slide 39

Slide 39 text

We Love Python

Slide 40

Slide 40 text

$ mkvirtualenv pycon-emr-dev
(pycon-emr-dev)$ pip install awscli
(pycon-emr-dev)$ mkdir ~/.awscli
(pycon-emr-dev)$ cat <<-EOF >> ~/.awscli/config
[profile development]
aws_access_key_id=
aws_secret_access_key=
region=ap-northeast-1
EOF
(pycon-emr-dev)$ cat <<-EOF >> $VIRTUAL_ENV/bin/activate
export AWS_CONFIG_FILE=~/.awscli/config
export AWS_DEFAULT_PROFILE=development
source aws_zsh_completer.sh
EOF

Slide 41

Slide 41 text

use EMR from awscli or a Python Script to do the following
awscli: Create Cluster, Execute HiveQL, Execute s3distcp, Config Your EMR, Bootstrap Presto, Metastore Config
Python Script: Create Cluster, Execute HiveQL, JobFlow Mgmt

Slide 42

Slide 42 text

$ aws emr create-cluster --ami-version 3.1.1 \
    --name 'PyConJP 2014 (AMI 3.1.1 Hive)' \
    --tags Name=pycon-jp-emr environment=development \
    --ec2-attributes KeyName=yourkey \
    --log-uri 's3://yourbucket/jobflow_logs/' \
    --no-auto-terminate \
    --visible-to-all-users \
    --instance-groups file://./normal-instance-setup.json \
    --applications file://./app-hive.json

Slide 43

Slide 43 text

normal-instance-group.json

[
  {
    "Name": "emr-master",
    "InstanceGroupType": "MASTER",
    "InstanceCount": 1,
    "InstanceType": "m1.medium"
  },
  {
    "Name": "emr-core",
    "InstanceGroupType": "CORE",
    "InstanceCount": 2,
    "InstanceType": "m1.medium"
  }
]

app-hive.json

[
  {
    "Name": "HIVE"
  }
]

Slide 44

Slide 44 text

result

{
  "ClusterId": "j-8xxxxxxxxx"
}

Slide 45

Slide 45 text

use EMR from awscli or a Python Script to do the following
awscli: Create Cluster, Execute HiveQL, Execute s3distcp, Config Your EMR, Bootstrap Presto, Metastore Config
Python Script: Create Cluster, Execute HiveQL, JobFlow Mgmt

Slide 46

Slide 46 text

$ aws emr add-steps --cluster-id j-8xxxxxxxxx \
    --steps file://./hive-sample-step-1.json

Slide 47

Slide 47 text

hive-sample-step-1.json

[
  {
    "Args": [
      "-f", "s3n://yourbucket/hive-script/sample01.hql",
      "-d", "BUCKET_NAME=yourbucket",
      "-d", "TARGET_DATE=20140818"
    ],
    "ActionOnFailure": "CONTINUE",
    "Name": "Hive Sample Program 01",
    "Type": "HIVE"
  },
  {
    "Args": [
      "-f", "s3n://yourbucket/hive-script/sample02.hql",
      "-d", "BUCKET_NAME=yourbucket",
      "-d", "TARGET_DATE=20140818"
    ],
    "ActionOnFailure": "CONTINUE",
    "Name": "Hive Sample Program 02",
    "Type": "HIVE"
  }
]
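Since the step definitions differ only in script name and target date, a nightly batch could generate the file instead of editing it by hand. The following is a minimal sketch of that idea, assuming the same field layout as the JSON above; the function name and the output file name are hypothetical.

```python
import json


def hive_steps(bucket, target_date, scripts):
    """Build add-steps definitions for a list of Hive script names."""
    steps = []
    for number, script in enumerate(scripts, start=1):
        steps.append({
            "Args": [
                "-f", "s3n://{}/hive-script/{}".format(bucket, script),
                "-d", "BUCKET_NAME={}".format(bucket),
                "-d", "TARGET_DATE={}".format(target_date),
            ],
            "ActionOnFailure": "CONTINUE",
            "Name": "Hive Sample Program {:02d}".format(number),
            "Type": "HIVE",
        })
    return steps


if __name__ == "__main__":
    steps = hive_steps("yourbucket", "20140818", ["sample01.hql", "sample02.hql"])
    # Hypothetical output file, to be passed to --steps file://./...
    with open("hive-steps-generated.json", "w") as handle:
        json.dump(steps, handle, indent=2)
```

The generated file can then be handed to `aws emr add-steps --steps file://./hive-steps-generated.json` in the same way as the hand-written one.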

Slide 48

Slide 48 text

use EMR from awscli or a Python Script to do the following
awscli: Create Cluster, Execute HiveQL, Execute s3distcp, Config Your EMR, Bootstrap Presto, Metastore Config
Python Script: Create Cluster, Execute HiveQL, JobFlow Mgmt

Slide 49

Slide 49 text

$ aws emr add-steps --cluster-id j-8xxxxxxxxx \
    --steps file://./s3distcp-sample-step.json

Slide 50

Slide 50 text

s3distcp-sample-step.json

[
  {
    "Name": "s3distcp Sample",
    "ActionOnFailure": "CONTINUE",
    "Jar": "/home/hadoop/lib/emr-s3distcp-1.0.jar",
    "Type": "CUSTOM_JAR",
    "Args": [
      "--src", "s3n://yourbucket/access_log/dt=20140818",
      "--dest", "s3n://yourbucket/compressed_log/dt=20140818",
      "--groupBy", ".*(nginx_access_log-).*",
      "--targetSize", "100",
      "--outputCodec", "gzip"
    ]
  }
]
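The `--groupBy` pattern decides which small files get merged together: files whose names match the regular expression are grouped by the text captured in the parentheses. The sketch below imitates only that grouping rule in plain Python so the effect of a pattern can be checked locally. It is not S3DistCp itself; the function name and sample keys are hypothetical, and skipping non-matching keys is a simplification made for this sketch.

```python
import re
from collections import defaultdict


def group_by_pattern(keys, pattern):
    """Group keys by the concatenated capture groups of a full-string match."""
    regex = re.compile(pattern)
    groups = defaultdict(list)
    for key in keys:
        match = regex.fullmatch(key)
        if match is None:
            continue  # in this sketch, keys that do not match are skipped
        # Unmatched optional groups come back as None, so map them to ""
        group_name = "".join(part or "" for part in match.groups())
        groups[group_name].append(key)
    return dict(groups)


if __name__ == "__main__":
    sample_keys = [
        "access_log/dt=20140818/nginx_access_log-0001.gz",
        "access_log/dt=20140818/nginx_access_log-0002.gz",
        "access_log/dt=20140818/README",
    ]
    for name, members in group_by_pattern(
            sample_keys, r".*(nginx_access_log-).*").items():
        print(name, len(members))
```

With the pattern from the step definition, every nginx access log fragment for the day falls into a single group, which is what lets the fragmented Fluentd output be merged into larger compressed files.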

Slide 51

Slide 51 text

use EMR from awscli or a Python Script to do the following
awscli: Create Cluster, Execute HiveQL, Execute s3distcp, Config Your EMR, Bootstrap Presto, Metastore Config
Python Script: Create Cluster, Execute HiveQL, JobFlow Mgmt

Slide 52

Slide 52 text

$ aws emr create-cluster --ami-version 3.1.1 \
    --name 'PyConJP 2014 (AMI 3.1.1 Hive)' \
    --tags Name=pycon-jp-emr environment=development \
    --ec2-attributes KeyName=yourkey \
    --log-uri 's3://yourbucket/jobflow_logs/' \
    --no-auto-terminate \
    --visible-to-all-users \
    --instance-groups file://./normal-instance-setup.json \
    --applications file://./app-hive-with-config.json

Slide 53

Slide 53 text

app-hive-with-config.json

[
  {
    "Args": [
      "--hive-site=s3://yourbucket/libs/config/hive-site.xml"
    ],
    "Name": "HIVE"
  }
]

Slide 54

Slide 54 text

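hive-site.xml

<configuration>
  <property>
    <name>hive.optimize.s3.query</name>
    <value>true</value>
    <description>Optimize query on S3</description>
  </property>
</configuration>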

Slide 55

Slide 55 text

use EMR from awscli or a Python Script to do the following
awscli: Create Cluster, Execute HiveQL, Execute s3distcp, Config Your EMR, Bootstrap Presto, Metastore Config
Python Script: Create Cluster, Execute HiveQL, JobFlow Mgmt

Slide 56

Slide 56 text

$ aws emr create-cluster --ami-version 3.1.1 \
    --name 'PyConJP 2014 (AMI 3.1.1 Hive + Presto)' \
    --tags Name=pycon-jp-emr environment=development \
    --ec2-attributes KeyName=yourkey \
    --log-uri 's3://yourbucket/jobflow_logs/' \
    --no-auto-terminate \
    --visible-to-all-users \
    --instance-groups file://./normal-instance-setup.json \
    --bootstrap-actions file://./bootstrap-presto.json \
    --applications file://./app-hive-with-config.json

Slide 57

Slide 57 text

[
  {
    "Name": "Install/Setup Presto",
    "Path": "s3://yourbucket/libs/setup-presto.rb",
    "Args": [
      "--task_memory", "1GB",
      "--log-level", "DEBUG",
      "--version", "0.75",
      "--presto-repo-url", "http://central.maven.org/maven2/com/facebook/presto/",
      "--sink-buffer-size", "1GB",
      "--query-max-age", "1h",
      "--jvm-config",
      "-server -Xmx2G -XX:+UseConcMarkSweepGC -XX:+ExplicitGCInvokesConcurrent -XX:+CMSClassUnloadingEnabled -XX:+AggressiveOpts -XX:+HeapDumpOnOutOfMemoryError -XX:OnOutOfMemoryError=kill -9 %p -XX:PermSize=150M -XX:MaxPermSize=150M -XX:ReservedCodeCacheSize=150M -Dhive.config.resources=/home/hadoop/conf/core-site.xml,/home/hadoop/conf/hdfs-site.xml"
    ]
  }
]

Slide 58

Slide 58 text

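Ø setup-presto.rb is actually https://github.com/awslabs/emr-bootstrap-actions/blob/master/presto/install
Ø It is a Bootstrap script, published experimentally by AWS, for installing Presto on EMR
Ø It worked on AMI … or …, but not on AMI … (Hive … / Hive …)
Ø The Thrift Service port seems to differ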

Slide 59

Slide 59 text

use EMR from awscli or a Python Script to do the following
awscli: Create Cluster, Execute HiveQL, Execute s3distcp, Config Your EMR, Bootstrap Presto, Metastore Config
Python Script: Create Cluster, Execute HiveQL, JobFlow Mgmt

Slide 60

Slide 60 text

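Ø The Metastore is where Hive keeps information such as table definitions
Ø Today MySQL is used for it in most cases
Ø If nothing is configured, it is stored in MySQL on the EMR instance
Ø Pointing the Metastore at a DB outside EMR means the DDL no longer has to be re-run every time a cluster is launched
Ø The Security Group on the DB side needs to be adjusted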

Slide 61

Slide 61 text

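hive-site.xml (loaded via app-hive-with-config.json)

<configuration>
  <property>
    <name>hive.optimize.s3.query</name>
    <value>true</value>
    <description>Optimize query on S3</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hostname:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>username</value>
    <description>Username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>password</value>
    <description>Password to use against metastore database</description>
  </property>
</configuration>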
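When production and staging use different metastore hosts, it can be convenient to render hive-site.xml from a small dictionary per environment rather than keeping several hand-edited copies. This is a minimal sketch of that approach using only the standard library; the function name is hypothetical, and the property names are the ones shown in the file above.

```python
import xml.etree.ElementTree as ET


def build_hive_site(properties):
    """Render a dict of Hive settings as hive-site.xml text."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_hive_site({
        "hive.optimize.s3.query": "true",
        "javax.jdo.option.ConnectionURL":
            "jdbc:mysql://hostname:3306/hive?createDatabaseIfNotExist=true",
        "javax.jdo.option.ConnectionDriverName": "com.mysql.jdbc.Driver",
    }))
```

The rendered text would still need to be uploaded to the S3 path that `--hive-site` points to before the cluster is created.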

Slide 62

Slide 62 text

use EMR from awscli or a Python Script to do the following
awscli: Create Cluster, Execute HiveQL, Execute s3distcp, Config Your EMR, Bootstrap Presto, Metastore Config
Python Script: Create Cluster, Execute HiveQL, JobFlow Mgmt

Slide 63

Slide 63 text

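Ø Sometimes you want to launch EMR from inside a Python batch job
Ø Or launch it as a Celery Task
Ø In those cases EMR can also be used from within Python
Ø Use boto.emr
Ø Pulling handy utilities out of awscli and using them might also be an option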

Slide 64

Slide 64 text

use EMR from awscli or a Python Script to do the following
awscli: Create Cluster, Execute HiveQL, Execute s3distcp, Config Your EMR, Bootstrap Presto, Metastore Config
Python Script: Create Cluster, Execute HiveQL, JobFlow Mgmt

Slide 65

Slide 65 text

# -*- coding: utf-8 -*-
from datetime import datetime
from boto.emr import connect_to_region
from boto.emr.step import InstallHiveStep


def setup_emr():
    # need to export AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
    # as environment variables.
    conn = connect_to_region('ap-northeast-1')
    install_step = InstallHiveStep(hive_versions='0.11.0.2')

    jobid = conn.run_jobflow(
        name='Create EMR [{}]'.format(datetime.today().strftime('%Y%m%d')),
        log_uri='s3://yourbucket/jobflow_logs/',
        ec2_keyname='your_key',
        master_instance_type='m1.medium',
        slave_instance_type='m1.medium', num_instances=3,
        action_on_failure='TERMINATE_JOB_FLOW', keep_alive=True,
        enable_debugging=False,
        hadoop_version='2.4.0',
        steps=[install_step],
        bootstrap_actions=[],
        instance_groups=None,
        additional_info=None,
        ami_version='3.1.1',
        api_params=None,
        visible_to_all_users=True,
        job_flow_role=None)

    return jobid


if __name__ == '__main__':
    jobflow_id = setup_emr()
    # Parenthesized form prints the same text on Python 2 and Python 3
    print("JobFlowID: {} started.".format(jobflow_id))

Slide 66

Slide 66 text

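Ø Do not put AWS credentials in source code
  • Better not to put them in environment variables either
  • Perhaps unavoidable when you want to test on a local machine
  • Control access with the IAM Role assigned to the EC2 instance that launches EMR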

Slide 67

Slide 67 text

use EMR from awscli or a Python Script to do the following
awscli: Create Cluster, Execute HiveQL, Execute s3distcp, Config Your EMR, Bootstrap Presto, Metastore Config
Python Script: Create Cluster, Execute HiveQL, JobFlow Mgmt

Slide 68

Slide 68 text

# Excerpt: conn, install_step, target_date, bucket_name and hive_version
# are defined earlier in the script.
from boto.emr.step import HiveStep

jobid = conn.run_jobflow(
    name='Create EMR and Exec hiveql [{}]'.format(target_date),
    log_uri='s3://{}/jobflow_logs/'.format(bucket_name),
    ec2_keyname='your_key',
    master_instance_type='m1.medium',
    slave_instance_type='m1.medium', num_instances=3,
    action_on_failure='TERMINATE_JOB_FLOW', keep_alive=True,
    enable_debugging=False,
    hadoop_version='2.4.0',
    steps=[install_step],
    bootstrap_actions=[],
    instance_groups=None,
    additional_info=None,
    ami_version='3.1.1',
    api_params=None,
    visible_to_all_users=True,
    job_flow_role=None)

query_files = ['sample01.hql', 'sample02.hql']
hql_steps = []
for query_file in query_files:
    hql_step = HiveStep(
        name='Executing Query [{}]'.format(query_file),
        hive_file='s3n://{0}/hive-script/{1}'.format(
            bucket_name, query_file),
        hive_versions=hive_version,
        hive_args=['-dTARGET_DATE={0}'.format(target_date),
                   '-dBUCKET_NAME={0}'.format(bucket_name)])
    hql_steps.append(hql_step)

conn.add_jobflow_steps(jobid, hql_steps)

It got long, so this is only meant to give the flavor.

Slide 69

Slide 69 text

use EMR from awscli or a Python Script to do the following
awscli: Create Cluster, Execute HiveQL, Execute s3distcp, Config Your EMR, Bootstrap Presto, Metastore Config
Python Script: Create Cluster, Execute HiveQL, JobFlow Mgmt

Slide 70

Slide 70 text

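Ø We want dependencies between batch jobs
  • For example, run B and C at the same time once A has finished
  • For example, run C once both A and B have finished
Ø We want an easier way to manage start times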
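The two dependency shapes above can be expressed as a small graph and resolved into "waves" of jobs that may run together. The sketch below uses the standard-library `graphlib` module (Python 3.9 or later) purely to illustrate the idea; it is not how luigi or any tool in this talk implements scheduling, and the function name is hypothetical.

```python
from graphlib import TopologicalSorter


def run_in_waves(dependencies):
    """Return batches of job names; jobs in the same batch can run concurrently.

    dependencies maps each job to the set of jobs it must wait for.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if the graph contains a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave finished to unlock the next
    return waves


if __name__ == "__main__":
    # B and C both wait for A
    print(run_in_waves({"B": {"A"}, "C": {"A"}}))
    # C waits for both A and B
    print(run_in_waves({"C": {"A", "B"}}))
```

In a real batch each wave would be dispatched to workers and `done()` called per job as it completes, rather than marking a whole wave finished at once.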

Slide 71

Slide 71 text

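• https://github.com/spotify/luigi
• A pipeline management framework written in Python
• Includes a mechanism for easily writing MapReduce jobs on Hadoop Streaming
• Dependencies are resolved with Python code alone
• Dependency visualization (runs as a separate service)
• The dependency visualization tool has no fine-grained features such as authentication
• Supports running HiveQL
• Supports running Pig
• Supports S3 operations
• Overkill for where we are right now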

Slide 72

Slide 72 text

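• The admin screen uses Django
• celery and celerybeat run on the same server
• django-celery is used to configure specific tasks to be put on the queue at specific times
• celerybeat picks up the queued tasks and runs them
• celery and Django can be integrated without django-celery, but this scheduling feature is handy, so we still use it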

Slide 73

Slide 73 text

References
Ø https://github.com/aws/aws-cli
  • Official documentation and source
Ø https://github.com/boto/boto
  • Official documentation and source

Slide 74

Slide 74 text

Kanmu is eagerly looking for new teammates

Slide 75

Slide 75 text

Feel free to come just for a chat first

Slide 76

Slide 76 text

https://www.wantedly.com/projects/…

Slide 77

Slide 77 text

A little behind-the-scenes on preparing for this PyCon

Slide 78

Slide 78 text

Ø As of last Friday: a Markdown file of roughly … lines
Ø Tried to take on Slideless
Ø Held a review session inside the company

Slide 79

Slide 79 text

Slide 80

Slide 80 text

Slide 81

Slide 81 text

人人人人人人人人人人人人人人人人
Thank you very much!
Y^Y^Y^Y^Y^Y^Y^Y^Y^Y^Y^Y^Y^Y^Y

Slide 82

Slide 82 text

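Ø This was my first technical talk
Ø A good opportunity to sum up what I have been doing at work
Ø I want to know what other people think about while they work
Ø I want to know why other companies' architectures are the way they are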

Slide 83

Slide 83 text

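Goal
Ø Share a case study that focuses on "why this architecture?"
Ø Share ways of using Python + EMR that aim beyond the introductory level
I want to hear everyone's stories (that is what I am after personally)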

Slide 84

Slide 84 text

Thank you for giving me the opportunity to present this time

Slide 85

Slide 85 text

人人人人人人人人人人人人人人人人
Thank you very much!
Y^Y^Y^Y^Y^Y^Y^Y^Y^Y^Y^Y^Y^Y^Y