PyCon JP 2014: Poor Man's Log Aggregation with Python + Hive on AWS EMR

Akira Chiku
September 14, 2014


Transcript

  1. Python + AWS EMR: Poor Man's Log Aggregation. PyCon JP 2014, Akira Chiku

  2. achiku
     - Name: Akira Chiku
     - Twitter: @_achiku
     - GitHub: @achiku
     - Search: "Akira Chiku Fire"
     - Occupation: engineer at Kanmu (@kanmu)
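  3. Goal
     - Share a case study that focuses on "why this architecture?"
     - Share ways of using Python + EMR that go beyond the introductory level
     - Hear what everyone else is doing (that part is for me)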

  4. Kanmu Business
     - Payment data analysis in partnership with card companies
     - Delivery of coupons linked to cards
     - Card Linked Offer (CLO)
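  5. Card Linked Offer

  6. Card Linked Offer (diagram). Parties: card company, Kanmu, cardholders, shops.
     - "Enroll in shop A's coupon, pay with the enrolled card, and earn points."
     - "We want to offer this kind of coupon to this kind of customer."
     - "Wouldn't customers with this kind of purchasing pattern be a better fit?"
     - "Given these results, let's cut the segment this way next time."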
  7. Quick Survey
     - Who here works on log analysis?
     - Who uses Hadoop?
     - Who uses Hive?
     - Who uses EMR?

  8. Sharing a case study that focuses on "why this architecture?"

  9. Our company's assumptions

  10. The title of this talk

  11. Poor man's log aggregation

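  12. Poorman's
     - Start from the situation you are in today
     - Aim to reach the goal with few resources (people, time, experience)
     - Aim to reach the goal as quickly as possible
     - Avoid needless spending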
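  13. Kanmu Engineer Team
     - maki (CEO / Engineer)
     - _ideyuta (Designer)
     - moqada (Engineer)
     - _achiku (Engineer)
     Areas covered by the team: company management, R&D, design, front end, back end, infrastructure, smartphone apps, log analysis, analytics platform, partner liaison.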
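  14. Requirements
     - Aggregate a fairly large volume of data without stress
       - Avg. … GB/day, max. … GB/day (uncompressed)
       - … GB/month (uncompressed)
       - Expected to grow together with the service
       - We also want to run ad hoc queries over a month of data
     - Build up in-house know-how on processing large-scale data
       - Build it up at a sustainable pace
       - Some of the data is sensitive and hard to hand to an outside party
     - Keep operating cost and up-front investment low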
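  15. Not Requirements
     - Real-time analysis is not a pressing need at the moment
     - The service level of the ad hoc analysis platform does not have to be high
       - The regular batch jobs must not fail, but
       - the ad hoc side does not have to be available at all times
       - It is used only inside the company
     - That said, any of the above could well become a Requirement later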
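  16. Amazon Elastic MapReduce

  17. AWS EMR
     - One of the many AWS services
     - Hadoop and software from the Hadoop ecosystem are available by default
     - Cluster start-up, job execution, and shutdown can all be driven through the API
     - Monitoring and similar chores are taken care of for you
     - S3 can be used in place of HDFS
     - Changing the number of nodes in a cluster is easy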
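  18. Architecture (diagram: one management server, two coupon delivery servers)
     - Fluentd on the delivery servers collects the logs
     - The fluentd s3 plugin stores the collected logs on S3
     - Hive on EMR transforms and aggregates the logs
     - The aggregated values are stored in RDS and visualized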
  19. Data Analysis Flow (by tagomoris): Collect, Parse, Cleanup, Store, Process, Visualize
     Source: http://www.slideshare.net/tagomoris/handling-not-so-big-data

  20. Poorman's Data Analysis Flow: Collect, Parse, Cleanup, Store, Process, Visualize

  21. Collect (flow: Collect, Parse, Cleanup, Store, Process, Visualize)

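  22. Collect
     - Logs are shipped from the coupon delivery servers with Fluentd and the fluentd s3 plugin
     - Each log line carries the user's action in the QueryString
       - We do not ship complex JSON; the information rides in the QueryString
       - Everything is converted to JSON at aggregation time in Hive
       - https://example.com/beacon?sub=…&obj=coupon&action=click&cid=…
     - No Fluentd aggregator servers
       - Real-time aggregation is not a pressing need at the moment
       - We would have to design for redundancy as well, which adds complexity
       - We would rather lean on the stability of S3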
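     To make the "flat QueryString now, JSON later" idea concrete, here is a minimal Python 3 sketch. It is not the speaker's code: the talk does the conversion in a Hive UDF, while this uses only the standard library. The parameter names `sub`, `obj`, `action`, and `cid` follow the example URL on the slide; the values 42 and 1001 and both function names are made up for illustration.

```python
import json
from urllib.parse import parse_qsl, urlencode, urlsplit

BEACON_ENDPOINT = "https://example.com/beacon"  # endpoint shown on the slide


def beacon_url(sub, obj, action, cid):
    """Encode one user action as flat query-string parameters."""
    params = [("sub", sub), ("obj", obj), ("action", action), ("cid", cid)]
    return BEACON_ENDPOINT + "?" + urlencode(params)


def query_string_to_json(url):
    """Turn the query string of a logged URL into a JSON object string."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    return json.dumps(dict(pairs), sort_keys=True)


# Hypothetical values; only the parameter names come from the slide.
url = beacon_url(sub=42, obj="coupon", action="click", cid=1001)
print(url)
print(query_string_to_json(url))
```

     Keeping the producer side this simple means the delivery servers never have to build nested structures; all interpretation is deferred to aggregation time.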
  23. Store (flow: Collect, Parse, Cleanup, Store, Process, Visualize)

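  24. Store
     - For now, just ship everything to S3
     - Use separate S3 buckets for production and staging
       - Access can then be controlled per bucket
       - example.com.production.log
     - Split keys by server role
       - No worries when another kind of server is added
       - example.com.production.log/api
     - Split keys by day
       - So that Hive partitions can be used
       - example.com.production.log/api/dt=…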
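     The key layout above (bucket per environment, then server role, then a `dt=` partition per day) can be sketched as a small Python 3 helper. The function name is hypothetical; the `dt=YYYYMMDD` form matches the `dt=20140818` paths used in the step definitions later in the deck, and `yourbucket` is the placeholder bucket name from those slides.

```python
from datetime import date


def log_prefix(bucket, role, day):
    """Return the S3 prefix holding one server role's logs for one day."""
    return "s3://{}/{}/dt={}/".format(bucket, role, day.strftime("%Y%m%d"))


prefix = log_prefix("yourbucket", "api", date(2014, 8, 18))
print(prefix)
```

     Because the day is encoded as `dt=<value>` in the path, each prefix can be registered as one Hive partition.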
  25. Process (flow: Collect, Parse, Cleanup, Store, Process, Visualize)

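  26. Process
     - The tried-and-true nightly batch
       - The management server starts an EMR cluster with Hadoop + Hive
       - Because there is no Fluentd aggregator, the log files arrive in small fragments, so they are compressed and merged first (Hadoop is bad at processing many small files)
       - A UDF converts the QueryString recorded in each log line to JSON
       - The data is aggregated along the axes we need to look at, and saved
       - All of the processing above runs against S3, without copying the data into HDFS
       - The final aggregated values are stored in RDS
     - Flexible, fast daytime queries
       - The management server starts an EMR cluster with Hadoop + Hive + Presto
       - Presto reads Hive's metastore (the table definitions)
       - All of the data stays on S3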
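     The "compress and merge" step is done with s3distcp later in the deck, using a `--groupBy` regular expression. As I understand that option, files whose names match the pattern are concatenated, and the text captured by the parentheses decides which output file they end up in. The Python 3 sketch below only imitates that grouping rule on a list of names; it is an approximation under that assumption, not s3distcp's implementation, and the file names are invented.

```python
import re
from collections import defaultdict


def group_files(names, group_by):
    """Group file names by the text captured in the pattern's parentheses."""
    pattern = re.compile(group_by)
    groups = defaultdict(list)
    for name in names:
        match = pattern.fullmatch(name)
        if match:  # this sketch simply ignores names that do not match
            groups["".join(match.groups())].append(name)
    return dict(groups)


# Invented file names; the pattern is the one from the s3distcp step slide.
names = [
    "nginx_access_log-2014081801.gz",
    "nginx_access_log-2014081802.gz",
    "unrelated.txt",
]
groups = group_files(names, r".*(nginx_access_log-).*")
print(groups)
```

     With this pattern every nginx access log in the day's prefix lands in the same group, which is what turns many small fragments into a few larger files.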
  27. Visualize (flow: Collect, Parse, Cleanup, Store, Process, Visualize)

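  28. Visualize
     - The data aggregated on EMR is loaded into MySQL
     - A service running on the management server visualizes the values
       - Every team member checks the numbers against the same values
     - Elasticsearch + Kibana were installed on a spare in-house server as an experiment
       - Handy when you want to play with the data while deciding which axes to analyze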
  29. Poorman's Data Analysis Flow: Collect, Parse, Cleanup, Store, Process, Visualize

  30. YAGNI

  31. And if something does become necessary, it can be added later.

  32. Just make sure you do not paint yourself into a corner.

  33. Poorman's Data Analysis Flow: Collect, Parse, Cleanup, Store, Process, Visualize

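  34. References
     - AWS Amazon EMR Best Practices
       - Read this and you will see which EMR setup fits your own context. It probably also works as an introduction to Hadoop.
     - An introduction to mixi's analytics platform and its use of a JSON parser with Apache Hive
       - Gave me the idea of storing JSON and treating it like tables through views. Also gave me the notion of the communication cost among the people involved in log aggregation.
     - Batch Processing and Stream Processing by SQL
       - Hearing this talk made me decide to use an MPP engine for the analytics platform. I compared Impala and Presto and adopted Presto, which can query S3 directly. Impala is also said to support direct queries against S3 in its next version, so I plan to evaluate it again then.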
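  35. Sharing ways of using Python + EMR (for log aggregation) that go beyond the introductory level

  36. Use EMR from awscli or from a Python script to do the following:
     - awscli: Create Cluster, Execute HiveQL, Execute s3distcp, Config Your EMR, Bootstrap Presto, Metastore Config
     - Python Script: Create Cluster, Execute HiveQL, JobFlow Mgmt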
  37. Agenda (repeat of slide 36)

  38. awscli
     - As of the Ver. … release, the EMR commands are no longer in Preview status and can be used as a stable API
     - We switched over from the Ruby ElasticMapReduce script that had been the de facto standard
       - Easy to install with pip
       - We were already using awscli, so this unifies our tooling
       - Development on GitHub is active, and you can send PRs
  39. We Love Python

  40. $ mkvirtualenv pycon-emr-dev
     (pycon-emr-dev)$ pip install awscli
     (pycon-emr-dev)$ mkdir ~/.awscli
     (pycon-emr-dev)$ cat <<-EOF >> ~/.awscli/config
     [profile development]
     aws_access_key_id=<development_access_key>
     aws_secret_access_key=<development_secret_key>
     region=ap-northeast-1
     EOF
     (pycon-emr-dev)$ cat <<-EOF >> $VIRTUAL_ENV/bin/activate
     export AWS_CONFIG_FILE=~/.awscli/config
     export AWS_DEFAULT_PROFILE=development
     source aws_zsh_completer.sh
     EOF
  41. Agenda (repeat of slide 36)

  42. $ aws emr create-cluster --ami-version 3.1.1 \
         --name 'PyConJP 2014 (AMI 3.1.1 Hive)' \
         --tags Name=pycon-jp-emr environment=development \
         --ec2-attributes KeyName=yourkey \
         --log-uri 's3://yourbucket/jobflow_logs/' \
         --no-auto-terminate \
         --visible-to-all-users \
         --instance-groups file://./normal-instance-setup.json \
         --applications file://./app-hive.json
  43. normal-instance-group.json:
     [
       {
         "Name": "emr-master",
         "InstanceGroupType": "MASTER",
         "InstanceCount": 1,
         "InstanceType": "m1.medium"
       },
       {
         "Name": "emr-core",
         "InstanceGroupType": "CORE",
         "InstanceCount": 2,
         "InstanceType": "m1.medium"
       }
     ]

     app-hive.json:
     [
       {
         "Name": "HIVE"
       }
     ]
  44. result:
     {
       "ClusterId": "j-8xxxxxxxxx"
     }
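     The later `add-steps` commands need the `ClusterId` from this result. If the CLI is driven from a script, the JSON shown on the slide can be read with the standard library. This is a minimal Python 3 sketch; the function name is hypothetical, and it parses a string rather than calling `aws` itself.

```python
import json


def cluster_id_from_output(cli_output):
    """Extract the cluster ID from the JSON printed by create-cluster."""
    return json.loads(cli_output)["ClusterId"]


# The sample output is the one shown on the slide.
sample_output = '{"ClusterId": "j-8xxxxxxxxx"}'
cluster_id = cluster_id_from_output(sample_output)
print(cluster_id)
```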
  45. Agenda (repeat of slide 36)

  46. $ aws emr add-steps --cluster-id j-8xxxxxxxxx \
         --steps file://./hive-sample-step-1.json
  47. hive-sample-step-1.json:
     [
       {
         "Args": [
           "-f", "s3n://yourbucket/hive-script/sample01.hql",
           "-d", "BUCKET_NAME=yourbucket",
           "-d", "TARGET_DATE=20140818"
         ],
         "ActionOnFailure": "CONTINUE",
         "Name": "Hive Sample Program 01",
         "Type": "HIVE"
       },
       {
         "Args": [
           "-f", "s3n://yourbucket/hive-script/sample02.hql",
           "-d", "BUCKET_NAME=yourbucket",
           "-d", "TARGET_DATE=20140818"
         ],
         "ActionOnFailure": "CONTINUE",
         "Name": "Hive Sample Program 02",
         "Type": "HIVE"
       }
     ]
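     The two steps in this file differ only in the script name, and `TARGET_DATE` changes every night, so a batch job would probably generate the file rather than edit it by hand. A minimal Python 3 sketch of that idea follows. The function name and the generated step names are my own; the keys and argument layout mirror the JSON on the slide.

```python
import json


def hive_steps(bucket, target_date, scripts):
    """Build the step list for `aws emr add-steps --steps file://...`."""
    steps = []
    for script in scripts:
        steps.append({
            "Name": "Hive step: {}".format(script),  # naming is an assumption
            "Type": "HIVE",
            "ActionOnFailure": "CONTINUE",
            "Args": [
                "-f", "s3n://{}/hive-script/{}".format(bucket, script),
                "-d", "BUCKET_NAME={}".format(bucket),
                "-d", "TARGET_DATE={}".format(target_date),
            ],
        })
    return steps


steps = hive_steps("yourbucket", "20140818", ["sample01.hql", "sample02.hql"])
print(json.dumps(steps, indent=2))
```

     Writing the printed JSON to a file gives something that can be passed to `--steps` in the same way as the hand-written version.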
  48. Agenda (repeat of slide 36)

  49. $ aws emr add-steps --cluster-id j-8xxxxxxxxx \
         --steps file://./s3distcp-sample-step.json
  50. s3distcp-sample-step.json:
     [
       {
         "Name": "s3distcp Sample",
         "ActionOnFailure": "CONTINUE",
         "Jar": "/home/hadoop/lib/emr-s3distcp-1.0.jar",
         "Type": "CUSTOM_JAR",
         "Args": [
           "--src", "s3n://yourbucket/access_log/dt=20140818",
           "--dest", "s3n://yourbucket/compressed_log/dt=20140818",
           "--groupBy", ".*(nginx_access_log-).*",
           "--targetSize", "100",
           "--outputCodec", "gzip"
         ]
       }
     ]
  51. Agenda (repeat of slide 36)

  52. $ aws emr create-cluster --ami-version 3.1.1 \
         --name 'PyConJP 2014 (AMI 3.1.1 Hive)' \
         --tags Name=pycon-jp-emr environment=development \
         --ec2-attributes KeyName=yourkey \
         --log-uri 's3://yourbucket/jobflow_logs/' \
         --no-auto-terminate \
         --visible-to-all-users \
         --instance-groups file://./normal-instance-setup.json \
         --applications file://./app-hive-with-config.json
  53. app-hive-with-config.json:
     [
       {
         "Args": [
           "--hive-site=s3://yourbucket/libs/config/hive-site.xml"
         ],
         "Name": "HIVE"
       }
     ]
  54. hive-site.xml:
     <?xml version="1.0"?>
     <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
     <configuration>
       <property>
         <name>hive.optimize.s3.query</name>
         <value>true</value>
         <description>Optimize query on S3</description>
       </property>
     </configuration>
  55. Agenda (repeat of slide 36)

  56. $ aws emr create-cluster --ami-version 3.1.1 \
         --name 'PyConJP 2014 (AMI 3.1.1 Hive + Presto)' \
         --tags Name=pycon-jp-emr environment=development \
         --ec2-attributes KeyName=yourkey \
         --log-uri 's3://yourbucket/jobflow_logs/' \
         --no-auto-terminate \
         --visible-to-all-users \
         --instance-groups file://./normal-instance-setup.json \
         --bootstrap-actions file://./bootstrap-presto.json \
         --applications file://./app-hive-with-config.json
  57. [
       {
         "Name": "Install/Setup Presto",
         "Path": "s3://yourbucket/libs/setup-presto.rb",
         "Args": [
           "--task_memory", "1GB",
           "--log-level", "DEBUG",
           "--version", "0.75",
           "--presto-repo-url", "http://central.maven.org/maven2/com/facebook/presto/",
           "--sink-buffer-size", "1GB",
           "--query-max-age", "1h",
           "--jvm-config",
           "-server -Xmx2G -XX:+UseConcMarkSweepGC -XX:+ExplicitGCInvokesConcurrent -XX:+CMSClassUnloadingEnabled -XX:+AggressiveOpts -XX:+HeapDumpOnOutOfMemoryError -XX:OnOutOfMemoryError=kill -9 %p -XX:PermSize=150M -XX:MaxPermSize=150M -XX:ReservedCodeCacheSize=150M -Dhive.config.resources=/home/hadoop/conf/core-site.xml,/home/hadoop/conf/hdfs-site.xml"
         ]
       }
     ]
  58. - setup-presto.rb is really https://github.com/awslabs/emr-bootstrap-actions/blob/master/presto/install
     - It is a bootstrap script, published experimentally by AWS, for installing Presto on EMR
     - It worked on AMI … or …, but not on AMI … (Hive … vs. Hive …)
     - The Thrift Service port seems to differ
  59. Agenda (repeat of slide 36)

  60. - The Metastore is where Hive keeps information such as table definitions
     - Today MySQL is used for it in most cases
     - If you configure nothing, it is stored in MySQL on the EMR instance
     - If you point the Metastore at a database outside EMR, you no longer have to re-run the DDL every time you start a cluster
     - The Security Group on the database side has to be adjusted
  61. hive-site.xml with the metastore settings (the slide labels it app-hive-with-config.json):
     <configuration>
       <property>
         <name>hive.optimize.s3.query</name>
         <value>true</value>
         <description>Optimize query on S3</description>
       </property>
       <property>
         <name>javax.jdo.option.ConnectionURL</name>
         <value>jdbc:mysql://hostname:3306/hive?createDatabaseIfNotExist=true</value>
         <description>JDBC connect string for a JDBC metastore</description>
       </property>
       <property>
         <name>javax.jdo.option.ConnectionDriverName</name>
         <value>com.mysql.jdbc.Driver</value>
         <description>Driver class name for a JDBC metastore</description>
       </property>
       <property>
         <name>javax.jdo.option.ConnectionUserName</name>
         <value>username</value>
         <description>Username to use against metastore database</description>
       </property>
       <property>
         <name>javax.jdo.option.ConnectionPassword</name>
         <value>password</value>
         <description>Password to use against metastore database</description>
       </property>
     </configuration>
  62. Agenda (repeat of slide 36)

  63. - Sometimes you want to start EMR from inside a Python batch job
     - Or you want to start it as a Celery Task
     - In those cases you can also use EMR from within Python
     - Use boto.emr
     - Pulling handy utilities out of awscli and reusing them might be an option too
  64. Agenda (repeat of slide 36)

  65. # -*- coding: utf-8 -*-
     from datetime import datetime

     from boto.emr import connect_to_region
     from boto.emr.step import InstallHiveStep


     def setup_emr():
         # need to export AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
         # as environment variables.
         conn = connect_to_region('ap-northeast-1')
         install_step = InstallHiveStep(hive_versions='0.11.0.2')

         jobid = conn.run_jobflow(
             name='Create EMR [{}]'.format(datetime.today().strftime('%Y%m%d')),
             log_uri='s3://yourbucket/jobflow_logs/',
             ec2_keyname='your_key',
             master_instance_type='m1.medium',
             slave_instance_type='m1.medium', num_instances=3,
             action_on_failure='TERMINATE_JOB_FLOW', keep_alive=True,
             enable_debugging=False,
             hadoop_version='2.4.0',
             steps=[install_step],
             bootstrap_actions=[],
             instance_groups=None,
             additional_info=None,
             ami_version='3.1.1',
             api_params=None,
             visible_to_all_users=True,
             job_flow_role=None)

         return jobid


     if __name__ == '__main__':
         jobflow_id = setup_emr()
         # print() with a single argument behaves the same on Python 2 and 3
         print("JobFlowID: {} started.".format(jobflow_id))
  66. - Never put AWS credentials in the source code
       - Putting them in environment variables is also best avoided
       - It may be unavoidable when you want to test on a local machine
       - Control access with the IAM Role attached to the EC2 instance that starts EMR

  67. Agenda (repeat of slide 36)

  68. It got long, so this is only an excerpt to give the general idea:
     # conn, install_step, bucket_name, target_date and hive_version are set up
     # earlier; HiveStep comes from boto.emr.step.
     jobid = conn.run_jobflow(
         name='Create EMR and Exec hiveql [{}]'.format(target_date),
         log_uri='s3://{}/jobflow_logs/'.format(bucket_name),
         ec2_keyname='your_key',
         master_instance_type='m1.medium',
         slave_instance_type='m1.medium', num_instances=3,
         action_on_failure='TERMINATE_JOB_FLOW', keep_alive=True,
         enable_debugging=False,
         hadoop_version='2.4.0',
         steps=[install_step],
         bootstrap_actions=[],
         instance_groups=None,
         additional_info=None,
         ami_version='3.1.1',
         api_params=None,
         visible_to_all_users=True,
         job_flow_role=None)

     query_files = ['sample01.hql', 'sample02.hql']
     hql_steps = []
     for query_file in query_files:
         hql_step = HiveStep(
             name='Executing Query [{}]'.format(query_file),
             hive_file='s3n://{0}/hive-script/{1}'.format(
                 bucket_name, query_file),
             hive_versions=hive_version,
             hive_args=['-dTARGET_DATE={0}'.format(target_date),
                        '-dBUCKET_NAME={0}'.format(bucket_name)])
         hql_steps.append(hql_step)

     conn.add_jobflow_steps(jobid, hql_steps)
  69. Agenda (repeat of slide 36)

  70. - We want dependencies between batch jobs
       - e.g. when A finishes, run B and C at the same time
       - e.g. when A and B have both finished, run C
     - We want an easier way to manage when jobs start
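     The two dependency examples on this slide can be expressed as a small graph and resolved into "waves" of jobs that may run at the same time. The Python 3 sketch below is only an illustration of the requirement, not the scheduler used in the talk; the function name is hypothetical, and it assumes every job named as a dependency also appears as a key.

```python
def execution_waves(dependencies):
    """Order jobs into waves; jobs in the same wave can run concurrently.

    dependencies maps each job to the set of jobs that must finish first.
    """
    remaining = {job: set(deps) for job, deps in dependencies.items()}
    waves = []
    while remaining:
        ready = sorted(job for job, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("dependency cycle detected")
        waves.append(ready)
        for job in ready:
            del remaining[job]
        for deps in remaining.values():
            deps.difference_update(ready)
    return waves


# "When A finishes, run B and C at the same time"
fan_out = execution_waves({"A": set(), "B": {"A"}, "C": {"A"}})
# "When A and B have both finished, run C"
fan_in = execution_waves({"A": set(), "B": set(), "C": {"A", "B"}})
print(fan_out)
print(fan_in)
```

     A pipeline framework adds scheduling, retries, and visibility on top of this, which is the ground the next slides cover.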

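  71. - https://github.com/spotify/luigi
     - A pipeline management framework written in Python
     - Has a mechanism for writing MapReduce jobs easily on top of Hadoop Streaming
     - Dependencies are resolved purely in Python code
     - Dependency visualization (started as a separate service)
     - The visualization tool lacks finer features such as authentication
     - Supports running HiveQL
     - Supports running Pig
     - Supports S3 operations
     - Overkill for us at the moment
  72. - The admin screen uses Django
     - celery and celerybeat run on the same server
     - django-celery is configured to put specific tasks on the queue at specific times
     - celerybeat enqueues the tasks on schedule, and the celery worker picks them up and runs them
     - celery and Django can work together without django-celery, but this scheduling feature is convenient, so we still use it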
  73. References
     - https://github.com/aws/aws-cli
       - The official documentation and source
     - https://github.com/boto/boto
       - The official documentation and source

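  74. Kanmu is actively looking for new teammates

  75. Even if you only want to have a chat for now

  76. https://www.wantedly.com/projects/…

  77. A little behind-the-scenes on preparing for this PyCon

  78. - As of last Friday, the talk was about … lines of Markdown
     - I tried to take on the challenge of going Slideless
     - We held a review session inside the company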

  79. 

  80. 

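  81. Thank you very much!

  82. - This was my first technical talk
     - A good opportunity to put together what I have been doing at work
     - I want to know what other people think about while they work
     - I want to know why other companies chose the architectures they did

  83. Goal
     - Share a case study that focuses on "why this architecture?"
     - Share ways of using Python + EMR that go beyond the introductory level
     - Hear what everyone else is doing (that part is for me)

  84. Thank you for giving me the opportunity to speak this time

  85. Thank you very much!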