adhoc analysis apache spark

moyomot
June 23, 2015

Transcript

  1. Apache Spark in action for ad hoc analysis • Spark Casual Talk • @moyomot_

  2. Self-introduction • Atsushi Morimoto • Engineer @ Gunosy • Data analysis, server-side, app development

  3. What I want to convey today: to try Spark quickly, EMR is recommended

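  4. Data analysis platform • Redshift is the mainstay of our day-to-day data analysis • Being able to aggregate data with SQL is a major strength

    [Diagram labels: API servers, fluentd, "shape the required parts", S3, Redshift]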
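  5. Analyzing and aggregating data that is not in Redshift • Features are added to the app every day • Sometimes we want a one-off aggregation of data that is not in Redshift • All logs are stored in S3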

  6. What is nice about Spark • Works well with S3 • The JSON part of a record is easy to extract • Spark SQL and DataFrame are available ※ Code sketch:

    val textFile = sc.textFile("s3n://bucket-name/path/*/*.gz")
    val json = textFile.map(…)
    val table = sqlContext.jsonRDD(json)
    …
    val data = sqlContext.sql("SELECT … ")
  7. Operating a distributed platform is hard • We do not want to build a cluster every time for a small aggregation, and operating a distributed platform is even harder

    aws emr create-cluster --name SparkCluster --ami-version … --instance-type m….xlarge --instance-count … \
      --ec2-attributes KeyName=MYKEY --applications Name=Hive \
      --bootstrap-actions Path=s3://support.elasticmapreduce/spark/install-spark

    https://github.com/awslabs/emr-bootstrap-actions/tree/master/spark
    http://qiita.com/shunsukeaihara/items/…
    https://aws.amazon.com/jp/blogs/aws/new-apache-spark-on-amazon-emr/

    Elastic MapReduce • Spark is now officially supported: "I'm happy to announce that Amazon EMR now supports Apache Spark."
  8. Summary: to try Spark quickly, EMR is recommended
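
The code sketch on slide 6 leaves the `textFile.map(…)` step elided. As one possible reading of "the JSON part is easy to extract", the following is a minimal sketch in plain Scala. It assumes each log line in S3 has the tab-separated `time`, `tag`, `json` layout that fluentd's file-style output commonly produces; that layout, the `LogLines` object, and the `extractJson` name are assumptions for illustration, not something the slides state.

```scala
// Hypothetical helper for the elided `textFile.map(…)` step on slide 6.
// Assumption: every line looks like "<time>\t<tag>\t<json>".
// Adjust the split if your logs use a different layout.
object LogLines {
  /** Returns the JSON payload of one log line, or None if the line is malformed. */
  def extractJson(line: String): Option[String] =
    // Limit the split to 3 fields so tabs inside the JSON payload are preserved.
    line.split("\t", 3) match {
      case Array(_, _, json) if json.trim.startsWith("{") => Some(json.trim)
      case _ => None
    }
}
```

In a Spark shell this could stand in for the elided step, for example as `textFile.flatMap(line => LogLines.extractJson(line))`, which would also drop lines that do not match the assumed layout before they reach `sqlContext.jsonRDD`.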