Slide 1

Slide 1 text

Apache Spark in Action for Ad Hoc Analysis
Spark Casual Talk
@moyomot_

Slide 2

Slide 2 text

Self-introduction
• Atsushi Morimoto
• Engineer @ Gunosy
• Data analysis, server-side, app development

Slide 3

Slide 3 text

What I want to tell you today
To try out Spark quickly, EMR is the way to go.

Slide 4

Slide 4 text

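Data analysis platform
• Redshift is the backbone of our day-to-day data analysis
• Being able to aggregate data with SQL is a major strength
[Diagram: three API servers, each with fluentd; S3; Redshift; label: "Reshape the required parts"]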

Slide 5

Slide 5 text

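Analyzing and aggregating data that is not in Redshift
• New app features are added every day
• Sometimes we want a one-off aggregation of data that is not in Redshift
• All logs are stored in S3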

Slide 6

Slide 6 text

What makes Spark nice
• Works well with S3
• Easy to pull out the JSON part of each log line
• Spark SQL and DataFrame are available
* Code sketch
    val textFile = sc.textFile("s3n://bucket-name/path/*/*.gz")
    val json = textFile.map(…)
    val table = sqlContext.jsonRDD(json)
    …
    val data = sqlContext.sql("SELECT … ")
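The `textFile.map(…)` step above is where the JSON part of each log line gets pulled out. A minimal sketch of such a helper follows, assuming each line uses a tab-separated `<time>\t<tag>\t<json>` layout; the slides do not state the actual log format, and the names `LogLine` and `extractJson` are hypothetical.

```scala
// Hypothetical helper, not from the slides: extracts the JSON record from one log line.
// Assumption: lines look like "<time>\t<tag>\t<json>". Adjust if the real logs differ.
object LogLine {
  def extractJson(line: String): Option[String] = {
    // Limit the split to 3 fields so tab characters inside the JSON payload are preserved.
    val fields = line.split("\t", 3)
    if (fields.length == 3 && fields(2).trim.startsWith("{")) Some(fields(2).trim)
    else None // malformed lines are skipped instead of failing the whole job
  }
}
```

In a Spark job this could be applied as `textFile.flatMap(line => LogLine.extractJson(line).toList)`, so that lines without a JSON part are dropped before the result is handed to `sqlContext.jsonRDD`.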

Slide 7

Slide 7 text

Operating a distributed platform is hard
• We don't want to build a cluster every time we need a small aggregation, and operating a distributed platform is even harder

    aws emr create-cluster --name SparkCluster --ami-version <version> \
      --instance-type m<N>.xlarge --instance-count <count> \
      --ec2-attributes KeyName=MYKEY --applications Name=Hive \
      --bootstrap-actions Path=s3://support.elasticmapreduce/spark/install-spark

https://github.com/awslabs/emr-bootstrap-actions/tree/master/spark
http://qiita.com/shunsukeaihara/items/…
https://aws.amazon.com/jp/blogs/aws/new-apache-spark-on-amazon-emr/

Elastic MapReduce
• Spark is now officially supported
"I'm happy to announce that Amazon EMR now supports Apache Spark."

Slide 8

Slide 8 text

Summary
To try out Spark quickly, EMR is the way to go.