
adhoc analysis apache spark

moyomot
June 23, 2015


Transcript

  1. Apache Spark at Work in Ad Hoc Analysis
    Spark Casual Talk
    [email protected]


  2. About me
    • Atsushi Morimoto
    • Engineer @ Gunosy
    • Data analysis, server-side, and app development


  3. What I want to tell you today
    To try Spark quickly,
    EMR is recommended


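  4. Data analysis platform
    • Redshift is the backbone of our day-to-day data analysis
    • Being able to aggregate data with SQL is a big strength
    [Diagram: API server + fluentd (×3) → S3 → Redshift; label: "reshape only the parts needed"]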


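  5. Analyzing and aggregating data that is not in Redshift
    • New features are added to the app day by day
    • Sometimes we want a one-off aggregation of data that is not in Redshift
    • All logs are stored in S3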


  6. What is nice about Spark
    • It works well with S3
    • Extracting the JSON part of each log line is easy
    • Spark SQL and DataFrame are available
    ※ Code sketch
    val textFile = sc.textFile("s3n://bucket-name/path/*/*.gz")
    val json = textFile.map(…)
    val table = sqlContext.jsonRDD(json)

    val data = sqlContext.sql("SELECT … ")
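
    The slide elides the body of `textFile.map(…)`. As a rough illustration only, the sketch below shows one way the JSON part of a line could be pulled out in plain Scala. It assumes each log line has the tab-separated layout `<time>\t<tag>\t<json>`, which the slides do not confirm; `ExtractJson` and `jsonPart` are hypothetical names, not the author's code.

    ```scala
    // Hypothetical helper, not from the original deck.
    // Assumption: each log line looks like "<time>\t<tag>\t<json record>".
    object ExtractJson {

      /** Returns the JSON portion of a log line, or None when the line
        * does not have three tab-separated fields ending in a JSON object. */
      def jsonPart(line: String): Option[String] =
        // Limit the split to 3 so that tabs inside the JSON stay intact.
        line.split("\t", 3) match {
          case Array(_, _, json) if json.trim.startsWith("{") => Some(json.trim)
          case _                                               => None
        }

      def main(args: Array[String]): Unit = {
        // Sample lines invented for illustration.
        val lines = Seq(
          "2015-06-23T00:00:00Z\tapp.access\t{\"user\":1,\"action\":\"open\"}",
          "malformed line without tabs"
        )
        // Keep only the lines that carry a JSON record.
        lines.flatMap(jsonPart).foreach(println)
      }
    }
    ```

    Under the same assumption, the elided step might read `textFile.flatMap(line => ExtractJson.jsonPart(line))`, which silently drops malformed lines. Note also that in Spark 1.x a DataFrame has to be registered, for example with `registerTempTable`, before `sqlContext.sql` can refer to it by name; the code sketch on the slide leaves that step out.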


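  7. Operating a distributed platform is hard
    • We do not want to build a cluster every time we need a small aggregation, and operating a distributed platform is even harder
    aws emr create-cluster --name SparkCluster --ami-version … \
      --instance-type m….xlarge --instance-count … \
      --ec2-attributes KeyName=MYKEY --applications Name=Hive \
      --bootstrap-actions Path=s3://support.elasticmapreduce/spark/install-spark
    https://github.com/awslabs/emr-bootstrap-actions/tree/master/spark
    http://qiita.com/shunsukeaihara/items/…
    https://aws.amazon.com/jp/blogs/aws/new-apache-spark-on-amazon-emr/
    • Spark is now officially supported on Elastic MapReduce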

    I’m happy to announce that Amazon EMR now supports Apache Spark.


  8. Summary
    To try Spark quickly,
    EMR is recommended
