Spark Streaming Use Case at Gunosy

moyomot
February 08, 2016

Transcript

  1. Spark Streaming Use Case at Gunosy
    Hadoop / Spark Conference Japan 2016
    Atsushi Morimoto

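  2. About Gunosy
    - Gunosy is an information curation service
    - Delivers information optimally to people around the world
    - Downloads: [number missing] 万 (×10,000) downloads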

  3. About me
    - Atsushi Morimoto
    - Engineer @ Gunosy
    - Log infrastructure, article delivery logic improvements, data analysis

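  4. Introducing our Spark Streaming use case
    - To deliver more relevant news, the article delivery algorithm uses access logs
    - The article delivery algorithm refers to access logs from the last N minutes
    - The existing mechanism for aggregating the last N minutes of access logs has problems, so we want to replace it with Spark Streaming
    - This talk covers an overview of the overall architecture and the details of the data-storage part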
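
The "last N minutes" aggregation described on this slide can be illustrated with a minimal sketch in plain Scala (no Spark involved). The `AccessLog` record, its fields, and `countLastMinutes` are hypothetical names chosen for illustration; the slides do not show the real log schema or aggregation code.

```scala
// Plain-Scala illustration only: count accesses per article in the last N minutes.
object RecentAccessCounts {
  // Hypothetical record shape; the real log schema is not shown in the slides.
  final case class AccessLog(articleId: String, timestampMillis: Long)

  /** Counts accesses per article with a timestamp in (now - N minutes, now]. */
  def countLastMinutes(logs: Seq[AccessLog], nowMillis: Long, minutes: Int): Map[String, Int] = {
    val windowStart = nowMillis - minutes * 60L * 1000L
    logs
      .filter(l => l.timestampMillis > windowStart && l.timestampMillis <= nowMillis)
      .groupBy(_.articleId)
      .map { case (id, hits) => id -> hits.size }
  }
}
```

In Spark Streaming the same idea would typically be expressed with a windowed operation such as `reduceByKeyAndWindow`; the slides do not say which operation was actually used.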

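  5. The previous setup
    - Fluentd wrote logs directly to the database
    - Issues
      - Scaling out is difficult
      - Because the values are near-real-time estimates, the log aggregates read by the article delivery algorithm contain errors
      - We want to minimize that error
    [Diagram labels: article delivery algorithm, API server]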

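  6. Choosing a streaming platform
    - Functional requirements
      - Reproduce the existing functionality, minimize error, and adapt flexibly when new log types are added
    - Operational requirements
      - To reduce operating costs, we want to rely mainly on AWS services
    - Overall architecture
    [Diagram labels: API server, Amazon Kinesis stream, Spark Streaming on Amazon EMR, Amazon RDS (MySQL), DynamoDB]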

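  7. Kinesis Stream and Spark Streaming
    - Kinesis Stream and Spark Streaming
      - Spark Streaming: a stream processing platform
      - Kinesis Stream: think of it as a fully managed Kafka (a distributed messaging platform)
      - A well-matched, tried-and-true combination
    - DynamoDB
      - Records how far Spark has read from Kinesis
      - The framework takes care of this for you
    - RDS (MySQL)
      - Stores the access logs aggregated by Spark
      - An RDB was chosen for compatibility with the existing system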
    [Diagram labels: Amazon Kinesis stream, Spark Streaming on Amazon EMR, Amazon RDS (MySQL), DynamoDB]
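
As a rough illustration of why the read position is recorded, here is a minimal plain-Scala sketch of checkpoint-and-resume. `CheckpointStore` is a hypothetical in-memory stand-in for the DynamoDB table, and integer offsets stand in for Kinesis sequence numbers; in the real setup the framework handles this bookkeeping, as the slide notes.

```scala
import scala.collection.mutable

object CheckpointSketch {
  // Hypothetical stand-in for the checkpoint table: shard id -> last processed offset.
  final class CheckpointStore {
    private val positions = mutable.Map.empty[String, Int]
    def save(shardId: String, position: Int): Unit = positions.update(shardId, position)
    def load(shardId: String): Option[Int] = positions.get(shardId)
  }

  /** Processes only the records after the stored checkpoint and returns them. */
  def consume(shardId: String, records: IndexedSeq[String], store: CheckpointStore): Seq[String] = {
    // Resume right after the last checkpoint, or from the start if none exists.
    val start = store.load(shardId).map(_ + 1).getOrElse(0)
    val pending = records.drop(start)
    // Record the new position so a restarted consumer does not re-read these records.
    if (pending.nonEmpty) store.save(shardId, records.length - 1)
    pending
  }
}
```

Because the position lives outside the consumer, a restarted job can pick up where the previous one stopped instead of re-reading the whole stream.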

  8. Considerations when using an RDB from Spark Streaming
    - The Databricks documentation lists caveats
      - https://databricks.gitbooks.io/databricks-spark-reference-applications/content/logs_analyzer/chapter3/save_an_rdd_to_a_database.html
      - "A common naive mistake is to open a connection on the Spark driver program, and then try to use that connection on the Spark workers."
    - Which SQL library?
      - Slick or ScalikeJDBC
      - Chose ScalikeJDBC
        - Bulk inserts are easy to write
        - Connection pooling is easy to use
    - Mind the shelf life of the data
      - Relative to the volume stored, how long will the data actually be used?
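
The mistake the Databricks documentation warns about (opening a connection on the driver and using it on the workers) can be reproduced without Spark. The sketch below assumes a hypothetical `FakeConnection` class standing in for a JDBC connection, which is likewise not serializable, and uses Java serialization, which Spark uses by default to ship tasks to workers. All names here are illustrative.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object DriverConnectionPitfall {
  // Hypothetical stand-in for a JDBC connection: deliberately not Serializable.
  final class FakeConnection { def write(row: String): Unit = () }

  // Naive task: carries a connection that was opened on the "driver".
  final class NaiveTask(val conn: FakeConnection) extends Serializable

  // Safer task: carries only settings and opens the connection where it runs.
  final class PerPartitionTask(val url: String) extends Serializable {
    def run(rows: Iterator[String]): Int = {
      val conn = new FakeConnection // opened on the "worker", once per partition
      var written = 0
      rows.foreach { r => conn.write(r); written += 1 }
      written
    }
  }

  /** Returns true if the task survives Java serialization, as shipping to a worker requires. */
  def canShip(task: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(task)
      true
    } catch { case _: NotSerializableException => false }
}
```

Here `canShip(new NaiveTask(new FakeConnection))` is false while a `PerPartitionTask` ships fine, which is the reason for opening connections inside `foreachPartition`.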

  9. Sample code for saving data
    import scalikejdbc._

    rdd.foreachPartition { data =>
      // Convert the records into the form ScalikeJDBC accepts
      val seqData = data.toSeq.map(…)

      // Set up the connection pool inside foreachPartition, so it is created on the worker
      ConnectionPool.singleton(DB_URL, DB_USER, DB_PASSWORD,
        ConnectionPoolSettings(connectionPoolFactoryName = "commons-dbcp2"))

      using(ConnectionPool.borrow()) { conn =>
        val db: DB = DB(conn)
        db.autoCommit { implicit session =>
          SQL(MY_INSERT_SQL)
            .batchByName(seqData: _*) // bulk insert
            .apply()
        }
      }
    }

  10. Summary
    - Spark Streaming and Kinesis Stream work well together
    - An RDB is a viable place to store streaming aggregation results
    - Choose an architecture that keeps operating costs down
