Spark Streaming Use Case at Gunosy

moyomot
February 08, 2016

Transcript

  1. Spark Streaming Use Case at Gunosy
     Hadoop / Spark Conference Japan 2016
     Atsushi Morimoto

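  2. Introducing Gunosy
     • Gunosy is an information curation service
     • It aims to deliver information optimally to people around the world
     • Downloads: [figure missing from transcript] × 10,000 DL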

  3. About me
     • Atsushi Morimoto
     • Engineer @ Gunosy
     • Log infrastructure, article delivery logic improvements, data analysis

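  4. Introducing our Spark Streaming use case
     • To deliver more suitable news, the article delivery algorithm uses access logs
     • The article delivery algorithm refers to the access logs of the most recent N minutes
     • The existing mechanism for aggregating the last N minutes of access logs had problems, so we wanted to replace it with Spark Streaming
     • This talk covers an overview of the overall architecture and the details of the data storage part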
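Slide 4 says the delivery algorithm reads access logs from the most recent N minutes. As a rough illustration of that aggregation, here is a minimal plain-Scala sketch that does not use Spark. The `AccessLog` record, its field names, and `countRecent` are assumptions made for this example and do not come from the talk; a streaming job would presumably express the same idea with windowed operations.

```scala
// Hypothetical access-log record; the field names are assumptions, not from the talk.
final case class AccessLog(articleId: Long, timestampMillis: Long)

object RecentAccessCounts {
  /** Counts accesses per article in the window (now - windowMinutes, now]. */
  def countRecent(logs: Seq[AccessLog], nowMillis: Long, windowMinutes: Int): Map[Long, Int] = {
    val windowStart = nowMillis - windowMinutes * 60L * 1000L
    logs
      .filter(l => l.timestampMillis > windowStart && l.timestampMillis <= nowMillis)
      .groupBy(_.articleId)
      .map { case (id, hits) => id -> hits.size }
  }
}
```

Calling `RecentAccessCounts.countRecent(logs, System.currentTimeMillis(), 5)` would give per-article counts for the last five minutes of whatever `logs` holds.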
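  5. The previous setup
     • Fluentd wrote logs directly to the database
     • Problems
       • Scaling out was difficult
       • Because the figures are preliminary real-time values, the log aggregates that the article delivery algorithm refers to contain errors
       • We wanted to minimize that error
     (Diagram labels: API server, article delivery algorithm)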
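  6. Choosing a streaming platform
     • Functional requirements
       • Reproduce the existing functionality, minimize the error, and adapt flexibly when new log types are added
     • Operational requirements
       • To cut operating costs, build mainly on AWS services
     • Overall architecture
     (Diagram labels: API server, Amazon Kinesis stream, Amazon EMR running Spark Streaming, Amazon RDS MySQL, DynamoDB)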
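  7. Kinesis Stream + Spark Streaming
     • Kinesis Stream + Spark Streaming
       • Spark Streaming: stream processing platform
       • Kinesis Stream: think of it as a fully managed Kafka (a distributed messaging platform)
       • A tried-and-true combination that works well together
     • DynamoDB
       • Records how far Spark has read data from Kinesis
       • The framework takes care of this for you
     • RDS (MySQL)
       • Stores the access logs aggregated by Spark
       • An RDB was chosen for compatibility with the existing setup
     (Diagram labels: Amazon Kinesis stream, Amazon EMR running Spark Streaming, Amazon RDS MySQL, DynamoDB)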
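Slide 7 notes that DynamoDB records how far Spark has read from Kinesis and that the framework handles this automatically. To make the idea of a read-position checkpoint concrete, here is a minimal plain-Scala sketch. Everything in it is hypothetical: `CheckpointStore` is an in-memory stand-in for the DynamoDB table, and `Record` and `readNew` are illustrative names, not part of Spark or the Kinesis client.

```scala
import scala.collection.mutable

// In-memory stand-in for the DynamoDB table (an assumption for illustration only).
final class CheckpointStore {
  private val positions = mutable.Map.empty[String, Long]
  // Returns the last processed sequence number, or -1 if nothing was read yet.
  def load(shardId: String): Long = positions.getOrElse(shardId, -1L)
  def save(shardId: String, sequenceNumber: Long): Unit =
    positions.update(shardId, sequenceNumber)
}

// Hypothetical record shape: a sequence number plus a payload.
final case class Record(sequenceNumber: Long, payload: String)

object CheckpointedReader {
  /** Returns only records newer than the stored position, then advances it. */
  def readNew(shardId: String, stream: Seq[Record], store: CheckpointStore): Seq[Record] = {
    val last = store.load(shardId)
    val fresh = stream.filter(_.sequenceNumber > last)
    if (fresh.nonEmpty) store.save(shardId, fresh.map(_.sequenceNumber).max)
    fresh
  }
}
```

Because the position lives outside the reader, a restarted consumer can resume where the previous one stopped instead of re-reading the whole stream.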
  8. Considerations when using an RDB from Spark Streaming
     • The Databricks documentation lists the caveats
       • https://databricks.gitbooks.io/databricks-spark-reference-applications/content/logs_analyzer/chapter3/save_an_rdd_to_a_database.html
       • "A common naive mistake is to open a connection on the Spark driver program, and then try to use that connection on the Spark workers."
     • Which SQL library?
       • Slick or ScalikeJDBC
       • We chose ScalikeJDBC
         • Bulk inserts are easy to write
         • Connection pooling is easy to use
     • Mind the shelf life of the data
       • Given the volume of data being stored, how long will it actually be used?
  9. Sample code for saving data

     import scalikejdbc._

     rdd.foreachPartition { data =>
       // Convert each record into the format ScalikeJDBC accepts
       val seqData = data.toSeq.map(…)

       // Set up the connection pool inside foreachPartition (i.e. on the worker)
       ConnectionPool.singleton(DB_URL, DB_USER, DB_PASSWORD,
         ConnectionPoolSettings(connectionPoolFactoryName = "commons-dbcp2"))

       using(ConnectionPool.borrow()) { conn =>
         val db: DB = DB(conn)
         db.autoCommit { implicit session =>
           SQL(MY_INSERT_SQL)
             .batchByName(seqData: _*) // Bulk insert
             .apply()
         }
       }
     }
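The sample on slide 9 elides how each record is converted before it is passed to `batchByName`. One possible shape for that conversion is sketched below in plain Scala. The `ArticleCount` type, its fields, and `toBatchParams` are hypothetical names chosen for illustration; the talk does not show the real schema. The sketch assumes a ScalikeJDBC version whose `batchByName` takes sequences of `(Symbol, Any)` pairs; other versions may expect `String` keys instead, so check the version in use.

```scala
// Hypothetical aggregated row; the talk does not show the real schema.
final case class ArticleCount(articleId: Long, windowStartMillis: Long, count: Int)

object BatchParams {
  /** Turns each row into one list of named parameters, one list per INSERT. */
  def toBatchParams(rows: Seq[ArticleCount]): Seq[Seq[(Symbol, Any)]] =
    rows.map { r =>
      // These names would pair with {articleId}, {windowStartMillis}, {count}
      // placeholders in a hypothetical INSERT statement.
      Seq[(Symbol, Any)](
        Symbol("articleId") -> r.articleId,
        Symbol("windowStartMillis") -> r.windowStartMillis,
        Symbol("count") -> r.count
      )
    }
}
```

Keeping the conversion in a small pure function like this makes it testable without a database or a Spark cluster.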
  10. Summary
     • Spark Streaming and Kinesis Stream work well together
     • An RDB is a viable choice for storing streaming aggregation results
     • Choose an architecture that keeps operating costs down