Gunosy's Log Collection Infrastructure (Gunosy のログ収集基盤)

mgi166
June 23, 2018

Transcript

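 1. Gunosy's Log Collection Infrastructure, Architecture Night #1, 茂木大夢 @mgi166
 2. Self-introduction • Gunosy Inc., Development and Operations Promotion Department • 茂木大夢 • Former Rails engineer • Now in charge of AWS and infrastructure at Gunosy
 3. Today's topic • Case studies of infrastructure architecture at Gunosy, centered on logging
 4. Collecting logs
 5. Why collect logs?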
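 6. Some example reasons • To see the response to measures we have rolled out • Learn how users reacted • To polish the product • Use logs as input to strengthen models built with machine learning • To investigate the cause when an error occurs • Without logs, the cause cannot be traced • Define alerts on the messages flowing through the logs so that anomalies are noticed • To produce records that serve as evidence when grilled by someone intimidating • Storing logs is itself the value provided to customers • e.g. Papertrail, mackerel.io, etc…
 7. Some example reasons • To see the response to measures we have rolled out • Learn how users reacted • To polish the product • Use logs as input to strengthen models built with machine learning • To investigate the cause when an error occurs • Without logs, the cause cannot be traced • Define alerts on the messages flowing through the logs so that anomalies are noticed • To produce records that serve as evidence when grilled by someone intimidating • Storing logs is itself the value provided to customers • e.g. Papertrail, mackerel.io, etc…
 8. Today's topic • How Gunosy collects logs and how it puts them to use • Use for analytics • Use for product improvement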
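 9. Analytics use
 10. First, a schematic of Gunosy (rough version) [Diagram: machine learning, data store, API server]
 11. This is the part related to analytics [Diagram: machine learning, data store, API server]
 12. How logs flow • fluentd sends them to S3 or BQ (BigQuery) • ETL or batch jobs store them in a DB • Visualized with Re:dash [Diagram: API server]
 13. Re:dash data sources are generated by batch [Diagram: raw log, Redshift, Re:dash, formatted log, parquet, BigQuery, batch, ECS, RDS]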
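The batch step that turns raw logs into a formatted data source for Re:dash might look roughly like the following minimal Python sketch. It assumes the raw logs are JSON lines with hypothetical `timestamp` and `event` fields; the deck does not show the actual schema or batch code, and a real job would load the result into Redshift, BigQuery, or RDS instead of returning a list.

```python
import json
from collections import Counter


def summarize_raw_log(lines):
    """Aggregate JSON-lines raw logs into per-day event counts.

    The field names "timestamp" and "event" are hypothetical.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of failing the batch
        if not isinstance(record, dict):
            continue
        if "timestamp" not in record or "event" not in record:
            continue
        day = record["timestamp"][:10]  # "YYYY-MM-DD" prefix of an ISO timestamp
        counts[(day, record["event"])] += 1
    # One row per (date, event), sorted so the output is deterministic
    return [
        {"date": day, "event": event, "count": count}
        for (day, event), count in sorted(counts.items())
    ]
```

Rows in this shape can be charted directly as a daily KPI once loaded into whichever database Re:dash queries.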
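 14. We look at Re:dash every day at the morning meeting
 15. Summary so far • The primary store for everyday logs is S3 and BQ • Logs are loaded into a DB as needed and visualized with Re:dash • The main KPIs visualized in Re:dash are shared at the morning meeting • (The architecture itself is fairly ordinary)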
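 16. Product improvement use
 17. What kinds of logs are used, and how? • What kinds of logs? • User behavior logs • Article click logs • Ad impressions • Various others… • How are they used? • They are put to use to deliver the most suitable information to users • They are used to show the most suitable articles to each individual user • To send users Push notifications they are interested in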
 18. Overview [Diagram: click/imp log, log, API server, ‟vector, "makes it nice" guy, ‟vector2, user vector, "makes it nice" guy] Source: Gunosyのパーソナライズを支える技術 -ワークフロー編- - Gunosy Tech Blog (https://tech.gunosy.io/entry/gunosy-personalize-digdag-workflow)
 19. Scoring article information [Diagram: click/imp log, log, API server, ‟vector, "makes it nice" guy, "makes it nice" guy, ‟vector2, user vector] Source: Gunosyのパーソナライズを支える技術 -ワークフロー編- - Gunosy Tech Blog (https://tech.gunosy.io/entry/gunosy-personalize-digdag-workflow)
 20. User information and article data • In addition to the logs accumulated in S3, user attribute information is periodically exported from RDS to S3 • Extraction of user information and similar data is managed as scheduled batch jobs • These run on digdag on ECS • The join of the extracted data with the log data, plus some secret sauce, is managed with Airflow on ECS [Diagram: user data, log, ECS, RDS, ECS, EMR, ‟vector, ‟vector]
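The join between the user attributes exported from RDS and the log data can be pictured with a minimal Python sketch. It assumes both sides are lists of dicts sharing a hypothetical `user_id` key; the deck runs the real join on Airflow on ECS and does not show its code or schema.

```python
def join_logs_with_users(log_rows, user_rows):
    """Inner-join log rows with user attribute rows on "user_id".

    "user_id" is a hypothetical key name. On a column name clash,
    the value from the log row wins.
    """
    users_by_id = {user["user_id"]: user for user in user_rows}
    joined = []
    for row in log_rows:
        user = users_by_id.get(row["user_id"])
        if user is None:
            continue  # inner join: drop logs that have no matching user
        joined.append({**user, **row})
    return joined
```

Building the lookup table once keeps the join linear in the number of log rows, which is the same idea a hash join uses at larger scale.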
 21. Updating user attributes [Diagram: click/imp log, log, API server, ‟vector, "makes it nice" guy, ‟vector2, user vector, GET, "makes it nice" guy]
 22. Updating user vectors in real time • Source: ユーザー行動の数理モデルと高速推薦システム (https://speakerdeck.com/mathetake/yusaxing-dong-falseshu-li-moteruto-gao-su-tui-jian-sisutemu) [Diagram: Click log stream, Push/Trim, Trigger, Put, Put, Batch Get, Batch Get, Click logger, Article Vectorizer, Crawler, Article Vector, User Vector] • Click logs are sent directly from the client to a Kinesis stream, then pass through Lambda into DynamoDB
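The click-log path (Kinesis stream, then Lambda, then DynamoDB) can be sketched as a Lambda-style handler in Python. Several things here are assumptions rather than the deck's actual design: the `user_id` and `article_id` payload fields are hypothetical, plain dicts stand in for the DynamoDB tables, and "Push/Trim" is interpreted as keeping only the most recent clicked article vectors and averaging them. The event shape (`Records[].kinesis.data`, base64-encoded) follows the standard Kinesis event that Lambda receives.

```python
import base64
import json

MAX_HISTORY = 3  # hypothetical window size for the push/trim step

# In-memory stand-ins for the DynamoDB tables (assumption for this sketch)
ARTICLE_VECTORS = {"a1": [1.0, 0.0], "a2": [0.0, 1.0]}
USER_HISTORY = {}
USER_VECTORS = {}


def update_user_vector(user_id, article_id):
    """Push the clicked article's vector, trim the history, re-average."""
    vector = ARTICLE_VECTORS.get(article_id)
    if vector is None:
        return None  # article has not been vectorized yet
    history = USER_HISTORY.setdefault(user_id, [])
    history.append(vector)       # push
    del history[:-MAX_HISTORY]   # trim to the most recent MAX_HISTORY entries
    size = len(history)
    USER_VECTORS[user_id] = [sum(column) / size for column in zip(*history)]
    return USER_VECTORS[user_id]


def handler(event, context=None):
    """Decode each Kinesis record and update the matching user vector."""
    updated = 0
    for record in event["Records"]:
        payload = base64.b64decode(record["kinesis"]["data"])
        click = json.loads(payload)
        if update_user_vector(click["user_id"], click["article_id"]) is not None:
            updated += 1
    return {"updated": updated}
```

Capping the history bounds the work per click, so each update costs the same no matter how long a user has been active.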
 23. Overview (revisited) [Diagram: click/imp log, log, API server, ‟vector, "makes it nice" guy, "makes it nice" guy, ‟vector2, user vector] Source: Gunosyのパーソナライズを支える技術 -1クリックで始まるパーソナライズ- - Gunosy Tech Blog (https://tech.gunosy.io/entry/realtime-vectorization-with-dynamodb)
 24. Summary so far • We improve the product based on user behavior logs • In particular, user vectors are updated in near real time from users' click logs • Some parts are not evaluated in real time and are handled by batch instead • These batches run on digdag on ECS
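 25. Summary
 26. Thoughts from running this in production • For better or worse, we are deeply immersed in AWS • Because everything is fully managed, initial adoption and operating costs are low, but I think there are many places where we are constrained by AWS's specifications • Logs, analytics DBs, and S3 buckets are scattered all over the place, so it feels chaotic • There are probably several cases of "are we even using this?" where the answer is "actually, hardly at all" • Schema management for logs used in analytics is a headache • We store logs as json and, as needed, create temporary tables that break the json apart • Data can be accessed freely, but every so often someone shows up who runs a full-scan query without specifying a partition • I want to avoid accidentally taking a hit money-wise, but taking away access rights does not feel right either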
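The "break the json apart into a temporary table" approach can be illustrated with a small Python sketch that flattens nested JSON into flat column names. The underscore-joined column naming and the sample fields are assumptions for illustration; the deck does not say how the temporary tables are actually built.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into one level, joining keys with "_".

    Lists and scalar values are kept as-is under their flattened key.
    """
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


# Hypothetical log line; the field names are not from the deck
row = flatten(json.loads('{"user": {"id": 1, "os": "ios"}, "event": "click"}'))
```

Each flattened dict maps directly onto one row of a temporary table, so the raw JSON can stay schemaless while analysts query stable column names.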
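 27. Summary • I introduced the logs Gunosy handles, split into "analytics use" and "product improvement use" • For analytics, the architecture is by the book • S3 -> intermediate data storage -> Re:dash • The main KPIs are graphed with Re:dash and shared every day at the morning meeting • For product improvement, logs are used in both batch and real time • AWS helps us enormously
 28. Thank you for listening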