Gunosy's Log Collection Infrastructure

mgi166
June 23, 2018

Transcript

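  1. Gunosy's Log Collection Infrastructure • Architecture Night #1 • 茂木大夢 (@mgi166)

  2. Self-introduction • Gunosy Inc., Development and Operations Promotion Department • 茂木大夢 • Former Rails engineer • Now responsible for AWS and infrastructure at Gunosy
  3. Today's topic • Case studies of infrastructure architecture at Gunosy, centered on logging

  4. Collecting logs

  5. Why collect logs?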

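  6. For example, reasons like these • To see the response to measures we have rolled out • Learn how users reacted • To polish the product further • Use logs as input to strengthen models built with machine learning • To investigate root causes when errors occur • Without logs, the cause cannot be traced • Define alerts on messages flowing through the logs to notice anomalies • To produce records as evidence when grilled by someone intimidating • Storing logs is itself the value delivered to customers • e.g. Papertrail, mackerel.io, etc…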
  7. For example, reasons like these (same list as slide 6)
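  8. Today's topic • How Gunosy collects logs and how it uses them • Use for analysis • Use for product improvement

  9. Analysis use

  10. First, a rough overview diagram of Gunosy (diagram labels: machine learning, data store, API server)

  11. The part related to analysis is here (diagram labels: machine learning, data store, API server)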

  12. How logs flow • fluentd ships them to S3 or BigQuery (BQ) • ETL or batch jobs store them in a DB • Visualized with Re:dash (diagram label: API server)
  13. Re:dash data sources are generated by batch jobs (diagram labels: raw log, formatted log, parquet, batch, ECS, Redshift, BigQuery, RDS, Re:dash)
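The batch step on slide 13 can be illustrated with a minimal sketch. It assumes raw logs arrive as JSON lines and that the batch writes one aggregated KPI table for the dashboard to query. The log fields, the `daily_clicks` table, and the use of SQLite as a stand-in for Redshift or RDS are all assumptions made for illustration, not details from the talk.

```python
import json
import sqlite3
from collections import Counter

# Hypothetical raw log lines, standing in for JSON objects that fluentd wrote to S3.
RAW_LOG_LINES = [
    '{"time": "2018-06-22T09:00:00", "event": "click", "user_id": 1}',
    '{"time": "2018-06-22T09:05:00", "event": "imp", "user_id": 2}',
    '{"time": "2018-06-23T10:00:00", "event": "click", "user_id": 1}',
    '{"time": "2018-06-23T11:30:00", "event": "click", "user_id": 3}',
]


def aggregate_daily_clicks(lines):
    """Count click events per calendar day (the date prefix of `time`)."""
    counts = Counter()
    for line in lines:
        record = json.loads(line)
        if record["event"] == "click":
            counts[record["time"][:10]] += 1
    return sorted(counts.items())


def load_kpi_table(conn, rows):
    """Replace the contents of the KPI table with freshly aggregated rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_clicks (day TEXT PRIMARY KEY, clicks INTEGER)"
    )
    conn.execute("DELETE FROM daily_clicks")
    conn.executemany("INSERT INTO daily_clicks (day, clicks) VALUES (?, ?)", rows)
    conn.commit()


# SQLite in memory is only a stand-in for the real analytics database.
conn = sqlite3.connect(":memory:")
load_kpi_table(conn, aggregate_daily_clicks(RAW_LOG_LINES))
kpi_rows = conn.execute("SELECT day, clicks FROM daily_clicks ORDER BY day").fetchall()
```

Rebuilding the table on every run keeps the batch idempotent, so a rerun after a failure does not double-count.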
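  14. We look at Re:dash every day at the morning meeting

  15. Summary so far • Logs normally land first in S3 and BQ • As needed, they are loaded into a DB and visualized with Re:dash • The main KPIs visualized in Re:dash are shared at the morning meeting • (The architecture itself is fairly ordinary)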
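  16. Product improvement use

  17. What kinds of logs are used, and how? • Which logs? • User behavior logs • Article click logs • Ad impressions • Various others… • How are they used? • To deliver the most suitable information to users • To show each individual user the most suitable articles • To send users Push notifications they are interested in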
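  18. Overall picture (diagram labels: click/imp log, log, API server, vector, "make-it-nice" worker, vector2, user vector, "make-it-nice" worker) # Gunosyのパーソナライズを支える技術 -ワークフロー編- (The technology behind Gunosy's personalization: workflow edition) - Gunosy Tech Blog (https://tech.gunosy.io/entry/gunosy-personalize-digdag-workflow)
  19. Scoring article information (diagram labels: click/imp log, log, API server, vector, "make-it-nice" worker, "make-it-nice" worker, vector2, user vector) # Gunosyのパーソナライズを支える技術 -ワークフロー編- (The technology behind Gunosy's personalization: workflow edition) - Gunosy Tech Blog (https://tech.gunosy.io/entry/gunosy-personalize-digdag-workflow)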
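  20. User information and article data • In addition to the logs accumulated in S3, user attribute information is periodically exported from RDS to S3 • Extraction of user information and similar data is managed as scheduled batch jobs • These run on digdag on ECS • The join of the extracted data with log data, plus some secret sauce, is managed with Airflow on ECS (diagram labels: user data, log, ECS, RDS, ECS, EMR, vector, vector)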
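The join described on slide 20 can be sketched roughly as follows. The talk does not show the actual job, so the field names (`user_id`, `article_id`, `age_group`) and the inner-join behavior are assumptions. In production this would run over data in S3 rather than over in-memory lists.

```python
# Hypothetical user attributes exported from RDS, keyed by user_id.
USER_ATTRIBUTES = {1: {"age_group": "20s"}, 2: {"age_group": "30s"}}

# Hypothetical click logs, standing in for log data already stored in S3.
CLICK_LOGS = [
    {"user_id": 1, "article_id": "a1"},
    {"user_id": 3, "article_id": "a2"},
    {"user_id": 2, "article_id": "a1"},
]


def join_logs_with_attributes(logs, attributes):
    """Inner-join click logs with user attributes on user_id."""
    joined = []
    for log in logs:
        attrs = attributes.get(log["user_id"])
        if attrs is None:
            continue  # skip logs whose user has no exported attributes
        joined.append({**log, **attrs})
    return joined


joined_rows = join_logs_with_attributes(CLICK_LOGS, USER_ATTRIBUTES)
```

Dropping unmatched logs is one possible policy. A left join that keeps them with empty attributes would be an equally reasonable choice.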
  21. Updating user attributes (diagram labels: click/imp log, log, API server, vector, "make-it-nice" worker, vector2, user vector, GET, "make-it-nice" worker)
  22. Updating user vectors in real time # ユーザー行動の数理モデルと高速推薦システム (A mathematical model of user behavior and a high-speed recommendation system) (https://speakerdeck.com/mathetake/yusaxing-dong-falseshu-li-moteruto-gao-su-tui-jian-sisutemu) (diagram labels: Click log stream, Push/Trim, Trigger, Put, Put, Batch Get, Batch Get, Click logger, Article Vectorizer, Crawler, Article Vector, User Vector) • Click logs are sent directly from the client to a Kinesis stream, then pass through Lambda into DynamoDB
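The Kinesis to Lambda to DynamoDB path on slide 22 could look roughly like the sketch below. It relies on the documented shape of Kinesis events delivered to Lambda (base64-encoded payloads under `Records[].kinesis.data`). Everything else is an assumption: the payload fields, the in-memory dictionaries standing in for DynamoDB tables, and the blending rule with `DECAY`, since the talk does not state how the user vector is actually updated.

```python
import base64
import json

# In-memory stand-ins for the DynamoDB tables (hypothetical names and contents).
ARTICLE_VECTORS = {"a1": [1.0, 0.0], "a2": [0.0, 1.0]}
USER_VECTORS = {}

DECAY = 0.5  # assumed blending weight; the talk does not give the update rule


def update_user_vector(user_id, article_id):
    """Blend the clicked article's vector into the user's current vector."""
    article = ARTICLE_VECTORS[article_id]
    current = USER_VECTORS.get(user_id, [0.0] * len(article))
    USER_VECTORS[user_id] = [
        DECAY * u + (1 - DECAY) * a for u, a in zip(current, article)
    ]


def handler(event, context=None):
    """Lambda-style entry point: decode each Kinesis record and apply the click."""
    for record in event["Records"]:
        payload = base64.b64decode(record["kinesis"]["data"])
        click = json.loads(payload)
        update_user_vector(click["user_id"], click["article_id"])
    return len(event["Records"])


def make_record(user_id, article_id):
    """Build a record shaped like the ones Lambda receives from Kinesis."""
    data = json.dumps({"user_id": user_id, "article_id": article_id}).encode()
    return {"kinesis": {"data": base64.b64encode(data).decode()}}


event = {"Records": [make_record("u1", "a1"), make_record("u1", "a2")]}
processed = handler(event)
```

In a real deployment the dictionary reads and writes would become DynamoDB get and put calls, which is where the "Batch Get" and "Put" arrows in the diagram would apply.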
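  23. Overall picture (repeated) (diagram labels: click/imp log, log, API server, vector, "make-it-nice" worker, "make-it-nice" worker, vector2, user vector) # Gunosyのパーソナライズを支える技術 -1クリックで始まるパーソナライズ- (The technology behind Gunosy's personalization: personalization that starts with one click) - Gunosy Tech Blog (https://tech.gunosy.io/entry/realtime-vectorization-with-dynamodb)
  24. Summary so far • The product is improved based on user behavior logs • In particular, user vectors are updated in near real time from user click logs • Some parts are not evaluated in real time and are handled by batch jobs instead • These batch jobs run on digdag on ECS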
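  25. Summary

  26. Thoughts from operating this • For better or worse, we are deeply immersed in AWS • Being fully managed keeps initial adoption and operating costs low, but in many places we are constrained by how AWS works • The DBs and S3 buckets for logs and analysis are scattered all over, which feels chaotic • There are probably several cases of "are we still using this?" that turn out to be barely used • Schema management for logs used in analysis is a headache • We store logs as json and, as needed, create temporary tables that break the json apart • Data is freely accessible, but every so often someone runs a full-scan query without specifying a partition • We want to avoid an accidental financial hit, but taking away access rights does not feel right either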
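The workaround from slide 26, storing logs as json and breaking them into a temporary table when needed, might be sketched like this. The nested log layout, the `clicks_flat` table, and the explicit check that a partition (here a day) is supplied are illustrative assumptions. SQLite stands in for the real warehouse.

```python
import json
import sqlite3

# Hypothetical (partition day, raw JSON payload) pairs as stored in the log table.
RAW_ROWS = [
    ("2018-06-22", '{"user_id": 1, "article": {"id": "a1", "category": "sports"}}'),
    ("2018-06-23", '{"user_id": 2, "article": {"id": "a2", "category": "tech"}}'),
]


def flatten(obj, prefix=""):
    """Flatten nested dicts into one dict with underscore-joined keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


def build_temp_table(conn, rows, day):
    """Create a temporary table from the JSON rows of a single partition."""
    if not day:
        # Refuse to scan every partition, mirroring the full-scan concern.
        raise ValueError("a partition (day) must be specified")
    conn.execute(
        "CREATE TEMP TABLE clicks_flat "
        "(day TEXT, user_id INTEGER, article_id TEXT, article_category TEXT)"
    )
    for row_day, payload in rows:
        if row_day != day:
            continue
        flat = flatten(json.loads(payload))
        conn.execute(
            "INSERT INTO clicks_flat VALUES (?, ?, ?, ?)",
            (row_day, flat["user_id"], flat["article_id"], flat["article_category"]),
        )


conn = sqlite3.connect(":memory:")
build_temp_table(conn, RAW_ROWS, day="2018-06-23")
flat_rows = conn.execute("SELECT * FROM clicks_flat").fetchall()
```

Requiring the partition argument in a helper like this is one way to limit accidental full scans without revoking anyone's access.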
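  27. Summary • Introduced the logs Gunosy handles, split into "analysis use" and "product improvement use" • For analysis, the architecture follows the textbook • S3 -> intermediate data storage -> Re:dash • Key KPIs are graphed with Re:dash and shared every day at the morning meeting • For product improvement, logs are used in both batch and real time • AWS helps us a great deal
  28. Thank you for listening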