
Gunosy's Log Collection Infrastructure (Gunosy のログ収集基盤)

mgi166
June 23, 2018


Transcript

  1. Self-introduction
    • Gunosy Inc., Development and Operations Promotion Department
    • 茂木大夢
    • Former Rails engineer
    • Currently in charge of AWS and infrastructure at Gunosy
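  2. Some example reasons
    • To see how users respond to the measures we ship
    • To learn how users actually reacted
    • To polish the product
    • Use logs as input to strengthen models built with machine learning
    • Root-cause investigation when an error occurs
    • Without logs, the cause cannot be traced
    • Define alerts on the messages flowing through the logs to notice anomalies
    • To produce records that serve as evidence when grilled by someone intimidating
    • Storing logs is itself the value provided to the customer
    • e.g. Papertrail, mackerel.io, etc…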
  4. How the logs flow
    • fluentd sends logs to S3 or BQ
    • ETL or batch jobs store them in a DB
    • Re:dash visualizes them
    • Diagram label: API server
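  5. Summary so far
    • Logs normally land first in S3 and BQ
    • They are loaded into a DB as needed and visualized with Re:dash
    • The main KPIs visualized in Re:dash are shared at the morning meeting
    • (The architecture itself is fairly ordinary)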
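  6. Which logs are used, and how?
    • Which logs?
    • User behavior logs
    • Article click logs
    • Ad impressions
    • Various others…
    • How are they used?
    • To deliver the most suitable information to users
    • To show each individual user the most suitable articles
    • To send Push notifications that match the user's interests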
  7. Overall picture
    • Diagram labels: click/imp log, log, API server, vector, vector2, user vector, "makes it nice" man (いい感じにするマン, shown twice)
    # Gunosyのパーソナライズを支える技術 -ワークフロー編- - Gunosy Tech Blog (https://tech.gunosy.io/entry/gunosy-personalize-digdag-workflow)
  8. Scoring article information
    • Diagram labels: click/imp log, log, API server, vector, vector2, user vector, "makes it nice" man (いい感じにするマン, shown twice)
    # Gunosyのパーソナライズを支える技術 -ワークフロー編- - Gunosy Tech Blog (https://tech.gunosy.io/entry/gunosy-personalize-digdag-workflow)
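  9. User information and article data
    • Besides the logs accumulated in S3, user attribute information is also exported periodically from RDS to S3
    • Extraction of user information and similar data is managed as scheduled batch jobs
    • These run on digdag on ECS
    • The join of the extracted data with the log data, plus some secret sauce, is managed with Airflow on ECS
    • Diagram labels: user data, log, ECS, RDS, ECS, EMR, vector, vector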
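The join of exported user attributes with log data mentioned on slide 9 can be pictured with a small sketch. This is only an illustration in plain Python: the field names (`user_id`, `age_group`, `article_id`) and the sample records are hypothetical, and the real pipeline runs on Airflow on ECS with details the slides do not disclose.

```python
# Hypothetical sample data; field names are illustrative, not Gunosy's schema.
USER_ATTRIBUTES = [
    {"user_id": "u1", "age_group": "20s"},
    {"user_id": "u2", "age_group": "30s"},
]
CLICK_LOGS = [
    {"user_id": "u1", "article_id": "a1"},
    {"user_id": "u1", "article_id": "a2"},
    {"user_id": "u2", "article_id": "a1"},
    {"user_id": "u9", "article_id": "a3"},  # no matching user attributes
]


def join_logs_with_users(logs, users):
    """Inner-join log rows with user attribute rows on user_id."""
    users_by_id = {u["user_id"]: u for u in users}
    joined = []
    for log in logs:
        user = users_by_id.get(log["user_id"])
        if user is None:
            continue  # drop log rows without a known user
        joined.append({**log, **user})
    return joined


joined = join_logs_with_users(CLICK_LOGS, USER_ATTRIBUTES)

# Count clicks per attribute value as a simple downstream aggregation.
clicks_by_age_group = {}
for row in joined:
    key = row["age_group"]
    clicks_by_age_group[key] = clicks_by_age_group.get(key, 0) + 1
```

Building the lookup dictionary once keeps the join linear in the number of log rows, which is the same idea a hash join uses at larger scale.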
  10. Updating user vectors in real time
    # ユーザー行動の数理モデルと高速推薦システム (https://speakerdeck.com/mathetake/yusaxing-dong-falseshu-li-moteruto-gao-su-tui-jian-sisutemu)
    • Diagram labels: Click log stream, Push/Trim, Trigger, Put, Put, Batch Get, Batch Get, Click logger, Article Vectorizer, Crawler, Article Vector, User Vector
    • Click logs are sent directly from the client to a Kinesis stream, then go through Lambda into DynamoDB
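The click-log path on slide 10 (client to a Kinesis stream, through Lambda, into DynamoDB) could look roughly like the handler below. It is a minimal sketch under several assumptions: the in-memory dictionaries stand in for DynamoDB tables, the `user_id` and `article_id` fields are hypothetical, and the exponential-moving-average update with `DECAY` is an illustrative choice, not the update rule Gunosy uses. The one detail taken from AWS itself is that a Kinesis-triggered Lambda receives each payload base64-encoded under `Records[].kinesis.data`.

```python
import base64
import json

# Hypothetical in-memory stand-ins for DynamoDB tables (assumption: the real
# system would read article vectors from and write user vectors to DynamoDB).
ARTICLE_VECTORS = {"a1": [1.0, 0.0], "a2": [0.0, 1.0]}
USER_VECTORS = {}

DECAY = 0.9  # hypothetical weight kept from the user's previous vector


def update_user_vector(user_id, article_id):
    """Blend the clicked article's vector into the user's vector."""
    article = ARTICLE_VECTORS[article_id]
    current = USER_VECTORS.get(user_id, [0.0] * len(article))
    updated = [DECAY * u + (1.0 - DECAY) * a for u, a in zip(current, article)]
    USER_VECTORS[user_id] = updated
    return updated


def handler(event, context=None):
    """Process a Kinesis-triggered Lambda event containing click logs."""
    processed = 0
    for record in event["Records"]:
        # Kinesis delivers each payload base64-encoded under kinesis.data.
        payload = base64.b64decode(record["kinesis"]["data"])
        click = json.loads(payload)
        update_user_vector(click["user_id"], click["article_id"])
        processed += 1
    return processed


def make_event(clicks):
    """Build a Kinesis-shaped event from plain click dicts (for local runs)."""
    return {
        "Records": [
            {"kinesis": {"data": base64.b64encode(json.dumps(c).encode()).decode()}}
            for c in clicks
        ]
    }
```

Calling `handler(make_event([{"user_id": "u1", "article_id": "a1"}]))` locally exercises the same decoding path a real invocation would, without any AWS dependency.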
  11. Overall picture (revisited)
    • Diagram labels: click/imp log, log, API server, vector, vector2, user vector, "makes it nice" man (いい感じにするマン, shown twice)
    # Gunosyのパーソナライズを支える技術 -1クリックで始まるパーソナライズ- - Gunosy Tech Blog (https://tech.gunosy.io/entry/realtime-vectorization-with-dynamodb)
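  12. Thoughts from operating this
    • For better or worse, we are deeply immersed in AWS
    • Everything is fully managed, so initial adoption and operating costs are low, but I think we are constrained by AWS's specifications in various places
    • Logs, analytics DBs, and S3 buckets are scattered all over, so it feels chaotic
    • There are surely several cases of "are we even using this?" that turn out to be barely used
    • Schema management for logs used in analytics is a headache
    • We store logs as JSON and, as needed, create temporary tables that unpack the JSON
    • Data can be accessed freely, but every so often someone runs a full-scan query without specifying a partition
    • We want to avoid an accidental hit to the budget, but taking away access rights does not feel right either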
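The approach on slide 12 of keeping logs as JSON and unpacking them into a temporary table on demand can be sketched as follows. This is an illustration only: it uses Python's built-in `sqlite3` as a stand-in for the actual analytics store, and the log fields (`user_id`, `event`, `article_id`, `dt`) and the table name `clicks_tmp` are hypothetical. It also filters on a date key before loading, mirroring the point about specifying a partition instead of scanning everything.

```python
import json
import sqlite3

# Hypothetical raw log lines; the field names are illustrative only.
RAW_LOGS = [
    '{"user_id": "u1", "event": "click", "article_id": "a1", "dt": "2018-06-22"}',
    '{"user_id": "u2", "event": "imp", "article_id": "a2", "dt": "2018-06-23"}',
    '{"user_id": "u1", "event": "click", "article_id": "a2", "dt": "2018-06-23"}',
]


def load_temp_table(conn, lines, dt):
    """Unpack JSON logs for one date partition into a temporary table."""
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS clicks_tmp "
        "(user_id TEXT, event TEXT, article_id TEXT, dt TEXT)"
    )
    rows = []
    for line in lines:
        record = json.loads(line)
        # Filter on the partition key first so only the requested day is loaded.
        if record.get("dt") != dt:
            continue
        rows.append(
            (
                record.get("user_id"),
                record.get("event"),
                record.get("article_id"),
                record.get("dt"),
            )
        )
    conn.executemany("INSERT INTO clicks_tmp VALUES (?, ?, ?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_temp_table(conn, RAW_LOGS, dt="2018-06-23")
clicks = conn.execute(
    "SELECT COUNT(*) FROM clicks_tmp WHERE event = 'click'"
).fetchone()[0]
```

Using `record.get` means a log line that lacks a field still loads with a NULL column, which is one way schemaless JSON can coexist with a fixed temporary-table layout.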
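  13. Summary
    • We presented the logs Gunosy handles in two groups: "analytics use" and "product improvement use"
    • For analytics, the architecture follows the textbook
    • S3 -> intermediate data storage -> Re:dash
    • Key KPIs are graphed with Re:dash and shared every day at the morning meeting
    • For product improvement, logs are used in both batch and real time
    • AWS helps us enormously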