Slide 1

Slide 1 text

Tatsuhiko Kubo @cubicdaiya • Data Analysis Infrastructure Night #2, 2017/04/26 • Mercari's data analysis infrastructure / mercari log analysis

Slide 2

Slide 2 text

@cubicdaiya / Tatsuhiko Kubo Principal Engineer, SRE @ Mercari, Inc.

Slide 3

Slide 3 text

Agenda • Introduction to Mercari's data analysis infrastructure • Server log analysis infrastructure • Event based log analysis infrastructure • Machine learning analysis infrastructure

Slide 4

Slide 4 text

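Division of roles for data analysis @ Mercari • BI Analyst (Data Scientist) • Design the log data structures needed for analysis • Planning and proposals based on data analysis, KPI design, reporting and visualization • Software Engineer (Backend System, Site Reliability) • Develop, operate, and support the data analysis infrastructure • Software Engineer (Machine Learning / NLP / AI) • (Mainly) bridge specialized technologies such as machine learning and the service • Also responsible for research and development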

Slide 5

Slide 5 text

Agenda • Mercari's data analysis infrastructure • Server log analysis infrastructure • Event based log analysis infrastructure • Machine learning analysis infrastructure

Slide 6

Slide 6 text

Server log analysis infrastructure • app servers: access_log, application_log, app_error_log, error_log, php_log... • Stream Processing: Mackerel, Slack • batch: Filtering & Import logs to BigQuery

Slide 7

Slide 7 text

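Server log analysis infrastructure • Collect and forward each server's logs with Fluentd • Route them to services and middleware according to use • BigQuery: all logs for analysis are gathered here • Norikra: stream processing with SQL • etc… (e.g. Kibana, KPI reporting)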

Slide 8

Slide 8 text

Google BigQuery • Import log files in batches • over 1TB / day • The log files themselves are backed up to GCS and S3 • Google Cloud SDK & AWS CLI & Embulk

Slide 9

Slide 9 text

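Google BigQuery • The starting point for data analysis • Developers run queries for investigation • Data source for dashboards and various spreadsheets • Also used by internal services that make use of log data • Uses the flat-rate plan • No need to worry about per-query cost • However, slots are finite, so getting carried away and running too many heavy queries causes significant delays • Reduce the amount of processing by creating intermediate tables • Slot usage can be checked in Stackdriver (threshold alerts are also possible)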

Slide 10

Slide 10 text

Dashboard • Uses Chartio • https://chartio.com/ • Cloud-based BI service • Build dashboards from various data sources

Slide 11

Slide 11 text

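Google Spread Sheet & App Script • Google Spread Sheet • Download data from BigQuery, then aggregate and graph it • Many people are used to the Excel format, so sharing with non-engineers goes smoothly • Google App Script • Automatically run registered queries on a schedule to generate homegrown dashboards • Post the generated graph images to [logo]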

Slide 12

Slide 12 text

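Norikra • Aggregation with SQL over windows of a few minutes • Graph API requests / min • Send the nginx cache hit rate to Mackerel every minute and graph it • Notify Slack when a specific error occurs N or more times within M minutes • etc… • Powered by fluentd-plugin-(mackerel|slack|norikra)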
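The Slack alert rule on this slide (a specific error occurring N or more times within M minutes) can be sketched in plain Python. This is only an illustration of the idea, not the Norikra query Mercari runs: the class name `ErrorRateAlert`, its parameters, and the use of a sliding window rather than a batch window are all assumptions, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class ErrorRateAlert:
    """Fire when at least `threshold` errors fall within `window_sec` seconds.

    Hypothetical helper for illustration; not part of Norikra or Fluentd.
    """

    def __init__(self, threshold, window_sec):
        self.threshold = threshold
        self.window_sec = window_sec
        self.timestamps = deque()

    def observe(self, ts):
        """Record one error at epoch second `ts`; return True if the alert fires."""
        self.timestamps.append(ts)
        # Drop errors that have fallen out of the window ending at `ts`.
        while self.timestamps and self.timestamps[0] <= ts - self.window_sec:
            self.timestamps.popleft()
        return len(self.timestamps) >= self.threshold


# Three errors within five minutes trigger the alert; later, sparser errors do not.
alert = ErrorRateAlert(threshold=3, window_sec=300)
fired = [alert.observe(ts) for ts in (0, 100, 200, 600, 650)]
```

In a real pipeline the `True` result would be handed to whatever posts the notification; that step is left out here because the slides do not describe it.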

Slide 13

Slide 13 text

Visualize and Alerting by log analysis • Filter, Aggregate, Summarize by SQL • Visualize: Mackerel • Alerting: Slack

Slide 14

Slide 14 text

Stream processing by SQL

SELECT
  COUNT(*, upstream_cache_status = 'HIT') / COUNT(*) AS rate_hit,
  COUNT(*, upstream_cache_status = 'MISS') / COUNT(*) AS rate_miss,
  COUNT(*, upstream_cache_status = 'EXPIRED') / COUNT(*) AS rate_expired
FROM mercari_some_log.win:time_batch(1 min)
WHERE uri = '/some/api'

Cache hit rate / min for a certain nginx
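For readers unfamiliar with windowed SQL, the per-minute cache hit rate aggregation on this slide can be approximated in plain Python. This is a minimal sketch, not Mercari's implementation: the record layout (`time` as epoch seconds, `uri`, `upstream_cache_status`), the function name `cache_rates_per_minute`, and the sample data are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def cache_rates_per_minute(records, uri):
    """Group records for one URI into 1-minute tumbling windows and
    return {window_start: {status: ratio}} for HIT, MISS and EXPIRED."""
    windows = defaultdict(Counter)
    for rec in records:
        if rec["uri"] != uri:
            continue
        # Truncate the epoch timestamp (seconds) to the start of its minute.
        window_start = rec["time"] - rec["time"] % 60
        windows[window_start][rec["upstream_cache_status"]] += 1

    rates = {}
    for window_start, counts in sorted(windows.items()):
        # The denominator counts every record in the window, whatever its status.
        total = sum(counts.values())
        rates[window_start] = {
            status: counts[status] / total
            for status in ("HIT", "MISS", "EXPIRED")
        }
    return rates


# Hypothetical sample records; only "/some/api" is aggregated.
sample = [
    {"time": 0, "uri": "/some/api", "upstream_cache_status": "HIT"},
    {"time": 10, "uri": "/some/api", "upstream_cache_status": "HIT"},
    {"time": 20, "uri": "/some/api", "upstream_cache_status": "MISS"},
    {"time": 30, "uri": "/some/api", "upstream_cache_status": "EXPIRED"},
    {"time": 70, "uri": "/some/api", "upstream_cache_status": "HIT"},
    {"time": 75, "uri": "/other", "upstream_cache_status": "MISS"},
]
result = cache_rates_per_minute(sample, "/some/api")
```

Unlike a streaming engine, this version holds all records in memory and emits results only at the end; it is meant to show what each window computes, not how the stream is processed.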

Slide 15

Slide 15 text

Agenda • Mercari's data analysis infrastructure • Server log analysis infrastructure • Event based log analysis infrastructure • Machine learning analysis infrastructure

Slide 16

Slide 16 text

Pascal - Event based log analysis infrastructure • send events (Powered by cookpad/puree-(ios|android)) • OpenResty ×3 • hydra(※) ×3: in_tail & out_forward • BigQuery • Developer, Data Scientist: Analyze by SQL • utilize events • (※) fluent-agent-hydra

Slide 17

Slide 17 text

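Pascal - Event based log analysis infrastructure • Issues back when there was only the server log analysis infrastructure (at the time) • The source data needed for KPIs and analysis was scattered • Logs were close to raw, so they were hard to use (it took some know-how) • External analytics tools make things easy, but digging into details or combining them with multiple data sets and tools is difficult • Let's design and collect analysis-friendly logs from scratch so they can be used together with analysis tools! • So we decided to build a new log analysis infrastructure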

Slide 18

Slide 18 text

Pascal - Event based log analysis infrastructure • send events (Powered by cookpad/puree-(ios|android)) • OpenResty ×3 • hydra(※) ×3: in_tail & out_forward • BigQuery • Developer, Data Scientist: Analyze by SQL • utilize events • (※) fluent-agent-hydra

Slide 19

Slide 19 text

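Pascal - Event based log analysis infrastructure • over 10,000 records / sec (not requests / sec) • Aggregate and forward event-based logs per channel • In-app event logs (e.g. taps) • Open logs • A/B test logs • etc… • Aggregate and forward logs not only from the app but also from various subsystems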
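The idea of aggregating and forwarding events per channel can be sketched as a small buffer that batches events by channel name. The slides do not define what a channel is or how Pascal batches, so everything here is an assumption: a channel is treated as a named stream of one event type, and `ChannelBuffer`, `batch_size`, and the `forward` callback are hypothetical names.

```python
from collections import defaultdict


class ChannelBuffer:
    """Buffer events per channel and emit a batch once it reaches `batch_size`.

    Hypothetical helper for illustration; not the Pascal implementation.
    """

    def __init__(self, batch_size, forward):
        self.batch_size = batch_size
        self.forward = forward  # callable(channel, list_of_events)
        self.buffers = defaultdict(list)

    def add(self, channel, event):
        """Append one event and flush the channel when its batch is full."""
        buf = self.buffers[channel]
        buf.append(event)
        if len(buf) >= self.batch_size:
            self.flush(channel)

    def flush(self, channel):
        """Forward whatever is buffered for `channel`, if anything."""
        batch = self.buffers.pop(channel, [])
        if batch:
            self.forward(channel, batch)


# Collect forwarded batches in a list instead of sending them anywhere.
sent = []
buf = ChannelBuffer(batch_size=2, forward=lambda ch, batch: sent.append((ch, batch)))
buf.add("tap", {"id": 1})
buf.add("ab_test", {"id": 2})
buf.add("tap", {"id": 3})  # second "tap" event fills the batch and triggers a flush
buf.flush("ab_test")       # flush the partial batch explicitly
```

Batching per channel keeps each forwarded payload homogeneous, which makes it simpler to load every channel into its own destination table.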

Slide 20

Slide 20 text

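Pascal - Event based log analysis infrastructure • over 10,000 records / sec (not requests / sec) • Aggregate and forward event-based logs per channel • In-app event logs (e.g. taps) • Open logs • A/B test logs • etc… • Aggregate and forward logs not only from the app but also from various subsystems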

Slide 21

Slide 21 text

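A/B Testing new features • Flexibility secured by turning A/B testing into a framework • Dozens of A/B tests run concurrently • There are uses beyond A/B testing • Staged releases (10% -> 50% -> 100%) • Turning a feature itself On/Off • Results are analyzed in Google BigQuery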
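One common way to get A/B tests, staged releases, and feature On/Off from a single mechanism is deterministic hash bucketing. The slides do not say how Mercari's framework assigns users, so the following is a sketch under that assumption; the function names, the SHA-256 choice, and the 100-bucket granularity are all hypothetical.

```python
import hashlib


def bucket(user_id, experiment):
    """Map a user deterministically to a bucket in [0, 100) for one experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(user_id, experiment, rollout_percent):
    """True if the user falls inside the first `rollout_percent` buckets.

    0 switches the feature off for everyone, 100 switches it on for everyone.
    """
    return bucket(user_id, experiment) < rollout_percent


def variant(user_id, experiment, variants=("A", "B")):
    """Pick a variant for an A/B test from the same deterministic bucket."""
    return variants[bucket(user_id, experiment) % len(variants)]


# Staged release: users enabled at 10% stay enabled at 50% and at 100%.
users = range(1000)
at_10 = {u for u in users if is_enabled(u, "new_feature", 10)}
at_50 = {u for u in users if is_enabled(u, "new_feature", 50)}
at_100 = {u for u in users if is_enabled(u, "new_feature", 100)}
```

Because the experiment name is part of the hash input, each experiment splits users independently, which matters when dozens of tests run at the same time.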

Slide 22

Slide 22 text

Agenda • Mercari's data analysis infrastructure • Server log analysis infrastructure • Event based log analysis infrastructure • Machine learning analysis infrastructure

Slide 23

Slide 23 text

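Machine learning analysis infrastructure • Tech stack • Python, Django, scikit-learn, TensorFlow • Based on data in BigQuery • Demographic estimation • Category estimation • Labeling • Provided as an API so it can be used from many places • A dedicated team was formed around the end of last year and is now in full operation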

Slide 24

Slide 24 text

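Summary • Mercari's data analysis infrastructure • Server log analysis infrastructure • Event based log analysis infrastructure • Machine learning analysis infrastructure • Google BigQuery is the starting point for log data analysis • Integrates with various tools and services • Turning A/B testing into a reusable mechanism pays off in many ways • Systems using machine learning have also gone into full operation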

Slide 25

Slide 25 text

We are hiring! https://www.mercari.com/jp/jobs/