
The Fun of Building teratail's Analytics Platform with EFK

ikuwow
March 04, 2016


The Fun of Building teratail's Analytics Platform with EFK @ Yutori Engineer Meetup (ゆとりエンジニア交流会)



Transcript

 1. The Fun of Building teratail's Analytics Platform with EFK
  @ikuwow, Leverages Co., Ltd. (レバレジーズ株式会社), Technology Media Lab
  Yutori-Generation Engineers Meetup (ゆとり世代エンジニア交流会), 2016/03/04

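 2. Self-introduction
  • I develop teratail at the Technology Media Lab of Leverages Co., Ltd.
  • As a student, I interned at Slogan Inc. for about 1.5 years.
  • I mostly write PHP, but I also work on the front end, infrastructure, and various other things.
  • Recent work: building teratail's log analytics platform.
  @ikuwow
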
 3. Who here knows teratail?

 4. teratail
  • A Q&A site for engineers and programmers
  • 70 to 80 questions every day
  • Answer rate of about 93%
  • The fourth user meetup, 「集まっtail」, is scheduled for 3/17

 5. What I'll talk about today
  • I built a system that visualizes teratail's user behavior logs with the EFK stack (Elasticsearch, Fluentd, Kibana)
  • It's fun!
  • It's painful!!

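 6. I want to see what users are doing!
  1. Monitor in real time to detect risks, and casually follow what users have been doing recently.
  2. Hold the data in a form optimized for viewing KPIs, so it can be visualized deeply and quickly.
  (Also, writing HiveQL is tedious and very slow, so I wanted something faster...)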

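 7. The platform we built
  [Architecture diagram. Labeled components: Amazon S3, Amazon Redshift. Arrows show the flow of logs.]
  1. Real-time visualization
  2. Deep visualization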

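 8. A little more detail
  [Architecture diagram. Elasticsearch node labels, in the order they appear: "node.master: false, node.data: false"; "node.master: true, node.data: true"; "node.master: false, node.data: false"; "node.master: true, node.data: true". Other labels: Amazon Redshift, Amazon S3, the teratail team, near-real-time visualization, deep KPI visualization, batch processing.]
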
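As a reading aid for the two label pairs on this slide, here is a minimal sketch of what each node.master / node.data combination means in Elasticsearch 1.x and 2.x. The function name and the role descriptions are illustrative; the slide does not say which hosts carry which setting.

```python
def describe_node(master: bool, data: bool) -> str:
    """Name the role implied by a node.master / node.data pair."""
    if master and data:
        return "master-eligible data node"
    if master:
        return "dedicated master-eligible node"
    if data:
        return "data-only node"
    # Neither flag set: the node only routes requests and merges results.
    return "client node (holds no data, never elected master)"


# The two combinations that appear on the slide.
for master, data in [(False, False), (True, True)]:
    print(f"node.master={master}, node.data={data}: {describe_node(master, data)}")
```

A plausible reading is that the false/false nodes are the ones Fluentd and Kibana talk to, but that is an assumption rather than something the transcript states.
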
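 9. What is Fluentd?
  • A tool for parsing and aggregating logs
  • Made by Treasure Data (fairly popular in Japan)
  • Often compared with Logstash
  • Its buffering is smart: stopping it for about 5 minutes causes no problems at all
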
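 10. What is Elasticsearch?
  • A full-text search engine that has become popular recently. The 2.x series is the latest.
  • Made by Elastic (same as Logstash)
  • Easy to work with thanks to a cleanly RESTful API
  • Put nodes on the same network and they form a cluster for you
  • AWS recently released something called Elasticsearch Service
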
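To make the "RESTful API" point concrete, here is a small sketch that builds a request against Elasticsearch's `_count` endpoint using only the Python standard library. The base URL `http://localhost:9200` is a placeholder, and the index pattern `awesomelog-*` is borrowed from the index template shown later in the deck; neither is confirmed as the real production value.

```python
import json
import urllib.request


def build_count_request(base_url: str, index_pattern: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for <index_pattern>/_count."""
    url = f"{base_url.rstrip('/')}/{index_pattern}/_count"
    return urllib.request.Request(url, method="GET")


def fetch_count(request: urllib.request.Request) -> int:
    """Send the request and return the "count" field of the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["count"]


# Placeholder host; calling fetch_count(request) requires a reachable cluster.
request = build_count_request("http://localhost:9200", "awesomelog-*")
print(request.get_method(), request.full_url)
```

Building the request separately from sending it keeps the example usable without a running cluster.
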
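 11. What is Kibana?
  • A tool that uses Elasticsearch as its backend and visualizes that data in a very good-looking way
  • It is a node application, so it is very easy to set up
  • You can browse logs with a few mouse clicks
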
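 12. Characteristics of the EFK stack
  • Installation and administration are relatively easy
    • Fluentd can be installed with a one-liner
    • Elasticsearch forms a cluster on its own
    • Kibana is easy to install, and you can mostly use it just by looking at it
  • It seems reasonably mature by now?
  • Try it and the ease of use is obvious right away
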
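 13. What changed after building it?
  • Logs can now be viewed very easily, quickly, and in a good-looking form
    • Time savings
    • The whole team got into the habit of casually aggregating, visualizing, and analyzing logs, and it is fun to watch activity rise with each event
    • We can now empathize with the logs!
  • When an inquiry comes in, we can trace what the struggling user did
  • The cause of a bug was found by reproducing the user's actions from the logs
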
 14. It's fun!

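 15. Other things I want to do
  • Visualize and analyze Apache error logs and access logs
    • Fluentd provides ready-made templates, so this is very easy
    • Plotting response times would make it even more fun
  • Application framework error logs
    • Fluentd can handle multi-line logs too
  • Look at slow query logs and steadily eliminate bad queries
  Information that needs to be real-time becomes easy to view, so the setup has many applications.
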
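 16. What was painful
  • Writing a regular expression for the custom format we had been loading into Hadoop
  • About 1.3% of logs went missing => fixed
  • After adding an index template, Elasticsearch said it could not read the logs
  • Auto Scaling was too smart and terminated the instance we were using
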
 17. Writing a regular expression for the custom format we had been loading into Hadoop is painful

  <source>
    @type tail
    path /home/ikuo.degawa/hogehoge.logs
    pos_file /tmp/hogehoge.logs.pos
    format /^(?<dt>[^\t]+)\t(?<site_id>[^\t]*)\t(?<action>[^\t]*)\t(?<option>[^\t]*)\t(?<user_id>[^\t]*)\t(?<session_cookie>[^\t]*)\t(?<storage_cookie>[^\t]*)\t(?<view_type>[^\t]*)\t(?<user_agent>[^\t]*)\t(?<page_id>[^\t]*)\t(?<url>[^\t]*)\t(?<time>[^\t]*)\t(?<ip>[^\t]*)\t(?<segment>[^\t]*)\t(?<var>[^\t]*)\t(?<view>[^\t]*)\t(?<act>[^\t]*)\t(?<post0>[^\u0001]*)\u0001(?<post1>[^\u0001]*)\u0001(?<post2>[^\t]*)\t(?<search0>[^\u0001]*)\u0001(?<search1>[^\u0001]*)\u0001(?<search2>[^\u0001]*)\u0001(?<search3>[^\u0001]*)\u0001(?<search4>[^\u0001]*)\u0001(?<search5>[^\u0001]*)\u0001(?<search6>[^\u0001]*)\u0001(?<search7>[^\t]*)\t(?<user0>[^\u0001]*)\u0001(?<user1>[^\u0001]*)\u0001(?<user2>[^\u0001]*)\u0001(?<user3>[^\t]*)\t(?<other0>[^\u0001]*)\u0001(?<other1>[^\u0001]*)\u0001(?<other2>.*)$/
    tag mogmog-logs.gerogero
  </source>

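The layout that this regular expression encodes is tab-separated columns, the last four of which are sub-divided by the \u0001 character. As a sketch of that structure (not what the author ran in Fluentd), the same record can be produced by splitting instead of matching. The field names are copied from the regex; `parse_line` and the constant names are illustrative. Unlike the regex, this version does not require `dt` to be non-empty.

```python
SIMPLE_FIELDS = [
    "dt", "site_id", "action", "option", "user_id", "session_cookie",
    "storage_cookie", "view_type", "user_agent", "page_id", "url", "time",
    "ip", "segment", "var", "view", "act",
]
# (prefix, number of \u0001-separated parts) for the trailing grouped columns.
GROUPS = [("post", 3), ("search", 8), ("user", 4), ("other", 3)]
SUB_DELIMITER = "\u0001"


def parse_line(line: str):
    """Return a dict of named fields, or None if the line has the wrong shape."""
    expected_columns = len(SIMPLE_FIELDS) + len(GROUPS)
    # maxsplit keeps any extra tabs inside the final column, as `.*` does.
    columns = line.rstrip("\n").split("\t", expected_columns - 1)
    if len(columns) != expected_columns:
        return None
    record = dict(zip(SIMPLE_FIELDS, columns))
    for (prefix, size), column in zip(GROUPS, columns[len(SIMPLE_FIELDS):]):
        parts = column.split(SUB_DELIMITER, size - 1)
        if len(parts) != size:
            return None
        for index, part in enumerate(parts):
            record[f"{prefix}{index}"] = part
    return record
```

Each line yields 35 named fields (17 plain columns plus 3 + 8 + 4 + 3 grouped ones), which matches the number of named captures in the regex.
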
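 18. About 1.3% of logs went missing => fixed
  • The count in Kibana and the result of cat hoge.log | wc -l did not match!!
  • Discovered that, by design, Fluentd is slow to start reading the new file after a log rotation
  • Using read_from_head fixed it
  [Diagram labels: "previous" (file), "next" (file), "it was reading from around here"]
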
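The comparison on this slide is simple arithmetic: count the lines in the source file, count the documents that reached Elasticsearch, and express the gap as a percentage. A minimal sketch follows; the function names are illustrative, and the example counts are made up solely to reproduce a 1.3% gap.

```python
def count_lines(path: str) -> int:
    """Count newline characters in a file, like `wc -l`."""
    total = 0
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large log files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            total += chunk.count(b"\n")
    return total


def loss_percent(lines_in_file: int, docs_in_index: int) -> float:
    """Percentage of source lines that never showed up in the index."""
    if lines_in_file <= 0:
        raise ValueError("lines_in_file must be positive")
    return 100 * (lines_in_file - docs_in_index) / lines_in_file


if __name__ == "__main__":
    # Hypothetical counts, not figures from the talk.
    print(f"{loss_percent(100_000, 98_700):.1f}% missing")
```

For context, `read_from_head` is a parameter of Fluentd's tail input that makes it start reading a file from the beginning instead of from the end.
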
 19. After adding an index template, Elasticsearch said it could not read the logs
  • Index template: lets you specify the mapping applied when data enters Elasticsearch
  • Types can be decided based on the index name

  {
    "templates": "awesomelog-*",
    "settings": { "number_of_shards" : 1 },
    "mappings": {
      "awesomelogs" : {
        "properties" : {
          "@timestamp" : { "type" : "date", "format" : "strict_date_optional_time||epoch_millis" },
          "act0" : { "type" : "integer" },
          "act1" : { "type" : "integer" },
          "act10" : { "type" : "string", "index": "not_analyzed" },
          "act11" : { "type" : "string" },
          "act2" : { "type" : "integer" },
          "act3" : { "type" : "integer" },
          "act4" : { "type" : "string" },
          "act5" : { "type": "multi_field", "fields": {
  (truncated on the slide)

 20. I thought performance would improve, but...

  {
    "templates": "awesomelog-*",
    "settings": { "number_of_shards" : 1 },
    "mappings": {
      "awesomelogs" : {
        "properties" : {
          "@timestamp" : { "type" : "date", "format" : "strict_date_optional_time||epoch_millis" },
          "act0" : { "type" : "integer" },
          "act1" : { "type" : "integer" },
          "act10" : { "type" : "string", "index": "not_analyzed" },
          "act11" : { "type" : "string" },
          "act2" : { "type" : "integer" },
          "act3" : { "type" : "integer" },
          "act4" : { "type" : "string" },
          "act5" : { "type": "multi_field", "fields": {
  (truncated on the slide)

  • In reality, strings were arriving in fields where an int was expected (a bug in the logging implementation)
  • Elasticsearch threw errors when those logs arrived, and they kept piling up in fluentd's buffer
  • In the end, I only added not_analyzed

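One way to catch this kind of mismatch before it reaches Elasticsearch is to check each record against the types the mapping expects. The sketch below is an assumption about how such a check could look, not something described in the talk. The integer fields are the ones shown on the slide (act0 to act3); `EXPECTED_INT_FIELDS` and `split_by_mapping` are hypothetical names.

```python
# Fields the slide's mapping declares as "integer".
EXPECTED_INT_FIELDS = {"act0", "act1", "act2", "act3"}


def split_by_mapping(record: dict):
    """Coerce integer fields; return (clean_record, names_of_bad_fields)."""
    clean = {}
    bad_fields = []
    for key, value in record.items():
        if key in EXPECTED_INT_FIELDS:
            try:
                clean[key] = int(value)
            except (TypeError, ValueError):
                # A string such as "abc" arrived where an int was expected.
                bad_fields.append(key)
        else:
            clean[key] = value
    return clean, bad_fields


clean, bad = split_by_mapping({"act0": "3", "act1": "abc", "act10": "x"})
print(clean, bad)
```

Records with a non-empty list of bad fields could be logged or routed elsewhere instead of being retried indefinitely.
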
 21. Auto Scaling was too smart and terminated our instance!?

 22. "An availability zone was added, so to rebalance and improve availability, I'll delete this one and launch the next!"

 23. Lessons learned...
  • Fluentd needs little looking after, but know how it reads logs
  • It is better to keep Elasticsearch elastic
  • Auto Scaling Groups are smart

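 24. Summary
  • When logs can be viewed in a good-looking form in Kibana, you start to empathize with the data, and everyone on the team becomes someone who can see what users are doing
  • It's fun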

 25. We are hiring engineers. Thank you for listening.

 26. This book was a big help
  • It's a good book

 27. @ikuwow