Building teratail's Analytics Platform with EFK: Lots of Fun Along the Way

ikuwow
March 04, 2016

Building teratail's Analytics Platform with EFK: Lots of Fun Along the Way @ Yutori Engineer Meetup (ゆとりエンジニア交流会)

Transcript

  1. Building teratail's Analytics Platform with EFK: Lots of Fun Along the Way. @ikuwow, Leverages Co., Ltd., Technology Media Lab. Yutori-Generation Engineer Meetup (2016/03/04)

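  2. About me (@ikuwow)
     • Works at Leverages Co., Ltd., Technology Media Lab, developing teratail.
     • Interned at Slogan Inc. for about 1.5 years as a student.
     • Writes code mostly in PHP, but also does front-end work, infrastructure, and various other things.
     • Recent work: building teratail's log analytics platform.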
  3. Who here knows teratail?

  4. teratail
     • A Q&A site for engineers and programmers
     • 70-80 questions per day
     • Answer rate of about 93%
     • The fourth user meetup, "集まっtail", is scheduled for 3/17
  5. What I'll talk about today
     • I built a system that visualizes teratail's user behavior logs with the EFK stack (Elasticsearch, Fluentd, Kibana)
     • It's fun!
     • It's painful!!
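  6. We want to see what users are doing!
     1. Monitor in real time to detect risks, and casually follow recent user activity.
     2. Hold the data in a form optimized for looking at KPIs, and visualize it deeply and quickly.
     (Also, writing HiveQL is tedious and very slow, so we wanted something faster...)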

  7. The platform we built
     [Architecture diagram: Amazon S3, Amazon Redshift. Arrows show the flow of logs. Paths: 1. real-time visualization; 2. deep visualization.]

  8. A little more detail
     [Architecture diagram: two nodes labeled node.master: false / node.data: false and two nodes labeled node.master: true / node.data: true; Amazon Redshift; Amazon S3; batch processing; teratail staff. Labels: near-real-time visualization; deep KPI visualization.]
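  9. What is Fluentd?
     • A tool that parses and aggregates logs
     • Made by Treasure Data (fairly popular in Japan)
     • Often compared with Logstash
     • Its buffering is smart: stopping it for about 5 minutes causes no problems at all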
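The buffering point above can be pictured with a toy model. This is a minimal sketch, not Fluentd's actual implementation: `ToyBuffer`, `flaky_sink`, and `state` are hypothetical names made up for illustration. Records queue up while the destination is unavailable and are delivered in order once it is back.

```python
from collections import deque


class ToyBuffer:
    """Toy stand-in for an output buffer (hypothetical; not Fluentd code)."""

    def __init__(self, send):
        self.send = send        # callable that raises ConnectionError while the sink is down
        self.pending = deque()  # records waiting to be delivered, oldest first

    def emit(self, record):
        self.pending.append(record)
        self.flush()

    def flush(self):
        # Deliver in order; stop at the first failure and keep the rest queued.
        while self.pending:
            try:
                self.send(self.pending[0])
            except ConnectionError:
                return
            self.pending.popleft()


delivered = []
state = {"up": True}


def flaky_sink(record):
    if not state["up"]:
        raise ConnectionError("sink is down")
    delivered.append(record)


buf = ToyBuffer(flaky_sink)
buf.emit("a")
state["up"] = False   # simulate stopping the destination for a while
buf.emit("b")
buf.emit("c")
state["up"] = True    # destination is back
buf.flush()
print(delivered)
```

In this model nothing is dropped during the outage; the cost is only that the queue grows until the next successful flush.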
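  10. What is Elasticsearch?
     • A full-text search engine that has become popular recently. The 2.x series is the latest.
     • Made by Elastic (same as Logstash)
     • Easy to work with through a cleanly RESTful API
     • Put nodes on the same network and they will form a cluster for you
     • AWS recently released something called Elasticsearch Service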
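  11. What is Kibana?
     • A tool that uses Elasticsearch as its backend and visualizes that data in a very good-looking way
     • It is a node application, so installation is very easy
     • You can browse logs with a few mouse clicks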
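  12. Characteristics of the EFK stack
     • Installation and management are relatively easy
     • Fluentd can be set up with just a one-liner
     • Elasticsearch forms a cluster nicely on its own
     • Kibana is also easy to install, and you can mostly use it just by looking at it
     • It feels reasonably mature by now?
     • Once you try it, the ease of use is apparent right away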
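  13. What changed after building it?
     • Logs can now be viewed very easily, quickly, and attractively
     • Time savings
     • The whole team has picked up the habit of casually aggregating, visualizing, and analyzing logs, and it is fun to watch activity go up with every event
     • We can now empathize with the logs!
     • When a support inquiry comes in, we can trace what the troubled user did
     • The cause of a bug was identified by reproducing the user's actions from the logs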
  14. It's fun!

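  15. Other things we want to do
     • Visualize and analyze Apache error logs and access logs
     • fluentd provides a template for these, so it is very easy
     • Outputting things like response time makes it even more fun
     • Application framework error logs
     • Fluentd can handle multi-line logs too
     • Watch the slow query log and squash bad queries one after another
     Information that needs to be seen in real time becomes easy to view, so this approach applies to a lot of things.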
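  16. Pain points
     • Writing a regular expression for the custom format we had been loading into Hadoop is painful
     • About 1.3% of logs went missing => fixed
     • After adding an index template, we were told the data could not be read
     • Autoscaling was too smart and terminated the instance we were using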
  17. <source>
        @type tail
        path /home/ikuo.degawa/hogehoge.logs
        pos_file /tmp/hogehoge.logs.pos
        format /^(?<dt>[^\t]+)\t(?<site_id>[^\t]*)\t(?<action>[^\t]*)\t(?<option>[^\t]*)\t(?<user_id>[^\t]*)\t(?<session_cookie>[^\t]*)\t(?<storage_cookie>[^\t]*)\t(?<view_type>[^\t]*)\t(?<user_agent>[^\t]*)\t(?<page_id>[^\t]*)\t(?<url>[^\t]*)\t(?<time>[^\t]*)\t(?<ip>[^\t]*)\t(?<segment>[^\t]*)\t(?<var>[^\t]*)\t(?<view>[^\t]*)\t(?<act>[^\t]*)\t(?<post0>[^\u0001]*)\u0001(?<post1>[^\u0001]*)\u0001(?<post2>[^\t]*)\t(?<search0>[^\u0001]*)\u0001(?<search1>[^\u0001]*)\u0001(?<search2>[^\u0001]*)\u0001(?<search3>[^\u0001]*)\u0001(?<search4>[^\u0001]*)\u0001(?<search5>[^\u0001]*)\u0001(?<search6>[^\u0001]*)\u0001(?<search7>[^\t]*)\t(?<user0>[^\u0001]*)\u0001(?<user1>[^\u0001]*)\u0001(?<user2>[^\u0001]*)\u0001(?<user3>[^\t]*)\t(?<other0>[^\u0001]*)\u0001(?<other1>[^\u0001]*)\u0001(?<other2>.*)$/
        tag mogmog-logs.gerogero
      </source>
      Writing a regular expression for the custom format we had been loading into Hadoop is painful.
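To make the shape of that expression easier to see, here is a minimal sketch of the same idea on a much smaller layout. The three-column layout and the sample line are hypothetical, chosen only for illustration; the real log has many more columns. Note that Python spells named groups `(?P<name>...)`, whereas the Fluentd configuration uses Ruby's `(?<name>...)`.

```python
import re

# Hypothetical three-column subset of the log layout: tab-separated columns,
# where the last column packs sub-fields separated by U+0001.
LINE_RE = re.compile(
    r"^(?P<dt>[^\t]+)\t(?P<action>[^\t]*)\t"
    r"(?P<post0>[^\x01]*)\x01(?P<post1>[^\x01]*)\x01(?P<post2>.*)$"
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    m = LINE_RE.match(line)
    return m.groupdict() if m else None


sample = "2016-03-04 12:00:00\tview\ta\x01b\x01c"  # made-up sample line
record = parse_line(sample)
print(record)
```

Every column needs its own named group and its own delimiter, which is why the expression grows so long for a wide custom format.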
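  18. About 1.3% of logs went missing => fixed
     • The count in Kibana and the result of cat hoge.log | wc -l were different!!
     • Discovered that, by design, reading of the new file after rotation starts late
     • Using read_from_head fixed it
     [Diagram: previous file, next file, with a marker "reading started from around here"]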
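The effect described on this slide can be illustrated with a toy tailer. This is a simplified model under stated assumptions, not Fluentd's `in_tail` code: `tail_new_file` is a hypothetical helper, and the only thing it models is whether a newly discovered file is read from its start or from its current end.

```python
import os
import tempfile


def tail_new_file(path, read_from_head):
    """Toy tailer: open a newly discovered file and return the lines it would read.

    With read_from_head=False it seeks to the current end first, so anything
    written before the file was discovered is skipped.
    """
    with open(path, "r", encoding="utf-8") as f:
        if not read_from_head:
            f.seek(0, os.SEEK_END)
        return f.read().splitlines()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "hoge.log")
    # Lines written to the fresh file before the tailer notices the rotation.
    with open(path, "w", encoding="utf-8") as f:
        f.write("line1\nline2\n")

    missed = tail_new_file(path, read_from_head=False)
    caught = tail_new_file(path, read_from_head=True)

print(missed, caught)
```

In this model the lines written between rotation and discovery are exactly the ones that go missing, which matches the small, steady loss the slide reports.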
  19. After adding an index template, we were told the data could not be read
     • index template: lets you specify the mapping used when data goes into elasticsearch
     • Types can be decided based on the index name
     {
       "templates": "awesomelog-*",
       "settings": { "number_of_shards" : 1 },
       "mappings": {
         "awesomelogs" : {
           "properties" : {
             "@timestamp" : { "type" : "date", "format" : "strict_date_optional_time||epoch_millis" },
             "act0" : { "type" : "integer" },
             "act1" : { "type" : "integer" },
             "act10" : { "type" : "string", "index": "not_analyzed" },
             "act11" : { "type" : "string" },
             "act2" : { "type" : "integer" },
             "act3" : { "type" : "integer" },
             "act4" : { "type" : "string" },
             "act5" : { "type": "multi_field", "fields": {
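The idea of choosing types by index name can be sketched as a simple pattern lookup. This is only an illustration of the concept, not how Elasticsearch implements template matching: `TEMPLATES` and `mapping_for` are hypothetical names, and the field types are copied from the excerpt on the slide.

```python
from fnmatch import fnmatchcase

# Hypothetical templates keyed by index-name pattern; field names and types
# mirror the excerpt on the slide.
TEMPLATES = {
    "awesomelog-*": {"act0": "integer", "act10": "string"},
}


def mapping_for(index_name):
    """Return the field types of the first template whose pattern matches, else None."""
    for pattern, fields in TEMPLATES.items():
        if fnmatchcase(index_name, pattern):
            return fields
    return None


print(mapping_for("awesomelog-2016.03.04"))
print(mapping_for("otherlog-2016.03.04"))
```

A daily index such as `awesomelog-2016.03.04` picks up the declared types automatically, while an index whose name does not match the pattern gets no template.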
  20. I thought this would improve performance, but...
     [Same index template excerpt as on slide 19]
     • In reality, strings were arriving in places where an int was expected (a bug in the logging implementation)
     • Errors were being thrown when logs were inserted, and the records kept piling up in fluentd's buffer
     • In the end, we only added not_analyzed
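A mismatch like this can be caught before records are shipped. The following is a minimal sketch of such a check, assuming records arrive as dictionaries; `EXPECTED`, `type_errors`, and the sample records are hypothetical and not part of the original setup.

```python
# Expected types for a few fields, taken from the slide's mapping excerpt.
EXPECTED = {"act0": int, "act1": int}


def type_errors(record):
    """Return (field, value) pairs whose value cannot be read as the expected int."""
    errors = []
    for field, expected in EXPECTED.items():
        if field not in record:
            continue
        try:
            expected(record[field])
        except (TypeError, ValueError):
            errors.append((field, record[field]))
    return errors


good = {"act0": "12", "act1": 3}        # made-up records
bad = {"act0": "top_page", "act1": 3}
print(type_errors(good))
print(type_errors(bad))
```

Running a check like this against a sample of real logs before applying a strict mapping would surface the offending fields without filling the buffer with rejected records.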
  21. Autoscaling was too smart and terminated our instance!?

  22. "An availability zone was added, so to balance things out and improve availability, I'm going to delete this one and launch the next one!"

  23. Lessons learned...
     • Fluentd needs very little looking after, but know how it reads logs
     • Elasticsearch is best kept elastic
     • Auto Scaling Groups are smart

  24. Summary
     • When logs look good in Kibana, you start to empathize with the data, and everyone on the team becomes someone who can see what users are doing
     • It's fun

  25. We are hiring engineers. Thank you for listening.

  26. This book helped me a lot
     • It's a good book

  27. @ikuwow