Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
teratailの解析基盤をEFKで作っていろいろ楽しい話
Search
ikuwow
March 04, 2016
Technology
0
880
teratailの解析基盤をEFKで作っていろいろ楽しい話
teratailの解析基盤をEFKで作っていろいろ楽しい話 @ ゆとりエンジニア交流会
ikuwow
March 04, 2016
Tweet
Share
More Decks by ikuwow
See All by ikuwow
Elasticsearch on EC2からAmazon Elasticsearch Serviceに 移行してだいぶ楽になった話
ikuwow
0
3.5k
意外と使える! Alibaba Cloud
ikuwow
0
240
UNIXという考え方
ikuwow
1
2k
技術書紹介 パーフェクトPHP
ikuwow
0
2.1k
みんなもMiddlemanで技術ブログ作って幸せになろう!
ikuwow
0
980
PHPサイバーテロの技法 書籍紹介
ikuwow
0
960
Other Decks in Technology
See All in Technology
ESXi のAIOps だ!2025冬
unnowataru
0
470
Introduction to Sansan, inc / Sansan Global Development Center, Inc.
sansan33
PRO
0
2.9k
_第4回__AIxIoTビジネス共創ラボ紹介資料_20251203.pdf
iotcomjpadmin
0
170
AI との良い付き合い方を僕らは誰も知らない (WSS 2026 静岡版)
asei
1
210
なぜ あなたはそんなに re:Invent に行くのか?
miu_crescent
PRO
0
250
わが10年の叡智をぶつけたカオスなクラウドインフラが、なくなるということ。
sogaoh
PRO
1
190
会社紹介資料 / Sansan Company Profile
sansan33
PRO
11
390k
AI駆動開発ライフサイクル(AI-DLC)の始め方
ryansbcho79
0
290
Oracle Database@Google Cloud:サービス概要のご紹介
oracle4engineer
PRO
1
820
人工知能のための哲学塾 ニューロフィロソフィ篇 第零夜 「ニューロフィロソフィとは何か?」
miyayou
0
330
自己管理型チームと個人のセルフマネジメント 〜モチベーション編〜
kakehashi
PRO
5
1.6k
Introduction to Bill One Development Engineer
sansan33
PRO
0
340
Featured
See All Featured
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
162
16k
Being A Developer After 40
akosma
91
590k
The Anti-SEO Checklist Checklist. Pubcon Cyber Week
ryanjones
0
35
The AI Revolution Will Not Be Monopolized: How open-source beats economies of scale, even for LLMs
inesmontani
PRO
3
2.8k
Product Roadmaps are Hard
iamctodd
PRO
55
12k
BBQ
matthewcrist
89
9.9k
Facilitating Awesome Meetings
lara
57
6.7k
How to Grow Your eCommerce with AI & Automation
katarinadahlin
PRO
0
84
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
35
2.3k
We Are The Robots
honzajavorek
0
130
"I'm Feeling Lucky" - Building Great Search Experiences for Today's Users (#IAC19)
danielanewman
231
22k
Future Trends and Review - Lecture 12 - Web Technologies (1019888BNR)
signer
PRO
0
3.1k
Transcript
teratailͷղੳج൫Λ EFKͰ࡞ͬͯ ͍Ζ͍Ζָ͍͠ @ikuwow ϨόϨδʔζגࣜձࣾɹςΫϊϩδʔϝσΟΞϥϘ ΏͱΓੈΤϯδχΞަྲྀձʢ2016/03/04ʣ
ࣗݾհ • ϨόϨδʔζגࣜձࣾɺςΫϊϩδʔϝ σΟΞϥϘɺteratailͷ։ൃͯ͠Δਓɻ • ֶੜͷ࣌εϩʔΨϯגࣜձࣾͰ1.5͙ Β͍Πϯλʔϯͯͨ͠ • ίʔυॻ͘ͱ͖PHPͰ͕͢ɺϑϩϯτ ΠϯϑϥͬͨΓ͍Ζ͍ΖΓ·
͢ • ࠷ۙͬͨ͜ͱɿteratailͷϩάղੳج ൫࡞Δ @ikuwow
teratail ͬͯΔਓʙʁ
teratail • ΤϯδχΞɾϓϩάϥ ϚͷͨΊͷQ&AαΠτ • ຖ࣭͕70-80݅ • ճ93% • 3/17ʹϢʔβʔձʮू
·ͬtailʯୈ࢛ճ։࠵༧ ఆ
ࠓ͢͜ͱ • teratailͷϢʔβʔߦಈϩάΛEFKελοΫ (Elasticsearch, Fluentd, KibanaʣͰՄࢹԽ͢ ΔΈ࡞ͬͨ • ָ͍͠ʂ •
ਏ͍ʂʂ
ϢʔβʔͷߦಈΛݟ͍ͨʂ 1. ϦΞϧλΠϜʹࢹͯ͠ϦεΫݕͨ͠Γɺ ΧδϡΞϧʹ࠷ۙͷϢʔβʔͷಈ͖Λͬͨ Γ͍ͨ͠ʂ 2. KPIΛݟΔͷʹ࠷దԽͨ͠ܗͰσʔλΛ࣋ͬ ͯਂ͘ૣ͘ՄࢹԽ͍ͨ͠ ʢ͋ͱHiveQLॻ͘ͷΊΜͲ͍͍͔͘͢͠͝Βૣ͍ͷʹ͍ͨ͠ɾɾɾʣ
࡞ͬͨج൫ Amazon S3 Amazon Redshift ҹϩάͷྲྀΕ 1. ϦΞϧλΠϜՄࢹԽ 2. ਂ͘ՄࢹԽ
͏গ͚ͩ͠ৄ͘͠ node.master: false node.data: false node.master: true node.data: true node.master:
false node.data: false node.master: true node.data: true Amazon Redshift Amazon S3 teratailͷதͷਓ ४ϦΞϧλΠϜՄࢹԽ KPIΛਂ͘ՄࢹԽ όονॲཧ
Fluentdͱ • ϩάͷύʔεɺूΛ͢Δπʔ ϧ • TreasuredataʢຊͰΘΓͱ ਓؾʣ • Α͘Logstashͱൺֱ͞ΕΔ •
όοϑΝϦϯάݡͯ͘ɺ5͙ Β͍ࢭΊͯશ͘ͳ͍
Elasticsearchͱ • ࠷ۙྲྀߦΓͷશจݕࡧΤϯδϯɻ2ܥ ͕࠷৽ɻ • ElasticࣾʢLogstashͱಉ͡ʣ • ͖Ε͍ʹRESTfulͳAPIͰѻ͍͍͢ • ͱΓ͋͑ͣಉ͡ωοτϫʔΫʹஔ͍
͓͚ͯΫϥελ࡞ͬͯ͘ΕΔ • ࠷ۙAWS͕Elasticsearch Serviceͱ ͍͏ͷΛग़ͨ͠Γ
Kibanaͱ • ElasticsearchΛόοΫͱ͠ ͯɺͦΕΒͷσʔλΛ͔ͬ ͜ྑ͘ՄࢹԽ͢Δπʔϧ • nodeΞϓϦέʔγϣϯͳͷ Ͱಋೖָ͕͘͢͝ • ϚεϙνϙνͰϩά͕ݟΒ
ΕΔ
EFKελοΫͷಛ • Πϯετʔϧཧ͕ൺֱతΧϯλϯ • FluentdϫϯϥΠφʔ͚ͩͰ͍͚Δ • Elasticsearchউखʹ͏·͍͜ͱΫϥελ࡞ͬͯ͘ΕΔ • KibanaೖΕΔͷ؆୯ͩ͠ݟͨΒ͍͍ͩͨ͑Δ •
ͦͦ͜͜ރΕ͖ͯͨײ͋Δʁ • ࢼͯ͠ΈΔͱ͍͕͙͢͢͞
࡞ͬͯԿ͕มΘ͔ͬͨʁ • ϩά͕؆୯ʹૣ͔ͬ͘͜Α͘ݟΒΕΔ༷ʹͳͬͨ • ࣌ؒͷॖ • νʔϜશһʹɺ͍ܰؾ࣋ͪͰ͍͍͢͢ϩάΛूܭɾՄࢹԽɾੳ͢ Δश׳͕͍ͭͯɺΠϕϯτࣄͷͨͼʹߦಈྔ૿͑ͨΓ͢Δͷ͕Έͯ ָ͍͠ •
ϩάʹײҠೖͰ͖ΔΑ͏ʹͳͬͨʂ • ͓͍߹Θͤ࣌ʹࠔͬͯΔϢʔβʔͷߦಈΛ͑ΔΑ͏ʹͳͬͨ • όάͷݪҼ͕ɺϩά͔ΒϢʔβʔͷಈ͖Λ࠶ݱͯ͠ΈͨΒ໌ͨ͠
ָ͍͠ʂ
΄͔ʹΓ͍ͨ͜ͱ • ApacheͷΤϥʔϩάɺΞΫηεϩάͷՄࢹԽɾੳ • fluentdͰTemplate͕༻ҙ͞Ε͍ͯΔͷͰ؆୯ʹͰ͖Δ • ϨεϙϯελΠϜͱ͔ग़͓ͯ͘͠ͱͬͱָ͍͠ • ΞϓϦέʔγϣϯϑϨʔϜϫʔΫͷΤϥʔϩά •
Fluentdෳߦϩά͍͚Δ • slow queryͷϩάݟͯΨϯΨϯѱ͍ΫΤϦΛ௵͢ ϦΞϧλΠϜੑ͕ٻΊΒΕΔใΛݟ͍͔͢Β͘͢͝Ԡ༻ར͘
ਏ͔ͬͨ͜ͱ • HadoopʹೖΕ͍ͯͨಠࣗͷϑΥʔϚοτΛਖ਼ن දݱͰॻ͘ͷͭΒ͍ • ϩά͕1.3%͙Β͍ܽଛ͢Δ => ࣏ͬͨ • Index
template͚ͭͨΒಡΊͳ͍ͬͯݴΘΕΔ • Autoscaling͕ݡ͗ͯͬͯͨ͢ͷterminate͞Εͨ
<source> @type tail path /home/ikuo.degawa/hogehoge.logs pos_file /tmp/hogehoge.logs.pos format /^(?<dt>[^\t]+)\t(?<site_id>[^\t]*)\t(?<action>[^\t]*)\t(? <option>[^\t]*)\t(?<user_id>[^\t]*)\t(?<session_cookie>[^\t]*)\t(?
<storage_cookie>[^\t]*)\t(?<view_type>[^\t]*)\t(?<user_agent>[^\t]*)\t(? <page_id>[^\t]*)\t(?<url>[^\t]*)\t(?<time>[^\t]*)\t(?<ip>[^\t]*)\t(? <segment>[^\t]*)\t(?<var>[^\t]*)\t(?<view>[^\t]*)\t(?<act>[^\t]*)\t(?<post0>[^ \u0001]*)\u0001(?<post1>[^\u0001]*)\u0001(?<post2>[^\t]*)\t(?<search0>[^ \u0001]*)\u0001(?<search1>[^\u0001]*)\u0001(?<search2>[^\u0001]*)\u0001(? <search3>[^\u0001]*)\u0001(?<search4>[^\u0001]*)\u0001(?<search5>[^\u0001]*) \u0001(?<search6>[^\u0001]*)\u0001(?<search7>[^\t]*)\t(?<user0>[^\u0001]*) \u0001(?<user1>[^\u0001]*)\u0001(?<user2>[^\u0001]*)\u0001(?<user3>[^\t]*)\t(? <other0>[^\u0001]*)\u0001(?<other1>[^\u0001]*)\u0001(?<other2>.*)$/ tag mogmog-logs.gerogero </source> HadoopʹೖΕ͍ͯͨಠࣗͷϑΥʔ ϚοτΛਖ਼نදݱͰॻ͘ͷͭΒ͍
ϩά͕1.3%͙Β͍ܽଛ͢Δ => ࣏ͬͨ • Kibanaͷ݅ͱɺcat hoge.log | wc -l ͨ݁͠Ռ
͕ҧ͏ʂʂ • lotateͨ͠ઌͷϑΝΠϧΛ ಡΈ࢝ΊΔλΠϛϯά͕ ͍ͱ͍͏༷Λൃݟ • read_from_headΛͬͨ Β࣏ͬͨ લͷ ࣍ͷ ͜ͷล͔ΒಡΜͰͨ
Index template͚ͭͨΒಡΊͳ ͍ͬͯݴΘΕΔ • index template: elasticsearchʹೖΔ ࣌ͷmappingΛࢦ ఆͰ͖Δ •
index໊Λ݅ʹܕ ΛܾΊΒΕΔ { "templates": “awesomelog-*", "settings": { "number_of_shards" : 1 }, "mappings": { "awesomelogs" : { "properties" : { "@timestamp" : { "type" : "date", "format" : "strict_date_optional_time||epoch_millis" }, "act0" : { "type" : "integer" }, "act1" : { "type" : "integer" }, "act10" : { "type" : "string", "index": "not_analyzed" }, "act11" : { "type" : "string" }, "act2" : { "type" : "integer" }, "act3" : { "type" : "integer" }, "act4" : { "type" : "string" }, "act5" : { "type": "multi_field", "fields": {
ύϑΥʔϚϯε্͕Δͱࢥͬͨ Βɾɾɾ { "templates": “awesomelog-*", "settings": { "number_of_shards" : 1
}, "mappings": { "awesomelogs" : { "properties" : { "@timestamp" : { "type" : "date", "format" : "strict_date_optional_time||epoch_millis" }, "act0" : { "type" : "integer" }, "act1" : { "type" : "integer" }, "act10" : { "type" : "string", "index": "not_analyzed" }, "act11" : { "type" : "string" }, "act2" : { "type" : "integer" }, "act3" : { "type" : "integer" }, "act4" : { "type" : "string" }, "act5" : { "type": "multi_field", "fields": { • ࣮intΛظ͍ͯ͠Δͱ͜ ΖʹstringඈΜͰ͖ͨΓ͠ ͯͨʢϩάͷ࣮ϛεʣ • ϩά͕ೖͬͨͱ͖ʹΤϥʔ ు͍ͯͯɺfluentdͷόο ϑΝʹཷ·Γଓ͚ͯͨ • ݁ہnot_analyzedΛ͚ͭͨ ͷΈ
Autoscaling͕ݡ͗ͯ͢terminate ͞Εͨ ʂʁ
ʮavailability zone͕Ճ͞Ε͔ͨΒɺόϥϯε Αͯ͘͠Մ༻ੑ͋͛ΔͨΊʹ͍ͬ͜ফͯ࣍͠ͷ ݐͯΔΑʂʯ
ڭ܇ɾɾɾ • Fluentd͓ੈগͳͯ͘ࡁΉ͕ɺϩάͷಡΈ ํΛͬͱ͚ • ElasticsearchElasticʹ͓͍ͯͨ͠΄͏͕͍͍ • Auto Scaling Groupݡ͍
·ͱΊ • KibanaͰϩάΛ͔ͬ͜Α͘ݟΒΕΔͱσʔλ ʹײҠೖͰ͖ΔΑ͏ʹͳΓɺνʔϜશһ͕ ϢʔβʔͷߦಈΛݟΒΕΔਓʹͳΕΔ • ָ͍͠
ฐࣾͰΤϯδχΞΛืूதͰ͢ ͝ਗ਼ௌ͋Γ͕ͱ͏͍͟͝·ͨ͠
͜ͷຊʹ͓ੈʹͳΓ·ͨ͠ • ͍͍ຊͰ͢
@ikuwow