Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
OSC-Hokkaido-2018-hayabusa
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
Hiroshi
July 07, 2018
Research
720
0
Share
Embed
Copy iframe code
Copy JS code
Copy link
Start on current slide
OSC-Hokkaido-2018-hayabusa
This is the presentation material for OSC Hokkaido 2018
Hiroshi
July 07, 2018
More Decks by Hiroshi
See All by Hiroshi
pepacon night : log research working group report
hirolovesbeer
0
1.5k
イベントネットワークにおけるsyslog分析でのElasticsearchの利用
hirolovesbeer
1
1.3k
Other Decks in Research
See All in Research
Dual Quadric表現を用いた動的物体追跡とRGB-D・IMU制約の密結合によるオドメトリ推定
nanoshimarobot
0
400
SAKURAONE:An Open Ethernet-based AI HPC System And Its Observed Workload Dynamicsin a Single-Tenant LLM Development Environment
yuukit
1
310
セマンティック通信勉強会 6Gに向けたデバイス間効率的な通信の技術紹介・課題・今後展望
satai
3
160
Scalable dynamic origin-destination demand estimation enhanced by high-resolution satellite imagery data
satai
3
260
Anthropic が提案する LLM の内部状態を自然言語で説明可能にした Natural Language Autoencoders / Natural Language Autoencoders Produce Unsupervised Explanations of LLM Activations
shunk031
0
120
世界モデルにおける分布外データ対応の方法論
koukyo1994
7
2.2k
Ankylosing Spondylitis
ankh2054
0
170
多様なデータを許容し学習し続ける模倣学習 / Advanced Imitation Learning for VLA
prinlab
0
210
Claude Code × autoresearch 実践
mathbullet
0
150
Fukui Shibiten 39 - AI Art
butchi
0
120
Can We Teach Logical Reasoning to LLMs? – An Approach Using Synthetic Corpora (AAAI 2026 bridge keynote)
morishtr
1
250
第12回人と環境にやさしい交通をめざす全国大会/熊本都市圏「車1割削減、渋滞半減、公共交通2倍」をめざして
trafficbrain
0
110
Featured
See All Featured
How to Talk to Developers About Accessibility
jct
2
220
ReactJS: Keep Simple. Everything can be a component!
pedronauck
666
130k
Test your architecture with Archunit
thirion
1
2.3k
The Cult of Friendly URLs
andyhume
79
6.9k
Efficient Content Optimization with Google Search Console & Apps Script
katarinadahlin
PRO
1
600
Conquering PDFs: document understanding beyond plain text
inesmontani
PRO
4
2.8k
Marketing to machines
jonoalderson
1
5.4k
End of SEO as We Know It (SMX Advanced Version)
ipullrank
3
4.2k
It's Worth the Effort
3n
188
29k
How to train your dragon (web standard)
notwaldorf
97
6.7k
WCS-LA-2024
lcolladotor
0
620
世界の人気アプリ100個を分析して見えたペイウォール設計の心得
akihiro_kokubo
PRO
71
40k
Transcript
Hayabusa ߴʹશจݕࡧՄೳͳ OSSϩάݕࡧΤϯδϯͷ͝հ Ѩ෦ തɿגࣜձࣾϨϐμϜ ݚڀһ OSC 2018 Hokkaido 2018/07/08
ࣗݾհ • ໊લɿѨ෦ ത • ॴଐɿגࣜձࣾϨϐμϜʢݚڀһʣɺίίϯגࣜձࣾʢࣾิࠤ/ٕज़ݚڀ ॴ ݚڀһʣɺใ௨৴ݚڀػߏʢڠྗݚڀһʣɺઌՊֶٕज़େֶ Ӄେֶʢത࢜ޙظ՝ఔʣ •
ͦͷଞɿInterop Tokyo ShowNet NOCϝϯόʔ
࣍ • എܠͱత • Hayabusaʹ͍ͭͯ • ࢄHayabusaͷఏҊʢઃܭͱ࣮ʣ • ධՁ •
ߟ • ·ͱΊͱࠓޙͷ՝ !3
എܠͱత !4
Interop Tokyo ShowNet 2018 • 900Λ͑ΔཧɾԾػث܈ • ΄΅શͯͷػث͕syslogΛૹ৴ • ߏஙظؒʹड৴͢Δsyslogྔ
• 2ສ݅/ඵʢ20k/secʣ • 1ԯ̓ઍສ݅/ !5
ShowNetʹ͓͚Δϩάͷӡ༻ • େྔͷϩάΛੵ͢Δ • େྔͷϩά͔Βݕࡧ͢Δ • ΠϯγσϯτରԠͷͨΊʹϩάΛݕࡧ͢Δ • τϥϒϧγϡʔτͷͨΊʹϩάΛݕࡧ͢Δ •
ϩά͔Β౷ܭใΛऔಘ͢Δ • ߜΓࠐΜͩݕࡧใΛ౷ܭใͱͯ͠දࣔ͢Δ !6
طଘͷղܾࡦ • HadoopΤίγεςϜʢSpark, Impala, Hive, …ʣ • OSSʢElasticsearch + Kibana,
fluentd, …ʣ • ༻ϓϩμΫτʢSplunk, VMware Loginsight, …ʣ • ΫϥυαʔϏεʢGoogle BigQuery, Treasure Data, …ʣ !7
େ͖ͳ • ϩάͷߏԽ͕Ͱ͖ͳ͍ • ػࡐʹ౷Ұੑ͕ͳ͍ɾ࠷৽ͷϑΝʔϜ͗ͯ͢ใ͕ͳ͍ • ετϦʔϛϯάॲཧ͕͍͠ྲྀྔ • ϩάͷྲྀྔ͕ଟ͗ͯ͢ॲཧ͕͍͔ͭͳ͍ •
όονॲཧ͕͍͔ͭͳ͍ • όονॲཧ͕ࢦఆ࣌ؒʹऴΘΒͳ͍ • ࢄॲཧγεςϜ͕ෳࡶ͗͢Δ • ཧίετ͕ലେ !8
త • ܰྔʹߏஙɾӡ༻͕ߦ͑ΔγεςϜͷ࣮ݱ • γϯϓϧͰεέʔϧΞοϓՄೳͳγεςϜͷ࣮ݱ • ݕࡧੑೳ͕CPUʢίΞʣੑೳʹൺྫͯ͠ૣ͘ͳΔ • ෳࡶͳཧػߏΛඋ͑ͳ͍ !9
)BZBCVTBʹ͍ͭͯ !10
Hayabusaͱʁ • େྔͷϩάΛߴʹݕࡧ͢Δʢ17ԯϨίʔυͷશจݕࡧ͕5ඵʣ • ελϯυΞϩϯαʔόͰಈ࡞͢Δ • ϚϧνίΞΛ༗ޮʹ͍ɺߴͳฒྻݕࡧॲཧΛ࣮ݱ͢Δ
StoreEngine • σΟεΫʹॻ͖ࠐ·ΕͨϩάΛߴʹಡΈࠐΉ • ಡΈࠐΜͩϩάΛSQLite3ͷϑΝΠϧͱมʢ1ߦ1Ϩίʔυʣ • SQLite3ͷશจݕࡧʹಛԽͨ͠FTS(Full Text Search)ܗࣜͰinsert •
࣌ؒσΟϨΫτϦߏʹରԠ : /targetdir/yyyy/mm/dd/hh/min.db StoreEngine
SearchEngine • GNU ParallelΛ༻͍ͯSQLite3ϑΝΠϧฒྻݕࡧΛ͔͚Δ $ parallel sqlite3 ::: target files
::: “select count(*) from xxx where logs match ‘keyword’;” • ݕࡧ݁ՌΛUNIXύΠϓϥΠϯΛ༻͍ͯɺawkcountίϚϯυͰूܭ $ parallel sqlite3 ::: target files ::: “select count(*) from xxx where logs match ‘keyword’;” | awk ‘{m+=$1} END{print m;}’ SeachEngine !13
શจݕࡧੑೳ • Apache SparkͱͷൺֱʢελϯυΞϩϯڥʣ • Apache SparkͱͷൺֱʢSpark x 3 +
HDFS vs Hayabusa x 1ʣ Hayabusa͕ ̐ഒߴ Hayabusa͕ 27ഒߴ
OSSͱͯ͠ެ։ • GitHubʹͯެ։ • https://github.com/hirolovesbeer/hayabusa !15
Hayabusaͷ • ελϯυΞϩϯڥ • ੑೳΛ্͛ΔʹεέʔϧΞοϓ͔͠ͳ͍ • εέʔϧΞοϓίετ • ࢄॲཧγεςϜͱͷࠩ •
ن͕େ͖͘ͳΕࢄॲཧγεςϜͷॲཧ͘ͳΔ • Hayabusa͍͔ͭੑೳ͕ൈ͔ΕΔ !16
ࢄ)BZBCVTBͷఏҊʢઃܭͱ࣮ʣ !17
త • HayabusaΛࢄॲཧγεςϜͱਐԽͤ͞ॲཧΛεέʔϧΞτͤ͞Δ • ελϯυΞϩϯͷੑೳੜ͔͠ଓ͚Δ • ࢄॲཧγεςϜͰ͋Δ͕γϯϓϧͳઃܭΛࢤ͢ • σʔλΛෳ͢Δ͜ͱͰোੑΛߴΊΔ !18
GNU ParallelͷϦϞʔτ࣮ߦ • ཧ : GNU ParallelͷϦϞʔτ࣮ߦΛར༻͢Εࢄ࣮ߦՄೳ $ time parallel
—controlmaster -S host1,host2,host3 sqlite3 ::: … • ݱ࣮ : sshͷΦʔόϔου͕͔͔Γॲཧ͕Ԇ ϗετ͕૿͑Δͱॲཧ͕࣌ؒ૿͑Δ
ఏҊख๏ • ࢄݕࡧ • ࣮ߦ͢ΔݕࡧॲཧΛRPCͱͯ͠HayabusaૹΓࠐΉ • ݁ՌΛRPCͷϨεϙϯεͱͯ͠ड͚औΓूܭ͢Δ • ฒྻੵ •
શͯͷϗετಉҰͷϦΫΤετ͕ಧ͍ͯಉ݁͡ՌΛฦ͢Α͏ʹ͢Δ • ࣄલʹશॲཧϗετͱϩάσʔλΛෳ͢Δ !20
ࢄHayabusaΞʔΩςΫνϟશ༰
ฒྻੵ • syslogΛෳϗετͱෳ͢Δ • શϗετͰಉҰͷsyslogΛड৴ • UDP SamplicatorʢOSSʣͷར༻ • syslogύέοτͷෳͱసૹ
• ෳॲཧͷίΞεέʔϧԽ • UDP SmaplicatorͷϚϧνϓϩηεԽ !22 syslogͷෳ
UDP SamplicatorͷϚϧνϓϩηεԽ • ϘτϧωοΫʹͳΓ͕ͪͳϓϩηεΛίΞεέʔϧ • SO_REUSEPORTΛར༻ͨ͠ϚϧνϓϩηεԽ • ͜ΕʹΑΓUDP 514ϙʔτ͕ෳϓϩηεͰγΣΞ͞ΕΔ socketΦϓγϣϯͷՃ
ۉʹsyslogసૹͷෛՙ͕ όϥϯε͞ΕΔ !23
ࢄݕࡧ • RPC • Producer / ConsumerϞσϧͷ࠾༻ • ࣮ •
ZeroMQͷPush / Pullύλʔϯ • ϦΫΤετͷϩʔυόϥϯε • Push / PullύλʔϯۉҰʹϦΫΤετΛϗετ͢Δ ZeroMQͷPush / Pullύλʔϯ !24
ࢄݕࡧ • ZeroMQΫϥΠΞϯτ • VentilatorͱSinkͷׂ • ZeroMQϫʔΧ • ड͚औͬͨॲཧϦΫΤετ Λ࣮ߦͯ݁͠ՌΛฦ͢
!25
ॲཧϦΫΤετ • ϦΫΤετ $ parallel sqlite3 ::: target files :::
“select count(*) from xxx where logs match ‘keyword’;” | awk ‘{m+=$1} END{print m;}’ ੨ࣈ : GNU ParallelͷίϚϯυΛ֤ॲཧϗετૹΓࠐΉ ࣈ : ΫϥΠΞϯτϗετͰ·ͱΊ͋͛Δ !26
΄΅ຊͳٙࣅίʔυ • ΫϥΠΞϯτ • Worker ࣮ߦίϚϯυ ίϚϯυΛ ϫʔΧૹ৴ ίϚϯυΛ࣮ߦ ݁ՌΛΫϥΠΞϯτૹ৴
݁ՌΛड͚λʔϛφϧදࣔ !27
ධՁ !28
࣮ݧڥ • Amazon Web Service (AWS) • EC2Πϯελϯε : c4.4xlarge
• vCPU : Xeon E5-2666 v3 @ 2.90GHz x 16 cores • ϝϞϦ : 30GB • σΟεΫʢEBSʣ : SSD 8GB (OS) + SSD 50GB (Data) • OS : Ubuntu 16.04.3 LTS (Xenial Xerus) !29
ࢄݕࡧ • ݕࡧͷ݅ • 1ͷσʔλʹରͯ͠100ճϦΫΤετΛ࣮ߦ͢Δ • 1ͷσʔλϑΝΠϧ60ʢ60ϑΝΠϧʣ x 24࣌ؒ =
1,440ϑΝΠϧ • 1ϑΝΠϧ͋ͨΓͷϨίʔυ10ສ݅ʢ1,440 x 10ສʹ1ԯ4400ສϨίʔυʣ • 100ճͷϦΫΤετͰ144ԯϨίʔυ͕ରͱͳΔ • ࣮ߦ͢ΔSQLจҎԼͰશจݕࡧͱΧϯτ • select count(*) from syslog where logs match ‘keyword’; !30
ࢄݕࡧʢϗετεέʔϧΞτʣ • ϗετΛ1͔Β10૿Ճͤ͞Δ • 1Ͱ249ඵ͔Β10Ͱ39ඵ·Ͱॖʢ10ճࢼߦฏۉʣ
ࢄݕࡧʢϗετεέʔϧΞτʣ • ϗετΛ1͔Β10૿Ճͤ͞Δ • 1Ͱ249ඵ͔Β10Ͱ39ඵ·Ͱॖʢ10ճࢼߦฏۉʣ Ϋϥυڥෆ҆ఆ ʢϕετΤϑΥʔτʣ
ࢄݕࡧʢWorkerεέʔϧΞτʣ • ϗετ10ɺ͔ͭ1͋ͨΓͷϫʔΧΛ1͔Β16·Ͱ૿Ճͤ͞Δ • 1ϗετ1 worker 249ඵ͔Β10ϗετ10 workerͰ6.8ඵ·Ͱॖ ͜ͷลΓ͕࠷ *0ڝ߹͕ى͖Δ͔Β͔
͔ΘΒͣ
݁Ռͷ·ͱΊ • ॲཧੑೳ • ϗετ10ͷ߹ : ϗετ1ͷ10ഒૣ͘ͳΔʢ249ඵ -> 39ඵʣ •
ϗετ10ͰϫʔΧΛ૿Ճ : ૯ϫʔΧ10ʙ160Ͱ 249ඵ -> 6.8ඵ • ϨίʔυΛϑϧεΩϟϯˍશจݕࡧͨ݁͠Ռ • 144ԯϨίʔυ͔ΒඞཁͳσʔλΛൈ͖ग़͢ͷʹ6.8ඵ·ͰߴԽ • 10ͷϗετͰ36ഒͷߴԽΛ࣮ݱ !34
Amazon Elastic MapReduceͱͷൺֱ • Amazon EMR : ΠϯελϯεHayabusaͱಉ͡c4.4xlarge • ߏ1Ϛελʔϊʔυ
+ 10 ίΞϊʔυ • σʔλͷΞΫηε • EMR͔ΒAmazon S3μΠϨΫτʹ ΞΫηε • શจݕࡧͷํ๏ • ϚελʔϊʔυͷPySpark͔Βߦ͏ JNQPSUUJNF GSPNQZTQBSLTRMJNQPSU42-$POUFYU TRM$POUFYU42-$POUFYU TD MJOFTTDUFYU'JMF TBCFXPSLTTECFODINBSLMPH pMFTLL MPH MJOFTDBDIF GPSJJOSBOHF TUBSUUJNFUJNF <MJOFTpMUFS MBNCEBTOPDJO T DPVOU GPSJJOSBOHF >FMBQTFE@UJNFUJNFUJNF TUBSUQSJOUFMBQTFE@UJNF 1Z4QBSLͰ࣮ߦ͢Δίʔυ
Amazon Elastic Mapreduceͱͷൺֱ • ࣮ߦ݁Ռ • 10ͷߏͰ17ഒHayabusaͷํ͕ߴʹಈ࡞
ߟ !37
ݕࡧͷεέʔϧΞτ • 144ԯ͔ΒඞཁͳσʔλΛൈ͖ग़͢ͷʹ6.8ඵ·ͰߴԽ • 2લͷBigQueryͷϑϧεΩϟϯ͕120ԯϨίʔυͰ5ඵ • 10ͷϗετͰ36ഒͷߴԽΛ࣮ݱ • BigQueryԿඦɺԿઍͷϗετ͕ಉ࣌ʹಈ͍͍ͯΔ͔ෆ໌ •
Amazon Elastic MapReduceͱͷൺֱ • 10ͷߏͰ17ഒHayabusaͷํ͕ߴʹશจݕࡧՄೳ • γεςϜͷίετΛߟ͑ͨ߹ • ϦʔζφϒϧͰߴੑೳͳࢄݕࡧॲཧ͕࣮ݱͰ͖ͨ !38
ੵͷฒྻԽ • syslogͷෳͷ • େྔͷσʔλʢύέοτʣͷෳͰଳҬΛѹഭ͢Δ • ຊདྷͰ͋ΕHDFSͷΑ͏ʹࢄϑΝΠϧγεςϜΛ͏͖ • ϝλσʔλػߏΛܦ༝ͯ͠σʔλʹΞΫηε͢ΔͨΊຊ࣭తʹ͘ͳΔ •
ࢄϑΝΠϧγεςϜͱ͍ͯ͠ʢҰͭͷݚڀʣ • γϯϓϧ͞ͷٻͷ݁Ռ • อ࣋σʔλ͕ػثͷނোͰফࣦͨ͠ͱͯ͠ෳ͕ΔɾނোػΛ֎͚ͩ͢ • ࢄϑΝΠϧγεςϜͷΑ͏ʹ࠶ஔॲཧ͕ෆཁ !39
γϯϓϧͳઃܭʹΑΔӡ༻ͷ؆ུԽ • ࢄݕࡧ • Procedure / ConsumerϞσϧͰ࣮ݱ • ϓϩηε࣮ߦεέδϡʔϥGNU Parallelʹґଘ
• ෳࡶͳࢄγεςϜΛΘͳ͍ར • τϥϒϧѲͷߴԽ • γεςϜӡ༻ෛՙͷܰݮ !40
ߴԽͷ؊ • ׂΓΓઃܭ • ϦτϥΠॲཧ/Τϥʔॲཧະ࣮ • εέδϡʔϥ • ZeroMQͱGnu Parallelʹ͓ͤ
• ετϨʔδ • ࢄอଘͤͣ͞ʹෳΛอ࣋
ϋʔυΣΞʹґଘ͢Δ • CPU Core • ૣ͚Εૣ͍΄Ͳྑ͍ • CoreͷΑΓΫϩοΫ͕ͦͦ͜͜ૣ͍ํ͕͕ग़Δ͜ͱ͋Δ • σΟεΫ
• SSDNVMeʢͦΓΌૣ͍ʹܾ·͍ͬͯΔʣ • I/OੑೳΛҾ͖ग़͢
ଞͷγεςϜͱͷൺֱ • શจݕࡧͰApache Sparkͱൺֱͨ͠ • Elasticsearchͱͷൺֱʁ • Ͳ͏ͬͯൺΔʁ • ΤϯδϯͷʁʢElasticsearchͱͯૣ͍ʣ
• ݺͼग़͠APIͷՃຯ͢ΔʁʢREST APIݺͼग़͠ͱ͍ͯʣ • Write & Read • ॻ͖ͳ͕ΒಡΈࠐΜͩ߹ʁ
·ͱΊͱࠓޙͷ՝ !44
·ͱΊ • HayabusaͷࢄγεςϜԽͷઃܭͱ࣮ • 144ԯϨίʔυͷsyslogϑϧεΩϟϯˍશจݕࡧΛ6.8ඵͰ࣮ݱ • ϚϧνϕϯμػثΛରͱͨ͠ɺେྔͷෆἧ͍ͳϩάΛߴʹݕࡧՄೳ • τϥϒϧγϡʔτɾΠϯγσϯτϨεϙϯεΛஶ͘͠ॖ͢ΔՄೳੑ •
γϯϓϧͳࢄॲཧߏʹΑΔཧͷ༰қੑ !45
ࠓޙͷ՝ • ଞͷιϑτΣΞͱͷൺֱʢBigQuery, ElasticSearch, Splunkʣ • HayabusaͱଞͷΞϓϦέʔγϣϯͱͷ༥߹ʢΞϊϚϦݕͳͲʣ • Hayabusaͱ౷ܭॲཧϥΠϒϥϦػցֶशϥΠϒϥϦͱͷ݁߹ •
ࢄϑΝΠϧγεςϜɾࢄετϨʔδͷ࣮ !46
ँࣙ • ຊݚڀͷҰ෦ɺࠃཱݚڀ։ൃ๏ਓՊֶٕज़ৼڵػߏʢJSTʣͷݚڀՌ ൃలࣄۀʮઓུతݚڀਪਐࣄۀʢCRESTʣJPMJCR1783ʯͷࢧԉʹ ΑͬͯߦΘΕͨ
None