Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
大規模Webサービス入門 7回目 / Introduction to large scale ...
Search
muttan
August 25, 2017
0
120
大規模Webサービス入門 7回目 / Introduction to large scale web service 7
muttan
August 25, 2017
Tweet
Share
More Decks by muttan
See All by muttan
さわやか待ち時間LINE botを作った話 / Sawayaka LINE bot
bath_poo_
0
110
コンテナ開発入門 1回目/Introduction to Container Development 1
bath_poo_
0
160
ISUCONってなんだ / What is ISUCON
bath_poo_
0
350
Web技術の基本 8回目 / Introduction to Web technologies 8th class
bath_poo_
0
190
Web技術の基本 7回目 / Introduction to Web technologies 7th class
bath_poo_
0
160
Web技術の基本 6回目 / Introduction to Web technologies 6th class
bath_poo_
1
260
Web技術の基本 5回目 / Introduction to Web technologies 5th class
bath_poo_
0
140
Web技術の基本 4回目 / Introduction to Web technologies 4th class
bath_poo_
0
220
Web技術の基本 3回目 / Introduction to Web technologies 3rd class
bath_poo_
0
250
Featured
See All Featured
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
PRO
21
1.3k
Docker and Python
trallard
45
3.5k
The Pragmatic Product Professional
lauravandoore
35
6.7k
Designing for Performance
lara
610
69k
Understanding Cognitive Biases in Performance Measurement
bluesmoon
29
1.8k
Optimising Largest Contentful Paint
csswizardry
37
3.3k
Build The Right Thing And Hit Your Dates
maggiecrowley
37
2.8k
It's Worth the Effort
3n
185
28k
For a Future-Friendly Web
brad_frost
179
9.8k
How To Stay Up To Date on Web Technology
chriscoyier
790
250k
Agile that works and the tools we love
rasmusluckow
329
21k
Designing Dashboards & Data Visualisations in Web Apps
destraynor
231
53k
Transcript
େنαʔϏεٕज़ೖ ୈ7ճ ISUCONରࡦษڧձ 2017/8/25
ୈ7ճ ΞϧΰϦζϜͷ࣮༻Խ - ۙͳྫͰݟΔཧɾݚڀͷ࣮ફೖ-
ΞϧΰϦζϜɾσʔλߏͷબ • େنͳΣϒαʔϏεʹ͓͍ͯ, ۪ʹσʔ λΛૢ࡞ɾ୳ࡧ͢Δͱܭࢉྔരൃ͕ى͜Δ.
ΞϧΰϦζϜɾσʔλߏͷબ ʮϑΧγΪͷ͑ํʯ͓Ͷ͑͞Μͱ͍ͬ͠ΐʂΈΜͳͰ͑ͯΈΑ͏ʂ https://www.youtube.com/watch?v=Q4gTV4r0zRs
ΞϧΰϦζϜɾσʔλߏͷબ • దͳΞϧΰϦζϜΛ༻͍Δ͜ͱͰ, େ෯ʹܭ ࢉྔΛݮΒ͢͜ͱ͕ग़དྷΔ.
None
ΞϧΰϦζϜɾσʔλߏͷબ • ࠓճֶͿ͜ͱ ‣ ΞϧΰϦζϜͷॏཁ͞ - σʔλ͕େ͖͘ͳΕͳΔ΄ͲॏཁʹͳΔ ‣ ΞϧΰϦζϜબఆεςοϓ -
Φʔμʔه๏ - ࣮ࡍʹΘΕΔ·Ͱͷεςοϓ
Lesson 19 ΞϧΰϦζϜͱධՁ
σʔλͷنͱܭࢉྔͷҧ͍ • σʔλ͕େ͖͘ͳΕͳΔ΄Ͳ, ΞϧΰϦζϜ σʔλߏͷબ͕ͱͯ͠ݱΕΔ. αʔνํ๏ Φʔμʔ ઢܗ୳ࡧ ೋ୳ࡧ
σʔλͷنͱܭࢉྔͷҧ͍ • n=1000ͷͱ͖ͷܭࢉྔͷҧ͍ αʔνํ๏ ܭࢉྔ ઢܗ୳ࡧ ೋ୳ࡧ ͓Αͦ
σʔλͷنͱܭࢉྔͷҧ͍ • n=1,000,000ͷͱ͖ αʔνํ๏ ܭࢉྔ ઢܗ୳ࡧ ೋ୳ࡧ
͓Αͦ O(log2(n))σʔλͷ૿Ճʹڧ͍
σʔλͷنͱܭࢉྔͷҧ͍ • σʔλྔ͕গͳ͍࣌, ઢܗ୳ࡧͰྑ͍. • ͔͠͠, ݱ࣮ͷେنͳWebαʔϏεͰ, σʔ λྔ͕গͳ͍ͱ͍͏͜ͱߟ͑ʹ͍͘ •
ઌఔͷΑ͏ʹ, σʔλ͕100ສ݅, 1000ສ݅ͱ૿Ճ ͍ͯ͘͠ͱ, σʔλͷ୳ࡧ͕ϘτϧωοΫͱͳͬͯ ͘Δ.ɹ→ ͳΔ͘ݮΒ͍ͨ͠
ΞϧΰϦζϜͱʁ • ͋Δ·ͨͷू߹Λೖྗ͠, ͋Δ·ͨͷ ू߹Λग़ྗ͢Δ, ໌֬ʹఆٛ͞Εͨܭࢉखଓ͖ͷ͜ ͱ. • ࣮ࡍʹڱٛͷҙຯͰͷΞϧΰϦζϜ, ٛͷҙຯ
ͰͷΞϧΰϦζϜ͕ଘࡏ͢Δ. • ຊ࣭తͳ͜ͱͰͳ͍ͷͰলུ
ΞϧΰϦζϜΛֶͿҙٛ • ܭࢉࢿݯ༗ݶͰ͋Δ ‣ ༗ݶͷϦιʔεͰ͍͔ʹޮΑ͘ॲཧ͢Δ͔ • ΞϧΰϦζϜΤϯδχΞͷʮڞ௨ݴޠʯ ‣ ΞϧΰϦζϜΛ͍ͬͯΔલఏͰΛ͢Δ •
͕ࣝ͋Ε৽͍͠ʹରॲͰ͖Δ
ΞϧΰϦζϜͷධՁ • ΞϧΰϦζϜΛධՁ͢ΔͨΊʹ, Φʔμʔදه ͱ͍͏ͷ͕ΘΕΔ. • ೖྗͷαΠζΛnͱͨ͠ͱ͖ʹ, Ͳͷఔͷܭ ࢉྔ͕ඞཁʹͳΔ͔Λද͢ͷ. •
O(n)ͱ͍͏දهΛ͢Δ.
ΞϧΰϦζϜͷධՁ • ྫ͑nͷେ͖͞ʹ͔͔ΘΒͣఆ࣌ؒͰॲཧ ͕ऴΘΔ߹ʢϋογϡͳͲʣ ɹ→ O(1)ͱදه͢Δ • ೖྗαΠζnʹΑͬͯ୳ࡧ͕࣌ؒมԽ͢Δͷ ʢઢܗ୳ࡧͳͲʣ ɹ→
O(n), O(nlogn), O(logn), etc…
ΞϧΰϦζϜͷධՁ • ΦʔμʔදهͰҎԼͷΑ͏ͳܭࢉྔ͕සग़͢Δ. • O(nlogn)͙Β͍·Ͱ͍͍͕, ͦΕҎ্ʹͳΔͱ ٸܹʹܭࢉྔ͕૿Ճ͢Δ. ‣ ࣮༻ʹת͑ΔͷO(nlogn)͙Β͍·Ͱͱߟ͑Δ.
ΞϧΰϦζϜͷධՁ • ΞϧΰϦζϜʹΑͬͯ, O(n^2)ͩͱͯ͠ߴͰ ͋Δͱݴ͑Δ. • ݁ہͷͱ͜Ζ, ॲཧରͷαΠζ(nͷαΠζ)ʹΑͬ ͯߴ͔Ͳ͏͔ܾ·ͬͯ͘Δ. •
ۭؒ༻ྔʢϝϞϦʣΛ͡Δͱ͖ʹΦʔμʔ දهΘΕ͍ͯΔ.
ΞϧΰϦζϜͱσʔλߏ • σʔλߏ, ʮରͱ͢ΔσʔλΛอ࣋͢Δ·ͨ දݱ͢ΔͨΊͷߏʯ ‣ ྻߏ͕Θ͔Γ͍͢ྫ • ΞϧΰϦζϜʹదͨ͠σʔλߏΛ͏͜ͱͰ, ΑΓ
ޮతʹॲཧ͢Δ͜ͱ͕ग़དྷΔ. • σʔλϕʔεͰB+Λ͍ͬͯΔ.
ܭࢉྔͱఆ߲ • ΦʔμʔදهͰ, ఆ߲ແࢹ͞ΕΔ. ‣ ܭࢉίετ͕3nͷ߹, O(n) ‣ ܭࢉίετ͕2n^2ͷ߹, O(n^2)
• ؔݺͼग़͠, Λฦ͢, ifͰذఆ߲ѻ͍ ͞Ε͍ͯΔ.
ܭࢉྔͱఆ߲ • ΦʔμʔදهΞϧΰϦζϜͦͷͷΛධՁɾൺ ֱ͢Δࡍʹศར. • ࣮ࡍͷ, ଞͷܭࢉػతͳཁҼʹࠨӈ͞ΕΔ. ‣ ΩϟογϡʹࡌΓ͍͔͢Ͳ͏͔ ‣
ذ͕গͳ͍͔Ͳ͏͔
ܭࢉྔͱఆ߲ • ιʔτΞϧΰϦζϜͷԼݶO(nlogn)Ͱ͋Δ ͱҰൠతʹݴΘΕ͍ͯΔ. • ಉ͡ΞϧΰϦζϜͰ, ΫΠοΫιʔτ͕࠷ ૣ͍ͱݴΘΕ͍ͯΔ. ‣ ΩϟογϡʹࡌΓ͍ͨ͢Ί.
࠷దԽͷҙ • ࠷దԽΛߦͳ͏ࡍʹ, ఆഒΛվળ͢ΔΑΓ ΞϧΰϦζϜͦͷͷΛมߋ͠ܭࢉྔΛݮΒͤ ΔͷͳΒͦΕ͕ྑ͍. • O(n^2)ͷఆഒΛվળ͢ΔΑΓ, O(nlogn) ͷΞϧΰϦζϜ͕͋ΔͷͳΒͦͬͪΛ࠾༻͠
Α͏.
ΞϧΰϦζϜ׆༻ͷ࣮ࡍͷͱ͜Ζ • ཧ্ߴͳΞϧΰϦζϜΑΓ, ݹయతͳΞ ϧΰϦζϜͷ΄͏͕ૣ͍͜ͱ͋Δ. • Α͘ΒΕ͍ͯΔΞϧΰϦζϜΑΓ, φΠʔ ϒͳΞϧΰϦζϜͷ΄͏͕͍͍͜ͱ͋Δ.
αʔυύʔςΟʔͷ࣮Λར༻͢Δ • ఆ൪ͷΞϧΰϦζϜʹؔͯ͠, ୈࡾऀ͕ར༻͠ ͍͢Α͏ͳܗͰެ։͞Ε͍ͯΔ͜ͱ͕ଟ͍. • PerlͰ͍͏ͱ͜ΖͷCPAN • ͨͩ͠, ༷͕ࣗͨͪͷཧͱ͍͋ͬͯͳ͔ͬͨ
ΓΦʔόʔεϖοΫͳ߹ࣗͨͪͰ࣮͢Δ ඞཁ͋Δ͔.
Lesson 20 ͯͳμΠΞϦʔͷ ΩʔϫʔυϦϯΫ
ΩʔϫʔυϦϯΫͱʁ • ࣗಈతʹΩʔϫʔυͷϦϯΫΛੜ͢Δͭ ʢ5ճͷ༰Ͱग़͖ͯͨͷʣ Լઢ෦͕ϦϯΫ
αʔυύʔςΟʔͷ࣮Λར༻͢Δ • ೖྗ͞Εͨจষʹରͯ͠, 27ສ୯ޠೖ͍ͬͯΔࣙ ॻͱϚονϯά͢Δ. • ରՕॴΛHTMLͷΞϯΧʔλάʹஔ͖͑Δ࡞ ۀΛߦ͏. ͯͳμΠΞϦʔϒϩάͰ͢ <a
href=“…”>ͯͳμΠΞϦʔ</a><a href=“…”>ϒϩάͰ͢</a> ྫ
ॳͷ࣮ • ॳਖ਼نදݱΛͬͨnaive algorithmΛ࠾ ༻͍ͯͨ͠. ‣ ࣙॻதʹؚ·ΕΔશ୯ޠΛORͰܨ͛Δ ‣ (foo|bar|baz|hoge|fuga|…)
ਖ਼نදݱϚονϯάͰͷ • αʔϏε։࢝ޙ୯ޠ͕ͦΕ΄Ͳଟ͘ͳ ͍ͨΊ, DB͔ΒͦͷͰਖ਼نදݱΛੜ࣮͠ ݱ͍ͯͨ͠. • Ωʔϫʔυ͕૿Ճ͢Δͱਖ਼نදݱͷॲཧʹ࣌ ͕͔͔ؒͬͯ͘ΔͨΊ, ࣮༻ʹ͑ͳ͘ͳΔ.
ਖ਼نදݱϚονϯάͰͷ • 2ͭͷՕॴʹ͕͔͔࣌ؒΔ 1. ਖ਼نදݱΛίϯύΠϧ͢Δॲཧ • ࣄલʹ࡞ͬͯΩϟογϡ͓ͯ͘͠ 2. ਖ਼نදݱͰύλʔϯϚον͢Δॲཧ •
ղܾͰ͖ͣ…
ύλʔϯϚονͷ • ΩʔϫʔυϦϯΫͷੜʹ͕͔͔࣌ؒΔݪҼ , ਖ਼نදݱͷΞϧΰϦζϜ͕ݪҼ • ΦʔτϚτϯ͕༻͞Ε͍ͯΔ. • ଟ͘NFAʢඇܾఆੑ༗ݶΦʔτϚτϯʣ
ύλʔϯϚονͷ • (foo|bar|baz|…)ͱ͍͏ਖ਼نදݱ͕͋ͬͨͱ͖ foo | bar | baz | …
text : d(^_^o) ෆҰக ʁ
ύλʔϯϚονͷ • ୯७ʹઌ಄͔Βݟ͍ͯͬͯ, Ұக͢Δ͔͠ͳ͍ ͔Λݟ͍ͯ͘. ‣ ୯ޠ͕ଟ͘ͳΔ΄Ͳॲཧ͕͘ͳΔ ‣ ୯ޠ͕গͳ͚ΕͦΕͰಈ࡞͢Δ
ਖ਼نදݱˠTrie • ύλʔϯϚονͷܭࢉྔΛམͱͨ͢Ίʹਖ਼ن දݱ͔ΒTrie࣮ΛΓସ͑ͨ. • TireͬͯͳΜ͚ͩͬʁ
ʲ෮शʳTrieͱ • Ωʔू߹Λѻ͏ͨΊͷσʔλߏͷҰछ • ࠓճͷΑ͏ͳ୯ޠͷू߹ͱ͔ • ݕࡧαΠζ͕ͷେ͖͞Ͱͳ͘୯ޠͷ͞ ʹґଘ͢Δ • ऩ݅ʹґଘ͠ͳ͍
ʲ෮शʳTrieͱ t e a n o i n n w
e keys: tea, ten, to, i, in, inn, we
TrieߏͱύλʔϯϚον • TrieߏΛ͏ͱ, ਖ਼نදݱʹΑΔϚονΑΓܭࢉྔ Λݮ͢Δ͜ͱ͕Ͱ͖Δ. • ೖྗจॻΛTrieʹೖྗͯ͠ΤοδΛḷΓ, ऴ͕ݟͭ ͔Εଘࡏ͢Δ. •
ܭࢉྔͷαΠζ͕ೖྗͷ͞ʹґଘ͢Δ • ΦʔτϚτϯͬΆ͍
AC(Aho-Corasick)๏ • ࣙॻத͔ΒύλʔϯϚονϯάΛߴʹߦ͏ ख๏. ‣ ࣙॻαΠζʹґଘ͠ͳ͍ • ػձ͕͋ΕΓ͍͙ͨΒ͍ͷؾ࣋ͪ… • ຊ࣭ใͰͳͦ͞͏ͳͷͰεΩοϓ
ΩʔϫʔυϦϯΫ࣮ɺมભͱߟ • ڊେͳਖ਼نදݱˠAC๏ˠRegexp::Listͱม Խ͖ͯͨ͠. • ͲͷΑ͏ʹσʔλߏͱΞϧΰϦζϜΛબఆ ͍͖͔ͯ͘͠ʁ
ॳظஈ֊ • σʔλ͋·Γେ͖͘ͳ͘γϯϓϧͳ࣮ → γϯϓϧΏ͑গͳͯ͘ࡁΉ → ॊೈੑʹΜ࣮ͩ → ଞͷػೳͷ࣮࣌ؒΛׂ͘
ݱࡏ • σʔλ͕େ͖͘ͳΔ͜ͱͰ, େ෯ʹ࣮Λݟ ͢ඞཁ͕ग़͖ͯͨ. → γϯϓϧͳ࣮+ΩϟογϡͰճආෆՄೳ → ΞϧΰϦζϜΛධՁ͠, ܭࢉྔͷ؍͔Βߟ
͑Δඞཁ͕͋Δ
ߟ • ॳظஈ֊Ͱ, ࠷ద(Ͱ͋Δ͕ෳࡶ)ͳ࣮Λ༻ ͍Δ͜ͱ͕ਖ਼͍͠ͱݶΒͳ͍. ‣ γϯϓϧͳ࣮͕ڐ͞ΕΔ • େنʹͳͬͨͱ͖ʹඋ͑ͯ, ຊ࣭తͳղܾํ
๏Λ͓ͬͯ͘ඞཁ͋Δ.
Lesson 21 هࣄΧςΰϥΠζ
هࣄΧςΰϥΠζͱʁ • هࣄͷ༰ʹج͍ͮͯ, దͳΧςΰϦʹࣗಈ Ͱྨ͢Δ͜ͱ. • ϕΠδΞϯϑΟϧλΛͬͯΧςΰϦఆΛ ߦ͍ͬͯΔ.
ϕΠδΞϯϑΟϧλͱʁ • φΠʔϒϕΠζ(Naive Bayes)ͱ͍͏ΞϧΰϦζϜΛར༻ ͯ͠, ֬తʹͲͷδϟϯϧʹྨ͞ΕΔ͔ΛௐΔํ๏. • աڈͷྨใΛͬͯ, ݱࡏͷ(δϟϯϧ͕ະͷ)จॻ Λྨ͢Δ.
‣ ڭࢣ͋ΓֶशΛར༻͍ͯ͠Δ. ‣ ػցֶशɾύλʔϯೝࣝͷݚڀՌʹΑΔͷ
ϕΠδΞϯϑΟϧλͱʁ • ϕΠδΞϯϑΟϧλͷ֩ͱͳ͍ͬͯΔͷ, φΠʔ ϒϕΠζͱ͍͏ΞϧΰϦζϜ. ‣ ໊લ͔Β͔Δ௨Γ, ϕΠζͷఆཧΛϕʔεͱͨ͠ ΞϧΰϦζϜ. •
۩ମతͳ࣮ʹ౿Έࠐ·ͣ, ͲͷΑ͏ʹಈ͔͚ͩ͘ ղઆ͢Δ.
φΠʔϒϕΠζʹΑΔΧςΰϦਪఆ • ͋ΔจॻD͕༩͑ΒΕͨ࣌, ͦͷจॻ͕֬త ʹͲͷΧςΰϦCʹଐ͢Δͷ͕ͬͱΒ͍͠ ͔ΛٻΊΔ. • ͭ·Γ, จॻD͕༩͑ΒΕͨͱ͖ͷΧςΰϦC Ͱ͋Δ͖݅֬P(C|D)ΛٻΊΔ.
φΠʔϒϕΠζʹΑΔΧςΰϦਪఆ • P(C|D)Λܭࢉ͢Δͷ͍͠ ‣ ϕΠζͷఆཧΛ͏ͱܭࢉՄೳʹ • ϕΠζͷఆཧ P(D|C), P(C), P(D)ΛٻΊΔ.
φΠʔϒϕΠζʹΑΔΧςΰϦਪఆ • ࠓճඞཁͳͷ֬ͦͷͷΑΓ, ͲͷΧς ΰϦ͕ͬͱΒ͍͔͠ͱ͍͏͜ͱ • ઌ΄Ͳͷࣜʹ͓͍ͯ, จॻD͕ੜى͢Δ֬ P(D)ҰఆͰ͋Δ. ‣
P(D|C), P(C)ͷΈΘ͔Εྑ͍ʂ
φΠʔϒϕΠζʹΑΔΧςΰϦਪఆ • P(D|C), P(C)ͷΛֶशσʔλ͔Βࢉग़ͯ͠͠ ·͑ٻΊΔ͜ͱ͕Ͱ͖Δ. ‣ P(C)ͦͷΧςΰϦ͕ग़ݱ͢Δ֬ͳͷͰ, ྨ͞Εͨճ͚͓͚͍͍֮ͩ͑ͯ. ‣ P(D|C)ଟগͷ͕ཉ͍͠.
φΠʔϒϕΠζʹΑΔΧςΰϦਪఆ • P(D|C)Λܭࢉ͢Δ. • DΛҙͷ୯ޠW͕࿈ଓͯ͠ग़ݱ͢Δͱߟ͑ • ҎԼͷΑ͏ͳࣜͰۙࣅ͢Δ. • DΛ୯ޠʹׂ͠, ͦΕ͝ͱʹͦͷ୯ޠ͕Ͳ͜
ʹྨ͞Εͨͷ͔Λอଘ͓ͯ͘͠.
طଘͷख๏ΛҾ͖ग़͠ʹೖΕ͓ͯ͘ • େنσʔλʹର͢ΔΞϧΰϦζϜతͳΞϓϩʔ νΛֶͿʹͨͬͯ, طଘͷख๏Λ͋Δఔ ͍ͬͯΔ͜ͱ͕ॏཁ. ‣ σʔλϚΠχϯάɾύλʔϯೝࣝɾػցֶश ͷΑ͏ͳΞϧΰϦζϜ. ‣
ιʔτ୳ࡧɺѹॖ
طଘͷख๏ΛҾ͖ग़͠ʹೖΕ͓ͯ͘ • ྫɿTrieΩʔϫʔυϦϯΫʹԠ༻Ͱ͖Δ ɹɹϕΠδΞϯϑΟϧλΛࣗಈྨʹ → ͲͪΒΞϧΰϦζϜΛ͍ͬͯΔ͜ͱͰ, ׆༻ Ͱ͖Δ͜ͱ͕Θ͔Δ. • ΞϧΰϦζϜΛ࣮ޙՃ࡞ۀ͕͋Δ͜ͱͬ
͓͖ͯ͘.