Speeding Up Internal Wiki Search with Elasticsearch

nonylene
June 05, 2017

Material for a KMC regular-meeting lecture

Transcript

  1. Heineken Inside
    KMC nonylene

  2. Self-introduction
    • nonylene (のにれん)
    • 4th-year student, KMC
    • I'm Kirby
    • root

  3. Self-introduction
    • nonylene (のにれん)
    • Smartphone apps, servers
    • Twitter / GitHub etc
    • http://nonylene.hatenablog.jp/

  4. Recent activity
    • Getting started with OCaml, among other things
    • Handy

  5. Main topic

  6. Heineken

  7. What is Heineken?
    • A fast search system for the club's internal documents

  8. [Architecture diagram. Labels: pukiwiki /wiki/*.txt, Heineken-crawler, Heineken, data registration, search, Link, analysis]

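  9. How it started
    • nonylene: "Is there anything painful about KMC's tools?"
    • kebus: "Search!!!"
    • nonylene: "Good point..."

  10. PukiWiki search is extremely slow
    • The only option is to speed up PukiWiki search and come out on top

  11. Contents
    1. About PukiWiki
    2. How to speed it up
    3. Introducing Elasticsearch
    4. Searching with Elasticsearch
    5. Heineken ( React )
    6. Results and impressions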

  12. [Architecture diagram, same as slide 8]

  13. 1. PukiWiki overview

  14. What is PukiWiki?
    • A wiki written in PHP
    • Anyone can edit freely
    • Easy to set up
    • A rich set of plugins

  15. What is PukiWiki?
    • Pain points
    • Old (development was stalled for about 7 years)
    • Development has been resuming since around last year
    • Page data is stored as files (see later)
    • No smartphone UI yet

  16. PukiWiki at KMC
    • Apparently introduced around the end of 2003/11
    • Currently runs on apache2

  17. Pain points of PukiWiki at KMC
    • It uses euc-jp
    • Character encoding becomes a hassle whenever you process wiki data
    • PukiWiki itself can also run in UTF-8

  18. Pain points of PukiWiki at KMC
    • The version is old
    • 1.4.8_alpha2 ( 2006 )
    • Slightly modified, e.g. to post to Slack

  19. PukiWiki data
    • All PukiWiki data is stored as text files
    $ ls /…/pukiwiki/wiki/
    28A1A6A1FEA1A629.txt
    31B6A6A5D0A5EAA5B1A1BCA5C9A5BFA5EFA1BCA5C7A5A3A5D5A5A7A5F3A5B9A5B2A1BCA5E0.txt
    323034384149A5B3A5F3A5C6A5B9A5C8.txt
    3332A5ADA5C3A5C1A5F3C0B0C8F7B7D7B2E8.txt

    ※ The file name is the hex representation of the page title's euc-jp encoded bytes
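
The file naming rule on this slide can be sketched in a few lines of Python. This is only an illustration of the rule as stated (title, euc-jp bytes, uppercase hex); the helper names `title_from_filename` and `filename_from_title` are hypothetical and are not taken from PukiWiki or Heineken-crawler. `str.removesuffix` assumes Python 3.9 or later.

```python
def title_from_filename(filename: str) -> str:
    """Decode a PukiWiki data file name (hex of EUC-JP bytes) into the page title."""
    hex_part = filename.removesuffix(".txt")
    return bytes.fromhex(hex_part).decode("euc_jp")


def filename_from_title(title: str) -> str:
    """Inverse direction: page title -> uppercase hex of its EUC-JP bytes + '.txt'."""
    return title.encode("euc_jp").hex().upper() + ".txt"


# One of the file names listed on the slide.
title = title_from_filename("323034384149A5B3A5F3A5C6A5B9A5C8.txt")
print(title)
```

Because the title is recoverable from the file name alone, a crawler does not need to open a file just to learn which page it belongs to.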

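  20. PukiWiki data
    • All PukiWiki data is stored as text files
    $ nkf /…/pukiwiki/wiki/BFB7B4BFA5B3A5F3A5D132303137.txt
    [[新歓コンパ]]
    ~&size(25){どんどん参加していこうな};
    *目次 [#jfaa7b62]
    #contents
    ~新入生のかたはもちろん、上回生やOBの方々も一緒に楽しみましょう!

    ※ The file is euc-jp, so it is converted with nkf here
    ※ The sample is the 2017 welcome-party page: a link to the welcome-party page, a large-font line urging everyone to join, a "Contents" heading, and a line inviting new students, senior students, and alumni alike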

  21. Why is search on KMC's PukiWiki slow?

  22. Why PukiWiki search is slow
    • /…/pukiwiki/ lives on a NAS (file server)
    • Access goes over the network, so it is slow
    • Once the files are cached it gets faster, so the second search is quick

  23. Why PukiWiki search is slow
    • When PukiWiki runs a search...
    • PHP reads the contents of every wiki file, every time
    • See do_search in lib/func.php
    • It then runs a full-text scan over the strings it read

  24. Profiling PukiWiki
    • I tried profiling it
    • xdebug (a PHP profiler)
    • webgrind (displays the results)
    • Results can be exported as SVG

  25. • Without the file cache
    • 24.54 sec
    • php::fread alone takes 62%

  26. • With the file cache (second run)
    • 9.24 sec
    • php::fopen and flock become dominant

  27. Why PukiWiki search is slow
    • read is slow
    • flock is slow
    • open is slow

  28. Why PukiWiki search is slow
    • In short, reading the files is what is slow
    • PukiWiki is not to blame!!
    • The string search itself takes only about 2 sec

  29. [Architecture diagram, same as slide 8]

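  30. 2. Speeding it up

  31. Ways to speed up search
    • Put the files somewhere with fast file access
    • e.g. a different machine
    • Load them into memory
    • e.g. MySQL

  32. Ways to speed up search
    • Put the files somewhere with fast file access
    • In hindsight, this is the crude and easy option
    • Use something like MySQL
    • You end up building something custom
    • If so, you would rather have something optimized for search... →

  33. (image-only slide)

  34. Elasticsearch
    • A distributed, RESTful search and analytics engine
    • Developed by Elastic, a company based in the Netherlands
    • Open source
    • Java

  35. Elasticsearch
    • Extremely fast
    • > Elasticsearch is fast. Really, really fast.
    • Full-text search, log analysis, etc.
    • GitHub / Facebook etc…
    https://www.elastic.co/jp/products/elasticsearch

  36. Fast search with Elasticsearch
    1. Load the PukiWiki data into Elasticsearch
    2. It gets analyzed nicely
    3. Results come back extremely fast
    ✨Victory✨

  37. Fast search with Elasticsearch
    • Honestly, I just wanted to use Elasticsearch
    • MySQL would probably be fast enough too

    • Build an app that uses this search

  38. Fast search with Elasticsearch
    • Since we are at it, let's give it a name
    • Fast search (hayai kensaku)
    → Hayaken
    → Heineken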

  39. [Architecture diagram, same as slide 8]

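  40. 3. Introducing Elasticsearch

  41. 3-1. Analysis in Elasticsearch

  42. Analysis in Elasticsearch
    • You have to decide how text is analyzed

  43. Analysis in Elasticsearch
    • Structure of an Analyzer
    • Char Filter: normalization etc.
    • Tokenizer: splitting (this is the important part)
    • Token Filter: normalization etc.

  44. Char Filter
    • Absorbs differences between characters
    • s (full-width) → s (half-width)
    • ㌶ → ヘクタール (the single-character "hectare" symbol is expanded into katakana)
    • Removes unneeded characters before the text reaches the Tokenizer

  45. Char Filter
    • ICU Analysis Plugin
    • An official plugin
    • Normalizes text nicely
    • Full-width → half-width, symbol decomposition, uppercase → lowercase, etc.
    • It does not unify kanji variants (e.g. the old-form spelling of "Watanabe" → 渡辺)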
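
The kind of normalization described on slides 44 and 45 can be approximated with Python's standard library. This is a sketch under an assumption: it uses Unicode NFKC plus case folding from `unicodedata`, which resembles what an ICU normalizer does but is not the ICU plugin itself. The function name `normalize_text` is hypothetical.

```python
import unicodedata


def normalize_text(text: str) -> str:
    """Apply NFKC normalization, then case folding (a rough stand-in for ICU normalization)."""
    return unicodedata.normalize("NFKC", text).casefold()


# Full-width Latin letters become half-width and lowercase.
print(normalize_text("ＫＭＣ"))
# The single-character "hectare" symbol is decomposed into katakana.
print(normalize_text("㌶"))
```

Applying the same normalization to both indexed text and the query is what lets a half-width query match full-width text in a page.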

  46. Char Filter
    • HTML Strip Char Filter
    • A built-in filter
    • Removes HTML tags
    • e.g. <br/>

  47. Tokenizer
    • Splits text into suitable pieces
    • Records ( Index ) the position of each piece
    → The location of a search term is known immediately
    → Faster than scanning the full text!

  48. Tokenizer
    • Example: splitting on spaces
    • "こんにちは KMC Hello!"
    → Index: [('こんにちは', 0), ('KMC', 6), ('Hello!', 10)]

  49. Tokenizer
    • Index: [('こんにちは', 0), ('KMC', 6), ('Hello!', 10)]

    • Search for 'KMC'
    → "It is at position 6" is known immediately

  50. Tokenizer
    • Index: [('こんにちは', 0), ('KMC', 6), ('Hello!', 10)]

    • Search for 'KM'
    → "No such term"
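
The space-splitting example on slides 48 to 50 can be reproduced with a small Python sketch. It is a toy model of the idea (terms plus character offsets), not how Elasticsearch stores its index internally; `build_index` and `lookup` are hypothetical names.

```python
import re


def build_index(text: str) -> list[tuple[str, int]]:
    """Split on whitespace and record each token with its character offset."""
    return [(m.group(), m.start()) for m in re.finditer(r"\S+", text)]


def lookup(index: list[tuple[str, int]], term: str) -> list[int]:
    """Return the offsets of tokens that equal the search term exactly."""
    return [pos for token, pos in index if token == term]


index = build_index("こんにちは KMC Hello!")
print(index)
print(lookup(index, "KMC"))  # exact token: found
print(lookup(index, "KM"))   # only part of a token: not found
```

The failed lookup for 'KM' shows the limitation the following slides address: a query only matches if it lines up with how the text was split.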

  51. Tokenizer
    • Japanese is hard
    • It cannot be split on spaces, and the grammar is complex
    • Word boundaries are hard to pin down

    • Example: "こんにちはこんにちは" → ??????????

  52. Tokenizer
    • Method ① Split every n characters ( N-Gram )

    • "こんにちは、今日もいい天気ですね"
    → Index: [('こん', 0), ('んに', 1), ('にち', 2), ('ちは', 3), … , ('すね', 14)]

  53. Tokenizer
    • Index: [('こん', 0), ('んに', 1), ('にち', 2), ('ちは', 3), … , ('すね', 14)]

    • Search for 'こんにち'
    → 'こん' hits at position 0 → the following pieces line up as well
    → "It is at position 0" is known immediately
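
The N-Gram lookup walked through on slides 52 and 53 can be sketched as follows. This is a simplified in-memory model, assuming a plain dictionary from each n-gram to its positions; real search engines use more elaborate structures. `build_ngram_index` and `search` are hypothetical names.

```python
from collections import defaultdict


def build_ngram_index(text: str, n: int = 2) -> dict[str, list[int]]:
    """Record every n-character window of the text together with its start position."""
    index: dict[str, list[int]] = defaultdict(list)
    for pos in range(len(text) - n + 1):
        index[text[pos:pos + n]].append(pos)
    return dict(index)


def search(index: dict[str, list[int]], query: str, n: int = 2) -> list[int]:
    """Return start positions where all n-grams of the query appear consecutively."""
    if len(query) < n:
        return []  # queries shorter than n cannot be looked up
    grams = [query[i:i + n] for i in range(len(query) - n + 1)]
    return [
        start
        for start in index.get(grams[0], [])
        if all(start + i in index.get(gram, []) for i, gram in enumerate(grams))
    ]


index = build_ngram_index("こんにちは、今日もいい天気ですね")
print(search(index, "こんにち"))
print(search(index, "い天"))
```

The second query illustrates the noise problem: 'い天' is not a meaningful word, yet it still matches inside 'いい天気'.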

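  54. Tokenizer
    • Method ① Record every n characters ( N-Gram )
    • Advantages
    • Always hits (nothing is missed)
    • Only when the query is at least n characters long
    • Simple

  55. Tokenizer
    • Method ① Record every n characters ( N-Gram )
    • Disadvantages
    • The Index tends to bloat
    • Tends to match unwanted text
    • Example: 'い天' → 'いい天気'

  56. Tokenizer
    • Method ② Analyze the text and split it (morphological analysis)
    Tokenized by kuromoji: http://www.atilika.org/

  57. Tokenizer
    • Method ② Analyze the text and split it (morphological analysis)
    • Still an active research topic
    • kuromoji is well known http://www.atilika.org/
    • Also mecab and others

  58. Tokenizer
    • Method ② Analyze the text and split it (morphological analysis)
    • Advantages
    • The Index is less likely to bloat
    • Less noise

  59. Tokenizer
    • Method ② Analyze the text and split it (morphological analysis)
    • Disadvantages
    • Misses many matches
    • Hard to support multiple languages

  60. Tokenizer
    • Method ② Analyze the text and split it (morphological analysis)
    • Sloppy or informal wording is hard to analyze
    • Example
    • 'めうめうめうめう'
    Tokenized by kuromoji: http://www.atilika.org/

  61. Tokenizer
    • In Heineken...
    • 2-Gram was adopted
    • We do not want to miss any results
    • The wiki has lots of symbols and odd words, like "ぽよ~~"
    • The amount of data is not that large

  62. Token Filter
    • A filter applied to terms after tokenization
    • Synonyms, unifying -ed / -s forms, etc.
    • Upper and lower case are sometimes unified here as well

    • Not used in Heineken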

  63. 3-2. Elasticsearch configuration

  64. Elasticsearch data structure
    Cluster: KMC
      Index: hoge
        Type: page
          Field: title
          Field: body
          Field: modified
        Type: relation
          Field: from
          Field: to
      Index: piyo
        Type: item
          Field: price
          Field: name
          Field: desc
          Field: available
        Typ…
          Fiel…
          Fiel…

  65. Elasticsearch data structure
    • Define the Index and the Type
    • Configure Fields on the Type
    • Analyzer / Datatype etc.
    • Configure the Type on the Index (mapping)
    Index: pukiwiki
      Type: page
        Field: title
        Field: body
        Field: modified

  66. • Analyzer definition
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "jp_analyzer": {
              "tokenizer": "jp_tokenizer",
              "char_filter": [ "html_strip", "icu_normalizer" ], …
            }
          },
          "tokenizer": {
            "jp_tokenizer": {
              "type": "ngram", … ,
              "token_chars": [ "letter", "digit", "symbol", "punctuation" ]
            }
          }
        }
      },
      …
    }

  67. • Mapping definition
    {
      … ,
      "mappings": {
        "page": {
          "properties": {
            "title": { … },
            "title_url_encoded": { … },
            "body": {
              "type": "text",
              "analyzer": "jp_analyzer",
              "term_vector": "with_positions_offsets"
            },
            "modified": {
              "type": "date",
              "format": "strict_date_optional_time||epoch_millis"
            }
          }
        }
      }
    }
    Index: pukiwiki
      Type: page
        Field: title
        Field: body
        Field: modified
        Field: title_url_encoded

  68. Configuring Elasticsearch
    • Elasticsearch is RESTful
    • Basically everything is exchanged as JSON
    • Starting Elasticsearch brings up a web server
    • You configure it by sending the JSON Index definition to that server, e.g. from Python
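
Sending the index definition from Python, as this slide describes, might look like the sketch below. Several things here are assumptions rather than details from the slides: the URL `http://localhost:9200`, the `min_gram` / `max_gram` values of 2 (inferred from the 2-Gram choice on slide 61, since the slide elides those settings), and the helper name `build_put_request`. The `title` and `title_url_encoded` mappings are omitted because the slides elide them. The `"page"` mapping type matches Elasticsearch versions of that era; later versions removed mapping types.

```python
import json
import urllib.request

# Index definition assembled from slides 66 and 67 (elided parts left out).
INDEX_DEFINITION = {
    "settings": {
        "analysis": {
            "analyzer": {
                "jp_analyzer": {
                    "tokenizer": "jp_tokenizer",
                    "char_filter": ["html_strip", "icu_normalizer"],
                }
            },
            "tokenizer": {
                "jp_tokenizer": {
                    "type": "ngram",
                    "min_gram": 2,  # assumption: 2-Gram, per slide 61
                    "max_gram": 2,  # assumption: 2-Gram, per slide 61
                    "token_chars": ["letter", "digit", "symbol", "punctuation"],
                }
            },
        }
    },
    "mappings": {
        "page": {
            "properties": {
                "body": {
                    "type": "text",
                    "analyzer": "jp_analyzer",
                    "term_vector": "with_positions_offsets",
                },
                "modified": {
                    "type": "date",
                    "format": "strict_date_optional_time||epoch_millis",
                },
            }
        }
    },
}


def build_put_request(base_url: str, index_name: str) -> urllib.request.Request:
    """Build (but do not send) the HTTP PUT request that would create the index."""
    return urllib.request.Request(
        url=f"{base_url}/{index_name}",
        data=json.dumps(INDEX_DEFINITION).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_put_request("http://localhost:9200", "pukiwiki")
# To actually create the index against a running server:
#     with urllib.request.urlopen(request) as response:
#         print(response.read().decode("utf-8"))
```

Building the request separately from sending it keeps the definition easy to inspect before it touches a live cluster.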

  69. 3-3. The Elasticsearch server

  70. Elasticsearch Cluster (for reference)
    [Cluster diagram. Cluster: KMC; Node: foo, Node: bar; shards and replicas spread across the nodes, e.g. Shard: hoge1, Shard: hoge3, Replica: hoge1, Replica: hoge3, Replica: piyo1, Replica: piyo2]

  71. Deploying Elasticsearch
    • Elasticsearch puts a heavy load on the machine
    • It uses a lot of memory
    • It also does a lot of file access
    • So we decided to set up a new physical machine
    • Only 1 node this time

  72. [Architecture diagram, same as slide 8]

  73. 3-4. Heineken-crawler

  74. Heineken-crawler
    • A PukiWiki crawler written in Python3
    • Converts the character encoding to UTF-8
    • Extracts and converts the titles
    • Sends the PukiWiki data to Elasticsearch

  75. Heineken-crawler
    • Runs once every 10 minutes from a certain cron
    • Syncs only pages that were updated or deleted
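
A crawler pass of the kind described on slides 74 and 75 could be organized as in the sketch below. This is not the actual Heineken-crawler code: the function names (`scan_wiki`, `plan_sync`, `load_document`) are hypothetical, and detecting changes by file modification time is an assumption, since the slides do not say how updates are detected.

```python
from datetime import datetime, timezone
from pathlib import Path


def scan_wiki(wiki_dir: Path) -> dict[str, float]:
    """Map each page's hex file-name stem to its modification time."""
    return {path.stem: path.stat().st_mtime for path in wiki_dir.glob("*.txt")}


def plan_sync(
    current: dict[str, float], indexed: dict[str, float]
) -> tuple[list[str], list[str]]:
    """Compare the directory state with the last indexed state.

    Returns (pages to add or update, pages to delete).
    """
    to_upsert = sorted(
        stem for stem, mtime in current.items()
        if mtime > indexed.get(stem, float("-inf"))
    )
    to_delete = sorted(stem for stem in indexed if stem not in current)
    return to_upsert, to_delete


def load_document(wiki_dir: Path, stem: str) -> dict[str, str]:
    """Read one euc-jp page file and build a UTF-8 document for indexing."""
    path = wiki_dir / f"{stem}.txt"
    modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
    return {
        "title": bytes.fromhex(stem).decode("euc_jp"),
        "body": path.read_bytes().decode("euc_jp", errors="replace"),
        "modified": modified.isoformat(),
    }
```

Each run would call `scan_wiki`, compare the result with the state saved by the previous run via `plan_sync`, and send only the changed documents to Elasticsearch, which keeps a 10-minute schedule cheap.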

  76. [Architecture diagram, same as slide 8]

  77. 4. Searching with Elasticsearch

  78. Searching with Elasticsearch
    • Elasticsearch supports many kinds of search
    • Basically, it computes a 'score' for each candidate
    • Results are sorted by score
    • Tuning the score is what matters

  79. Scoring in Heineken
    • ① Put weight on the title
    • Make the title score count 5 times as much
    • Apparently it is handled sensibly internally
    • ["title^5", "body"]

  80. Scoring in Heineken
    • ② Put weight on the last-modified time
    • The score is multiplied by an exp decay that shrinks the further the date is from now
    https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html

  81. Scoring in Heineken
    • ② Put weight on the last-modified time
    • Tuned by intuition
    • origin -> now
    • offset -> 150 days
    • scale -> 500 days
    • decay -> 0.75
    https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html
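
To see what the parameters on slide 81 mean numerically, the exponential decay can be written out in Python. The sketch follows the formula given in the Elasticsearch function score documentation linked on the slide (multiplier = exp(λ · max(0, distance − offset)) with λ = ln(decay) / scale); the function name `exp_decay` and the use of plain day counts instead of dates are simplifications for illustration.

```python
import math


def exp_decay(
    age_days: float,
    offset: float = 150.0,
    scale: float = 500.0,
    decay: float = 0.75,
) -> float:
    """Score multiplier for a page last modified `age_days` ago (origin = now)."""
    lam = math.log(decay) / scale
    return math.exp(lam * max(0.0, age_days - offset))


for age in (0, 150, 650, 1150):
    print(age, round(exp_decay(age), 4))
```

With these settings a page keeps its full score for the first 150 days, drops to 0.75 of it once it is a further 500 days old, and keeps shrinking by the same factor every 500 days after that.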

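  82. Scoring in Heineken
    • ③ Put weight on short titles
    • Wiki titles are hierarchical
    • Example: 'NF2017/準備' (NF2017/Preparation)
    • Shorter titles tend to be the more important pages

  83. Scoring in Heineken
    • ③ Put weight on short titles
    • Look for the best setting through fine adjustments
    • sqrt, log, and so on
    • What was adopted in the end...

  84. Scoring in Heineken
    • ③ Put weight on short titles
    • "_score / Math.sqrt(Math.log1p(doc['title.keyword'].value.length()))"
    $\text{score} \times \dfrac{1}{\sqrt{\log(\text{title.length} + 1)}}$

  85. Scoring in Heineken
    • ③ Put weight on short titles
    • The + 1 guards against 1-character titles (log 1 = 0 would mean dividing by zero)
    • log was chosen for no particular reason
    • sqrt was chosen for no particular reason
    $\text{score} \times \dfrac{1}{\sqrt{\log(\text{title.length} + 1)}}$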
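
The title-length adjustment from slides 84 and 85 can be tried outside Elasticsearch with a direct Python translation of the script. `title_length_score` is a hypothetical name. One small difference is assumed away here: Python's `len` counts code points, while the original script's `length()` counts UTF-16 code units, which only differ for characters outside the Basic Multilingual Plane.

```python
import math


def title_length_score(score: float, title: str) -> float:
    """Divide the base score by sqrt(log(len(title) + 1)), so short titles rank higher."""
    return score / math.sqrt(math.log1p(len(title)))


for title in ("N", "NF", "NF2017", "NF2017/準備"):
    print(title, round(title_length_score(1.0, title), 3))
```

Because both `log` and `sqrt` grow slowly, the penalty for long titles is gentle: a nested page is ranked below its parent but is not pushed out of the results.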

  86. Scoring in Heineken
    • ③ Put weight on short titles

  87. Scoring in Heineken
    • The highest-scoring page turned out to be NF

  88. Other features
    • Can extract a fitting excerpt around the search terms
    • Can also insert HTML tags for highlighting
    "fields": {
      "body": {
        "pre_tags": [""],
        "post_tags": [""],
        "fragment_size": 220, …,
      }
    }
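
Putting the pieces from this section together, a search request body could be built as in the sketch below. It combines the title boost (slide 79), the date decay (slide 81), and highlighting (slide 88) using standard Elasticsearch query DSL keys. It is an illustration, not Heineken's actual query: the function name `build_search_body` is hypothetical, the title-length script from slide 84 is left out, and the highlight tags are omitted because their values did not survive in the slide text.

```python
import json


def build_search_body(query: str) -> dict:
    """Build a search request body with title boost, date decay, and highlighting."""
    return {
        "query": {
            "function_score": {
                "query": {
                    "multi_match": {
                        "query": query,
                        "fields": ["title^5", "body"],  # title counts 5x
                    }
                },
                "functions": [
                    {
                        "exp": {
                            "modified": {
                                "origin": "now",
                                "offset": "150d",
                                "scale": "500d",
                                "decay": 0.75,
                            }
                        }
                    }
                ],
            }
        },
        "highlight": {"fields": {"body": {"fragment_size": 220}}},
    }


body = build_search_body("NF")
print(json.dumps(body, ensure_ascii=False, indent=2))
```

The resulting JSON would be sent to the index's `_search` endpoint; the response then carries both the scored hits and the highlighted excerpts for each hit.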

  89. [Architecture diagram, same as slide 8]

  90. 5. Heineken ( React )

  91. Implementing the Heineken app
    • Elasticsearch is RESTful
    • Everything comes back as JSON
    → No data source other than Elasticsearch is needed
    → Everything can be done in JavaScript

  92. Implementing the Heineken app
    • No server-side application such as Rails is used
    • Nginx only serves static files
    • The HTML it serves only loads the JS and CSS

  93. React

  94. React overview
    • A JS library for building UIs
    • Made by Facebook
    • Widely used
    • Each part of the UI is built as a Component

  95. React overview - Virtual DOM
    • The DOM is held virtually as a Virtual DOM
    • When data changes, the Virtual DOM is updated
    • The difference from the real DOM is then applied
    • This keeps DOM changes to a minimum, which is efficient
    Details: http://qiita.com/mizchi/items/4d25bc26def1719d52e6

  96. React overview - JSX
    • JSX lets you write HTML inside JS
    • It reads like a template engine, which is handy
    const element = (

    Hello, {username}!

    );

  97. React development environment
    • create-react-app
    • An easy React project setup tool made by Facebook
    • Development environment
    • Build environment
    • Test environment
    https://github.com/facebookincubator/create-react-app

  98. React development environment
    • ES6 ( ECMAScript 6 )
    • The new JavaScript standard
    • class / Arrow function / const / Promise etc..
    Details: https://www.slideshare.net/1000ch/begin-ecmascript6

  99. React development environment
    • Babel
    • A library that converts ES6 to ES5
    • ES5 works in most browsers, so it is a safe target

  100. Other parts
    • React Router v4
    • Routing in JavaScript
    • Renders different Components depending on the URL

  101. Other parts
    • Bootstrap
    • A CSS / JS framework made by Twitter
    • Member roster, "god-tier art" uploader, etc.

  102. Developing Heineken
    1. Generate a skeleton with create-react-app
    2. Build the Components
    • Call the Elasticsearch API and render the results
    3. Compile with Babel and Webpack
    4. Put the output on the server

  103. Developing Heineken
    • Tedious parts
    • The pager
    • Careful branching ↓

  104. Developing Heineken
    • Tedious parts
    • Title completion
    • Calls the API every time

  105. 6. Results and impressions

  106. 1 / 500 of the time (by our own measurement)
    • 25000 ms -> 50 ms
    [Bar chart: search time in ms. PukiWiki: 25,000; Heineken: 50]

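  107. Other conveniences
    • Search has become more flexible
    • e.g. ordering
    • e.g. filtering by last-modified time
    • e.g. searching titles only

  108. Plans
    • Support for mail
    • SPAM handling needs to be thought through
    • rubwiki and others

  109. Impressions
    • Elasticsearch is impressive
    • Things just turn out well with it
    • React is handy, ES6 is fun
    • Not having to run anything on the server is easy
    • I am glad I got to do what I wanted to do

  110. Thank you for listening!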