The History of Search Improvements at minne

shiro16
October 17, 2018

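An overview of the history of search improvements at minne.

- Architecture
- Before and after the migration from Solr to Elasticsearch
- Architecture changes after the Elasticsearch migration
- Search
- Measurement
- A/B testing
- The suggester
- Dictionary management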

Transcript

 1. Toshihiro Goto (後藤利博), GMO Pepabo, Inc.
  Elasticsearch Study Meetup [...], "Search" edition
  The History of Search Improvements at minne

 2. Chief Technical Lead
  Toshihiro G[email protected]
  minne Division @ GMO Pepabo

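 3. First, an introduction to minne

 4. minne

 5. Introducing minne
  - Japan's largest handmade marketplace
  - Categories range from accessories to furniture and food
  - Service launched in [...]
  - App downloads passed [...] this month
  - Runs creator-support spaces called "minne no Atelier" in Setagaya, Kobe, and Fukuoka
  - Also hosts many in-person events

 6. Scenes from in-person events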

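 7. Today's agenda
  - Architecture
    - Before and after the migration from Solr to Elasticsearch
    - Architecture changes after the Elasticsearch migration
  - Search
    - Measurement
    - A/B testing
    - The suggester
    - Dictionary management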

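 8. First, the architecture

 9. Note: simplified diagrams of the processing that runs when item data is updated
  Until [...]
  From [...] onward

 10. Until [...]:
  - The search engine was Solr
  - Data was reflected by a periodic batch
  - Only item search existed
  From [...] onward:
  - The search engine is Elasticsearch
  - Data is reflected asynchronously at update time
  - Only item search existed at migration time, but more search types keep being added

 11. The difference in how data is reflected does not come from the search engine.
  We simply chose a better architecture when the migration gave us the chance.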
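The "asynchronous at update time" flow can be sketched as follows. This is a minimal illustration, not minne's implementation: the function names, the job shape, and the use of an in-process queue and a plain dict in place of a real job system and Elasticsearch are all assumptions made for the example.

```python
import queue


def on_item_updated(item, jobs):
    """Update path: only enqueue a job, so the user's request returns quickly."""
    jobs.put({"id": item["id"], "doc": {"title": item["title"]}})


def run_worker(jobs, search_index):
    """Worker: apply queued updates to the search index.

    `search_index` is a plain dict standing in for Elasticsearch (assumption).
    """
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            break
        search_index[job["id"]] = job["doc"]


jobs = queue.Queue()
search_index = {}
on_item_updated({"id": 1, "title": "ring"}, jobs)
pending_before = dict(search_index)   # still empty: nothing indexed yet
run_worker(jobs, search_index)
```

Compared with a periodic batch, the index lags behind the database only by the time a job spends in the queue.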

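 12. The history of minne x Elasticsearch

 13. minne x Elasticsearch: first phase
  - When we migrated from Solr in [...], we used Found (now Elastic Cloud)
    - We had no in-house operational know-how
    - We wanted to prioritize migration speed for the time being
  - Problems appeared as we kept using it
    - Scaling up was painful
    - Custom dictionaries could not be used (we knew this from the start)
    - Plugins could not be added (we knew this from the start)

 14. The problems we had with Found back then should be resolved in today's Elastic Cloud!?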

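 15. minne x Elasticsearch: second phase
  - Built our own cluster on AWS EC2
  - Why didn't we choose Amazon Elasticsearch Service?
    - At the time its Elasticsearch version was old ([...] when the latest was [...])
    - It would not have solved the problem of not being able to use dictionaries
  - Our current Elasticsearch version is [...].x

 16. Summary so far

 17. Summary so far
  - minne migrated from Solr to Elasticsearch
    - Initially on Elastic Cloud
    - Later on a cluster built on AWS EC2
  - As for why we chose Elasticsearch ...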

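 18. From here on: search

 19. What makes search results good or bad?

 20. Good and bad search results
  - Quality can be calculated from recall and precision
  - Whether the results are what users really want also depends on the order in which they are shown
  - If you keep tweaking results because a change "somehow seems good":
    - The results may in fact be growing into something that looks like garbage to users
    - You get problems such as the item the user wanted sitting on the last page and never being found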
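As a rough illustration of the recall and precision calculation mentioned above, here is a minimal sketch. The item IDs and the relevance judgments are made up for the example and do not come from the talk.

```python
def precision_recall(returned_ids, relevant_ids):
    """Return (precision, recall) for a single query."""
    returned = set(returned_ids)
    relevant = set(relevant_ids)
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: what the engine returned vs. what was actually relevant.
returned = ["a", "b", "c", "d"]
relevant = ["b", "d", "e"]

precision, recall = precision_recall(returned, relevant)
# Reversing the result order changes nothing, because neither metric looks at
# ranking. That is the limitation the slide points out.
reversed_scores = precision_recall(list(reversed(returned)), relevant)
```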

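 21. How do we find out whether users found what they were looking for?

 22. The only way is to measure

 23. At minne we measure with behavior logs

 24. Diagram: how the logs are accumulated

 25. "Heading goes here" (template placeholder text left on the slide)
  - User behavior logs are accumulated in TD and similar stores
  - The logs also include information such as which conditions a search was run with
  - The accumulated logs are aggregated, and metrics such as CTR are visualized with Redash and similar tools
  - We track how CTR changed after a change to the search results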
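The aggregation step might look roughly like the sketch below. The event names, the record fields, and the definition of CTR (share of searches that received at least one click) are assumptions for illustration; the talk does not specify minne's log schema or metric definition.

```python
from collections import defaultdict


def ctr_by_query(events):
    """CTR per query: searches with at least one click / all searches."""
    searches = defaultdict(set)  # query -> ids of searches that were shown
    clicked = defaultdict(set)   # query -> ids of searches that got a click
    for e in events:
        if e["event"] == "search":
            searches[e["query"]].add(e["search_id"])
        elif e["event"] == "click":
            clicked[e["query"]].add(e["search_id"])
    return {q: len(clicked[q] & ids) / len(ids) for q, ids in searches.items()}


# Hypothetical behavior-log records.
events = [
    {"event": "search", "query": "earrings", "search_id": "s1"},
    {"event": "click", "query": "earrings", "search_id": "s1"},
    {"event": "search", "query": "earrings", "search_id": "s2"},
    {"event": "search", "query": "bag", "search_id": "s3"},
    {"event": "click", "query": "bag", "search_id": "s3"},
    {"event": "click", "query": "bag", "search_id": "s3"},
]

ctr = ctr_by_query(events)
```

Counting a search at most once, however many clicks it received, keeps the value between 0 and 1.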

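 26. Measurement is ready, so on to A/B testing

 27. A/B testing search results
  - Split users into groups A and B
    - For example, assign them at random and store the assignment in a cookie
  - Attach the user's group (A or B) to their behavior logs
  - Visualize the metrics for A and B and compare which is better
  - Adopt the variant with the higher metric and move on to the next A/B test
  - Be careful: campaigns such as "spend X yen or more and get Y" can skew the metrics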
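The steps above can be sketched as follows. The cookie name, the event fields, and the per-group CTR definition (clicks divided by searches) are hypothetical; a dict stands in for the browser cookie jar.

```python
import random

COOKIE_NAME = "ab_search"  # hypothetical cookie name


def assign_group(cookies, rng=random):
    """Return 'A' or 'B', reusing the cookie value so a user stays in one group."""
    group = cookies.get(COOKIE_NAME)
    if group not in ("A", "B"):
        group = rng.choice(["A", "B"])
        cookies[COOKIE_NAME] = group
    return group


def tag_event(event, cookies):
    """Attach the user's A/B group to a behavior-log event."""
    return {**event, "ab_group": assign_group(cookies)}


def ctr_by_group(events):
    """Clicks per search for each group."""
    shown = {"A": 0, "B": 0}
    clicks = {"A": 0, "B": 0}
    for e in events:
        if e["event"] == "search":
            shown[e["ab_group"]] += 1
        elif e["event"] == "click":
            clicks[e["ab_group"]] += 1
    return {g: clicks[g] / shown[g] if shown[g] else 0.0 for g in shown}


cookies = {}
first = assign_group(cookies, random.Random(0))
second = assign_group(cookies)  # sticky: read back from the cookie

tagged = [
    tag_event({"event": "search"}, {COOKIE_NAME: "A"}),
    tag_event({"event": "click"}, {COOKIE_NAME: "A"}),
    tag_event({"event": "search"}, {COOKIE_NAME: "B"}),
    tag_event({"event": "search"}, {COOKIE_NAME: "B"}),
]
result = ctr_by_group(tagged)
```

Because the group travels with every log record, the same aggregation used for overall CTR can be reused per group.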

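 28. Accumulating search behavior logs has other benefits too

 29. Benefits of accumulating logs
  - You can look at last year's logs to learn the trends
    - For a service like minne, where delivery takes time, search demand can be slightly offset from the actual trend
  - The logs can be used for autocomplete, described later
  - etc.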

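 30. What kinds of A/B tests does minne run (or has it run)?

 31. Changing scores with the Function Score Query
  - Items that have been purchased many times
  - Items with many favorites
  - Raise or lower the score when specific (for example seasonal) words are included
  - Change the score according to user attributes
  - Change the conditions that boost the score per category
  - etc.

 32. GET /_search  # reflect the value of likes in the score
  {
    "query": {
      "function_score": {
        "field_value_factor": {
          "field": "likes",
          "factor": 1.2,
          "modifier": "sqrt"
        }
      }
    }
  }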
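To make the effect of this query concrete: with these parameters `field_value_factor` evaluates `sqrt(1.2 * likes)`, and with the default `boost_mode` of `multiply` that value is multiplied into the query score. The sketch below reproduces that arithmetic in Python; the function names and the sample like counts are made up for the example.

```python
import math


def field_value_factor(likes, factor=1.2):
    """Mirror of modifier "sqrt" with the given factor: sqrt(factor * likes)."""
    return math.sqrt(factor * likes)


def final_score(query_score, likes):
    """Default boost_mode "multiply": query score times the function value."""
    return query_score * field_value_factor(likes)


# Hypothetical items with the same text relevance but different like counts.
score_30_likes = final_score(1.0, 30)    # sqrt(36) -> about 6.0
score_120_likes = final_score(1.0, 120)  # sqrt(144) -> about 12.0
score_0_likes = final_score(1.0, 0)      # 0.0: an item with no likes scores zero
```

The square root dampens the effect: four times as many likes only doubles the score. The zero case is worth noting, since items without likes would sink to the bottom under this exact formula.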

 33. A/B tests often don't go well

 34. That is fine as long as the knowledge of what didn't work accumulates

 35. The Completion Suggester

 36. (image only)

 37. What is the Completion Suggester?
  - The thing that shows candidates once you have typed part of a query in Google (known as autocomplete)
  - Up to ES [...], a dedicated endpoint (_suggest) existed
  - From ES [...] onward, you send the suggester query to the regular search endpoint (_search)

 38. GET /_search
  {
    "suggest": {
      "minne-suggest": {
        "prefix": "minne",
        "completion": {
          "field": "suggest",
          "size": 10
        }
      }
    }
  }

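 39. The words registered as suggestions

 40. The words registered as suggestions
  - Aggregate the words searched over the past n days from the behavior logs
  - Register the aggregated words as terms for the Completion Suggester
  - Set the weight based on the number of searches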
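A minimal sketch of these three steps, assuming a hypothetical log record shape (`query`, `date`) and a `top_k` cutoff that the talk does not mention. The output uses the `input` / `weight` document form of a completion field, with the field name `suggest` taken from the query on slide 38.

```python
from collections import Counter
from datetime import date, timedelta


def build_suggest_docs(search_logs, today, n_days, top_k=1000):
    """Count queries from the last n_days and turn them into suggest documents."""
    since = today - timedelta(days=n_days)
    counts = Counter(
        log["query"] for log in search_logs if since < log["date"] <= today
    )
    # The search count is used directly as the completion weight.
    return [
        {"suggest": {"input": word, "weight": count}}
        for word, count in counts.most_common(top_k)
    ]


# Hypothetical search logs.
logs = [
    {"query": "pierce", "date": date(2018, 10, 16)},
    {"query": "pierce", "date": date(2018, 10, 15)},
    {"query": "bag", "date": date(2018, 10, 15)},
    {"query": "pierce", "date": date(2018, 10, 1)},  # outside the 7-day window
]

docs = build_suggest_docs(logs, today=date(2018, 10, 17), n_days=7)
```

Each returned dict would be indexed as one document, so frequently searched words are offered first when a user types a matching prefix.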

 41. Synonyms and dictionary management

 42. Synonyms
  - What lets terms that differ as strings but mean the same thing, such as "AWS" and "Amazon Web Services", be treated as one
  - You can define synonyms yourself
  - Use the Synonym Token Filter
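For reference, index settings that wire a synonym token filter into an analyzer could be built like this. It is only a sketch: the filter and analyzer names are made up, and the `standard` tokenizer is an assumption (the talk does not show minne's analyzer chain, which for Japanese text would likely differ).

```python
import json


def synonym_index_settings(synonym_rules):
    """Build index settings with a synonym filter (names are hypothetical)."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "my_synonyms": {
                        "type": "synonym",
                        "synonyms": list(synonym_rules),
                    }
                },
                "analyzer": {
                    "my_synonym_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",  # assumption, see lead-in
                        # lowercase runs first, so rules are written in lowercase
                        "filter": ["lowercase", "my_synonyms"],
                    }
                },
            }
        }
    }


body = synonym_index_settings(["aws, amazon web services"])
print(json.dumps(body, indent=2))
```

The printed JSON is what would be sent as the body when creating the index.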

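 43. How we built the synonym list
  - Created it from Wikipedia
  - Looked through the top N searched words and added entries to the list
  - In the end, the list is checked by hand

 44. Challenges with the synonym list

 45. Challenges with the synonym list
  - It is currently manual, which is a lot of work
  - It is not being updated on a regular basis
  - We would like to automate it to some degree
  - Because this is a handmade marketplace, there are many words that are not in general use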

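 46. Dictionary management

 47. Dictionaries
  - Morphological analysis sometimes splits text into unexpected words
    - The word "handmade" (ハンドメイド) gets split into "hand" (ハンド) and "made" (メイド) (just an example)
  - A dictionary is what solves this kind of problem

 48. How we built the dictionary
  - Fetch the top N words searched in the past
  - Run the fetched words through Elasticsearch's Analyze API
  - A word that comes back as multiple tokens is a candidate that may be worth registering in the dictionary
  - Build the list this way, then let a person decide

 49. GET /_analyze
  {
    "analyzer": "standard",
    "text": "this is a test"
  }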
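The candidate-selection step can be sketched as below. The `analyze` callable is injected so the example runs without a cluster; here it is backed by canned responses shaped like `_analyze` output (a `tokens` list whose entries have a `token` field). The sample words and responses are made up, reusing the "handmade" example from slide 47.

```python
def tokens_from_analyze_response(response):
    """Extract the token strings from an _analyze API response body."""
    return [t["token"] for t in response["tokens"]]


def dictionary_candidates(words, analyze):
    """Return (word, tokens) for every word the analyzer split into 2+ tokens."""
    candidates = []
    for word in words:
        tokens = analyze(word)
        if len(tokens) > 1:
            candidates.append((word, tokens))
    return candidates


# Canned responses standing in for real _analyze calls (hypothetical data).
fake_responses = {
    "ハンドメイド": {"tokens": [{"token": "ハンド"}, {"token": "メイド"}]},
    "ピアス": {"tokens": [{"token": "ピアス"}]},
}


def fake_analyze(word):
    return tokens_from_analyze_response(fake_responses[word])


candidates = dictionary_candidates(["ハンドメイド", "ピアス"], fake_analyze)
```

In a real setup `fake_analyze` would be replaced by a function that calls the Analyze API with the index's analyzer; a person still reviews the resulting list.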

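 50. We want this to be a bit easier

 51. It's painful, so we automated it

 52. Automating dictionary creation: first phase
  - Aggregate the words searched on the previous day and take the top N
  - Run those words through the Analyze API
  - Notify Slack of the words that were split into multiple tokens
  - Add them to the dictionary by hand
  - The dictionary is managed in the Puppet repository
    - This makes the production dictionary available in development environments too

 53. (image only)

 54. We want this to be a bit easier still

 55. (image only)

 56. Automating dictionary creation: second phase
  - Aggregate the words searched on the previous day and take the top N
  - Run those words through Elasticsearch's Analyze API
  - If a word comes back as multiple tokens, save it to the DB
  - Readings (kana) and similar details are entered by hand from an admin screen
  - A batch job builds the dictionary from the DB and deploys it to Elasticsearch
    - The same batch job opens a Pull Request against the Puppet repository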
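The "build the dictionary from the DB" step might look like the sketch below. It assumes a Kuromoji-style user dictionary, where each line is `surface,segmentation,reading,part-of-speech`; the talk does not say which analyzer or file format minne uses, and the row fields and the part-of-speech label are illustrative.

```python
POS_TAG = "カスタム名詞"  # "custom noun"; label chosen for the example


def to_user_dictionary(rows):
    """Render DB rows as user-dictionary lines (Kuromoji-style, assumed)."""
    lines = []
    for row in rows:
        if not row.get("reading"):
            continue  # reading not entered in the admin screen yet, so skip
        # Using the word itself as the segmentation keeps it as a single token.
        lines.append(",".join([row["word"], row["word"], row["reading"], POS_TAG]))
    return "".join(line + "\n" for line in lines)


# Hypothetical rows saved by the previous steps.
rows = [
    {"word": "ハンドメイド", "reading": "ハンドメイド"},
    {"word": "がま口", "reading": None},  # still waiting for a manual reading
]

dictionary_text = to_user_dictionary(rows)
```

Skipping rows without a reading matches the split of work on the slide: detection is automatic, while readings remain a manual step.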

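 57. We can now think only about which terms to register in the dictionary

 58. We also use Elasticsearch for things other than the cases introduced today

 59. Other use cases
  - The access analytics feature provided to creators
  - Creator search
  - Learning trends by aggregating which words were searched [...] year(s) ago
    - For example, when people started looking for Christmas items [...] year(s) ago
  - etc.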
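One way to answer a question like "when did people start looking for Christmas items" from past logs is to count matching searches per ISO week and report the first week that reaches a threshold. The log fields, the threshold, and the sample dates below are assumptions for illustration only.

```python
from collections import Counter
from datetime import date


def first_week_reaching(search_logs, keyword, threshold):
    """Return the first (ISO year, ISO week) with >= threshold matching searches."""
    weekly = Counter()
    for log in search_logs:
        if keyword in log["query"]:
            iso = log["date"].isocalendar()
            weekly[(iso[0], iso[1])] += 1
    for week in sorted(weekly):
        if weekly[week] >= threshold:
            return week
    return None


# Hypothetical logs: one early search, then a cluster in a later week.
logs = [
    {"query": "christmas wreath", "date": date(2017, 10, 2)},
    {"query": "christmas card", "date": date(2017, 11, 6)},
    {"query": "christmas wreath", "date": date(2017, 11, 7)},
    {"query": "christmas gift", "date": date(2017, 11, 8)},
    {"query": "bag", "date": date(2017, 11, 8)},
]

onset = first_week_reaching(logs, "christmas", threshold=3)
```

The threshold filters out isolated early searches so that only a sustained rise counts as the start of the season.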

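 60. Summary

 61. Summary
  - Elasticsearch is already capable enough for search alone
  - Combined with logs, it becomes even more powerful
    - Aim for better search results through A/B testing
    - Autocomplete with the Completion Suggester
  - Managing synonyms and dictionaries is hard work
    - It can be automated to some degree
    - Logs prove useful here as well

 62. Thank you very much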