
Create my own search engine.

RubyKaigi 2022 day 2.

seki at druby.org

September 11, 2022

Transcript

  1. Create my own search engine. seki@ruby-lang.org. A talk about building a search system for my own use.

  2. Pokémon TCG similar deck search: I built a system that searches Pokémon TCG decks!

  3. Pokémon TCG similar deck search: https://github.com/seki/Masaki, https://hamana.herokuapp.com/

    Diagram labels: Heroku App, Web Browser, deck similarity, Heroku Scheduler, Crawler, Search Engine.
  4. Pokémon TCG similar deck search: you can browse newly tweeted decks, and the diff against similar decks is shown too!

  5. Pokémon TCG similar deck search: you can find the decks you tweeted yourself.

  6. Pokémon TCG similar deck search: you can also trace your own change history!

  7. Pokémon TCG similar deck search: find decks that use the card "セキ" (Seki).

  8. Agenda: About me, ruby and Pokémon TCG; Pokémon TCG; deck similarity; whole system.

  9. Agenda: About me, ruby and Pokémon TCG; Pokémon TCG; implementation (engine): deck vector, cos similarity; implementation (system).

  10. About me, ruby and Pokémon TCG: Masatoshi Seki, Ruby Core Committer (dRuby, Rinda, ERB), Programmer. Winning the 2010 WCS Tochigi prefectural qualifier is the only result I can brag about.

    Timeline (Ruby and Pokémon TCG):
    1996: ruby-1.0 ⭐; Pokémon Red/Blue, Pokémon TCG ⭐
    1999: ruby-1.4.0; ERB, dRuby
    2006: RubyKaigi 2006; @m_seki started Pokémon TCG ⭐
    2010: WCS Tochigi pref. winner ⭐
    2022: RubyKaigi 2022
    2023: WCS Yokohama
    Caption: りっくちゃんネル, seki (咳) interview episode.
  11. About me, ruby and Pokémon TCG: for more about my Pokémon TCG activities, see the りっくちゃんネル seki (咳) interview episode.

  12. Agenda: About me, ruby and Pokémon TCG; Pokémon TCG; deck similarity; whole system.

  13. Pokémon TCG: build a deck with 60 cards (Pokémon, Trainer, Energy).

  14. www.pokemon-card.com, card search: 15717 cards, card-id = 1..42091. The card-id is an integer up to 42091 with gaps; there are 15717 kinds of cards in total.

  15. www.pokemon-card.com, deck build: deck-code kkFFfv-WaK14L-VkFkdv. A hash value? A random number?

  16. deck internal: a deck is represented as tuples of card-id and number of copies ... I have seen something like this before.

    [[40942, 4], [41111, 2], [40616, 2], [41486, 2], [40966, 2], [38020, 2],
     [39193, 4], [41490, 1], [40992, 2], [40292, 4], [38377, 4], [38128, 1],
     [39728, 1], [38392, 2], [41340, 2], [40998, 1], [40137, 2], [40304, 2],
     [40995, 2], [41295, 1], [39652, 3], [40885, 8], [37980, 2], [38002, 4]]
  17. I studied with NLP textbooks! Bag-of-Cards? Bag-of-Words? Vectorized text! Is this the thing I have seen in natural language processing? 💡

    [[40942, 4], [41111, 2], [40616, 2], [41486, 2], [40966, 2], [38020, 2],
     [39193, 4], [41490, 1], [40992, 2], [40292, 4], [38377, 4], [38128, 1],
     [39728, 1], [38392, 2], [41340, 2], [40998, 1], [40137, 2], [40304, 2],
     [40995, 2], [41295, 1], [39652, 3], [40885, 8], [37980, 2], [38002, 4]]
  18. Agenda: About me, ruby and Pokémon TCG; Pokémon TCG; deck similarity; whole system. Note on the slide: a deck looks like a vectorized document.

  19. similar document search (NLP): word segmentation, vectorize, cosine similarity. (Opinions differ.) 📖

    {"I" => 2, "like" => 3, "ruby" => 12, ... }
    v = [0, 0, 0, 0, 4, 2, 24, 0, 0, 1, ...]
    cos = v1.dot(v2) / (v1.norm * v2.norm)
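
The cosine formula on slide 19 can be tried directly with `Vector` from Ruby's `matrix` library. This is only a minimal sketch: the two short vectors are made-up values, not data from the talk, and returning 0.0 for a zero-length vector is an assumption of the sketch.

```ruby
require 'matrix'

# Cosine similarity of two dense vectors, following the slide's formula.
# Returns 0.0 when either vector has zero length (an assumption of this sketch).
def cosine(v1, v2)
  denom = v1.norm * v2.norm
  return 0.0 if denom.zero?
  v1.dot(v2) / denom
end

a = Vector[0, 4, 2, 0, 1]   # made-up counts
b = Vector[0, 4, 0, 3, 1]

puts cosine(a, a)  # a vector compared with itself
puts cosine(a, b)  # two vectors that share some components
```

A vector compared with itself scores (up to rounding) 1, and vectors with no shared nonzero component score 0.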
  20. deck ⊆ natural language text; card ≒ word; 60 words; unordered. Only 60 words, a vocabulary of 15717, no ordering, and no word segmentation needed.

  21. Vectorization is easy: no word segmentation required. TF-IDF: TF is the number of copies of the card; IDF gives infrequently used cards a higher weight. Vectorizing a deck is easier than in NLP, and TF-IDF is used for the vector components just as in NLP.

    v = [0, 0, 0, 0, 4, 2, 24, 0, 0, 1, ...]
  22. Generate deck titles with IDF: sort_by -IDF. IDF is also used to describe what characterizes a deck (rarely used cards are shown first).
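
To make slides 21 and 22 concrete, here is a small sketch of computing IDF over a set of decks, weighting a deck with TF-IDF, and picking "title" cards with `sort_by`. The decks, card ids and helper names (`tf_idf`, `title_cards`) are hypothetical, and `log(N / df)` is just one common IDF definition; the slides do not say which formula the real system uses.

```ruby
# Hypothetical toy data: each deck is a list of [card_id, copies] tuples.
decks = {
  "deck-a" => [[1, 4], [2, 2], [3, 1]],
  "deck-b" => [[1, 4], [2, 3]],
  "deck-c" => [[1, 2], [4, 4]],
}

# Document frequency: in how many decks each card appears.
df = Hash.new(0)
decks.each_value { |deck| deck.each { |card_id, _| df[card_id] += 1 } }

# IDF = log(N / df). The exact formula is an assumption of this sketch.
n = decks.size.to_f
idf = df.transform_values { |count| Math.log(n / count) }

# TF-IDF components of one deck: copies (TF) times IDF.
def tf_idf(deck, idf)
  deck.map { |card_id, copies| [card_id, copies * idf[card_id]] }
end

# "Title" of a deck: its cards ordered rarest (highest IDF) first.
def title_cards(deck, idf, limit = 2)
  deck.map(&:first).sort_by { |card_id| -idf[card_id] }.first(limit)
end

p tf_idf(decks["deck-a"], idf)
p title_cards(decks["deck-a"], idf)
```

Card 1 appears in every toy deck, so its IDF is 0 and it never shows up in a title; the card that only deck-a uses comes first.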

  23. Normalization: some cards mean the same thing, so they need to be normalized.

  24. Normalization: some cards mean the same thing, so they need to be normalized. Examples: Foil Card, Full Art Card.

  25. Identify the card (Pokémon, Trainer, Energy): identify by attribute; identify by name.

  26. card-id normalize dictionary: after normalization there are 9275 kinds of cards.

    Flow: download card page (HTML); scraping and sort; intermediate data (source): data/uniq_pokemon.txt, data/uniq_energy_trainer_all.txt; in-memory: @id_norm (card-id to normalized-card-id) and @name (normalized-card-id to name).
    Intermediate data:
    [[22032, "SPエネルギー"], [22064, 22032], [22261, 22032], [23027, 22032], [23054, 22032], [23253, 22032], [23268, 22032],
     [39130, "いちげきエネルギー"], [39242, 39130], [39556, 39130], [40329, 39130], [40879, 39130],
     [39200, "れんげきエネルギー"], [39245, 39200], [39580, 39200], [40333, 39200], [40483, 39200],
    @name:
    {22032=>"SPエネルギー", 39130=>"いちげきエネルギー", 39200=>"れんげきエネルギー", ....
    @id_norm:
    {22032=>22032, 22064=>22032, 22261=>22032, 23027=>22032, 23054=>22032, 23253=>22032, 23268=>22032,
     39130=>39130, 39242=>39130, 39556=>39130, 40329=>39130, 40879=>39130,
     39200=>39200, 39245=>39200, 39580=>39200, 40333=>39200, 40483=>39200, 40881=>39200, ...
  27. card-id normalize dictionary: the intermediate data uses a format whose diffs are easy to read.

    Flow: download card page (HTML); scraping and sort; intermediate data (source): data/uniq_pokemon.txt, data/uniq_energy_trainer_all.txt (same intermediate data as on slide 26).
    git diff:
    @@ -1,4 +1,6 @@
    -[[39130, "いちげきエネルギー"],
    +[[42029, "Vガードエネルギー"],
    + [42075, 42029],
    + [39130, "いちげきエネルギー"],
      [39242, 39130],
      [39556, 39130],
      [40329, 39130],
    @@ -36,6 +38,8 @@
      [40323, 37978],
      [41002, "ダブルターボエネルギー"],
      [41319, 41002],
    + [42028, 41002],
    + [42074, 41002],
      [37867, "ツインエネルギー"],
      [38399, 37867],
      [38482, 37867],
    @@ -557,6 +561,7 @@
      [35704, 34759],
      [36967, 34759],
      [41900, 34759],
    + [42089, 34759],
      [38476, "おとなのおねえさん"],
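
The step from the intermediate list to the two in-memory hashes can be sketched as follows. The pairs are a few entries copied from slide 26; how the real system parses `data/uniq_*.txt` is not shown in the slides, so the loop below is an assumption, and `name` and `id_norm` are local stand-ins for `@name` and `@id_norm`.

```ruby
# A few entries from slide 26: [card_id, "name"] introduces a normalized
# card, [card_id, normalized_id] maps a variant printing onto it.
pairs = [
  [22032, "SPエネルギー"], [22064, 22032], [22261, 22032],
  [39130, "いちげきエネルギー"], [39242, 39130],
]

name = {}     # normalized-card-id => name
id_norm = {}  # card-id => normalized-card-id

pairs.each do |card_id, value|
  if value.is_a?(String)
    name[card_id] = value
    id_norm[card_id] = card_id   # a representative card maps to itself
  else
    id_norm[card_id] = value
  end
end

p name
p id_norm
```

Keeping one pair per line in the source file is what makes the `git diff` on slide 27 readable: a new printing of an existing card is a single added line.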
  28. Deck vector: a 9275-dimensional vector.

    Vector[0, 0, 0, 0, 8, 0, 0, ..., 0, 4, 0, ..., 0, 2, 0, ...] (almost every component is 0; the listing on the slide is cut off)
  29. Deck vector implementation: a sorted array of tuples (normalized-card-id, TF). @idf: normalized-card-id → IDF. @norm: deck-code → norm.

    [[4, 9], [1114, 2], [2111, 2], [25254, 2], [27549, 4], [36161, 2], [36356, 1], [37497, 3], [37515, 3], [37628, 3], [37632, 4], [37976, 2], [38131, 3], [39191, 1], [39283, 3], [39285, 3], [39334, 1], ...
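
A sparse vector in the form described on slide 29 can be built from a raw deck roughly like this. It is a sketch under assumptions: `ID_NORM` is a made-up normalization map, `deck_vector` is a hypothetical helper name, and unknown card ids are simply kept as they are.

```ruby
# Hypothetical normalization map (card-id => normalized-card-id).
ID_NORM = { 100 => 100, 101 => 100, 200 => 200, 300 => 300, 301 => 300 }.freeze

# Turn a deck ([card_id, copies] tuples) into a sparse vector:
# an array of [normalized_card_id, TF] sorted by id, with variants merged.
def deck_vector(deck, id_norm = ID_NORM)
  tf = Hash.new(0)
  deck.each { |card_id, copies| tf[id_norm.fetch(card_id, card_id)] += copies }
  tf.sort
end

p deck_vector([[301, 2], [100, 1], [200, 4], [101, 3]])
# => [[100, 4], [200, 4], [300, 2]]
```

Because a deck has at most 60 cards, the sparse form holds a few dozen tuples instead of 9275 mostly zero components, and keeping it sorted by id is what allows the two-cursor inner product on the following slides.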
  30. inner_product implementation: intersection of two sets. Move the smaller cursor; when matched, move both cursors. This is the processing often seen in the merge step of merge sort. The cursors (👈) start at the first element of each vector.

    First vector:
    [[4, 8], [2111, 2], [27549, 4], [37497, 2], [37501, 2], [37667, 4], [37976, 2], [37978, 4], [37980, 2], [38128, 1], [38230, 4], [38232, 2], [39652, 3], [39728, 1], [39762, 2], [39763, 2], [40942, 4], ...
    Second vector:
    [[4, 9], [1114, 2], [2111, 2], [25254, 2], [27549, 4], [36161, 2], [36356, 1], [37497, 3], [37515, 3], [37628, 3], [37632, 4], [37976, 2], [38131, 3], [39191, 1], [39283, 3], [39285, 3], [39334, 1], ...
    Slides 31 to 46 show the same two vectors and the same caption (move the smaller cursor; matched, move both cursors); only the cursor positions change.
  31. inner_product implementation: cursors at [4, 8] and [4, 9] (matched).
  32. inner_product implementation: cursors at [2111, 2] and [1114, 2].
  33. inner_product implementation: cursors at [2111, 2] and [2111, 2] (matched).
  34. inner_product implementation: cursors at [27549, 4] and [25254, 2].
  35. inner_product implementation: cursors at [27549, 4] and [27549, 4] (matched).
  36. inner_product implementation: cursors at [37497, 2] and [36161, 2].
  37. inner_product implementation: cursors at [37497, 2] and [36356, 1].
  38. inner_product implementation: cursors at [37497, 2] and [37497, 3] (matched).
  39. inner_product implementation: cursors at [37501, 2] and [37515, 3].
  40. inner_product implementation: cursors at [37667, 4] and [37515, 3].
  41. inner_product implementation: cursors at [37667, 4] and [37628, 3].
  42. inner_product implementation: cursors at [37667, 4] and [37632, 4].
  43. inner_product implementation: cursors at [37667, 4] and [37976, 2].
  44. inner_product implementation: cursors at [37976, 2] and [37976, 2] (matched).
  45. inner_product implementation: cursors at [37978, 4] and [38131, 3].
  46. inner_product implementation: cursors at [37978, 4] and [39283, 3]. End of the walk-through.
  47. inner_product implementation: intersection of two sets. I explained it in my book. (Come to think of it, I have explained this before.)

    [Figure from the book: a table with columns Word and Line number, listing the words def and initialize with line numbers 3 7 8 13 16 and 2 3 12 13. Panels: Start; fwd(['initialize', fname, 3]); Forward Both; fwd(['def', fname, 12]); fwd(['initialize', fname, 13]).]
  48. inner_product implementation: TF-IDF. Slide annotations: TF, Integer. Only the square of idf is ever used, so I should have memoized that instead.

    idf = @idf[a[ia][0]]
    s += (a[ia][1] * b[ib][1] * idf * idf)
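
Putting the cursor walk of slides 30 to 46 together with the weighting line of slide 48 gives roughly the following. This is a sketch, not the author's code: the loop structure is reconstructed from the slide captions, the vectors are the first few entries shown on the slides, and the IDF table is a made-up one that returns 1.0 for every card.

```ruby
# Inner product of two sparse vectors (arrays of [id, tf] sorted by id).
# idf maps id => IDF; both components carry the same IDF, hence idf * idf.
def dot(a, b, idf)
  s = 0.0
  ia = ib = 0
  while ia < a.size && ib < b.size
    case a[ia][0] <=> b[ib][0]
    when -1 then ia += 1          # a's id is smaller: move a's cursor
    when 1  then ib += 1          # b's id is smaller: move b's cursor
    else                          # matched: accumulate, move both cursors
      w = idf[a[ia][0]]
      s += a[ia][1] * b[ib][1] * w * w
      ia += 1
      ib += 1
    end
  end
  s
end

# Leading entries of the two example vectors; IDF values are made up.
a = [[4, 8], [2111, 2], [27549, 4], [37497, 2]]
b = [[4, 9], [1114, 2], [2111, 2], [25254, 2], [27549, 4]]
idf = Hash.new(1.0)

p dot(a, b, idf)  # => 92.0 (8*9 + 2*2 + 4*4)
```

Each cursor only moves forward, so the cost is proportional to the combined length of the two sparse vectors rather than to the 9275 dimensions.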
  49. cos implementation: @norm maps deck-code → norm-of-vector; unit vectors are not used. (In image processing and similar fields, vectors are often converted to unit vectors in advance.)

    def cos(a, b)
      left = @deck[a]
      right = @deck[b]
      dot(left, right) / (@norm[a] * @norm[b])
    end
  50. Basic energy card problem: a deck may contain more than 4 copies of a basic energy card, and that affects similarity.

    s += (a[ia][1].clamp(..5) * b[ib][1].clamp(..5) * idf * idf)
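
The effect of capping the copy count can be seen on a toy example. The sketch below uses small dense arrays and leaves IDF out for brevity; the decks are made up, and applying the cap to the norm as well as to the inner product is an assumption of the sketch (the slide only shows the cap inside the inner product).

```ruby
# Cosine similarity of two dense count vectors, optionally capping
# each count at `cap` copies. Index 0 plays the role of a basic energy card.
def cos(a, b, cap: nil)
  a = a.map { |tf| tf.clamp(..cap) } if cap
  b = b.map { |tf| tf.clamp(..cap) } if cap
  dot = a.zip(b).sum { |x, y| x * y }
  norm = ->(v) { Math.sqrt(v.sum { |x| x * x }) }
  dot / (norm.(a) * norm.(b))
end

deck1 = [12, 4, 4, 0, 0]   # 12 basic energy plus two 4-of cards
deck2 = [12, 0, 0, 4, 4]   # same energy, completely different cards

puts cos(deck1, deck2)           # the shared energy dominates
puts cos(deck1, deck2, cap: 5)   # energy capped at 5 copies
```

Without the cap, two decks that share nothing but their energy still look quite similar; with it, the similarity drops noticeably.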
  51. Deck similarity: we can calculate deck similarity.

  52. Search: max(n) by deck similarity. _ko1 taught me about max(n), which returns the top n items! Ruby really has everything.

    def search(v, n=5)
      norm = vec_to_norm(v)
      return [] if norm == 0
      @deck.map do |b, deck_b|
        cos = dot(v, deck_b) / (norm * @norm[b])
        [cos, b]
      end.max(n)
    end
  53. Search by deck, card-id. Search by deck code: search(@deck[code]). Search by card-id: search([[card_id, 1]]).

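
A self-contained version of the search on slides 52 and 53 might look like this. `DeckIndex`, `search_by_deck` and `search_by_card` are hypothetical names, the deck codes and vectors are made up, IDF is omitted, and a hash lookup stands in for the two-cursor inner product to keep the sketch short.

```ruby
# Minimal in-memory index over sparse deck vectors ([id, tf] tuples).
class DeckIndex
  def initialize(decks)
    @deck = decks                                        # deck-code => vector
    @norm = decks.transform_values { |v| vec_to_norm(v) } # deck-code => norm
  end

  def vec_to_norm(v)
    Math.sqrt(v.sum { |_, tf| tf * tf })
  end

  # Hash-based inner product, used here instead of the cursor walk.
  def dot(a, b)
    h = b.to_h
    a.sum { |id, tf| tf * h.fetch(id, 0) }
  end

  # Top n [cos, deck-code] pairs, best first, via Enumerable#max(n).
  def search(v, n = 5)
    norm = vec_to_norm(v)
    return [] if norm == 0
    @deck.map { |code, deck_b| [dot(v, deck_b) / (norm * @norm[code]), code] }.max(n)
  end

  def search_by_deck(code, n = 5)
    search(@deck[code], n)
  end

  def search_by_card(card_id, n = 5)
    search([[card_id, 1]], n)
  end
end

index = DeckIndex.new(
  "aaa" => [[1, 4], [2, 2]],
  "bbb" => [[1, 4], [3, 2]],
  "ccc" => [[4, 4]],
)

p index.search_by_deck("aaa", 2).map(&:last)  # => ["aaa", "bbb"]
p index.search_by_card(4, 1).map(&:last)      # => ["ccc"]
```

Searching by card works because a single card is just a deck vector with one component, so the same `search` serves both cases.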
  54. Agenda: About me, ruby and Pokémon TCG; Pokémon TCG; deck similarity; whole system.

  55. R.I.P. Heroku Free Dyno. Dyno: VM 512MB/1core (Free), shut down at least once every 24 hours. Heroku Postgres: Hobby Basic. Heroku Scheduler.

    I use Heroku a lot in the products of @awazeki (情報デザイン)! The No. 1 Heroku user in Yaita City, Tochigi Prefecture.
  56. R.I.P. Heroku Free Dyno (same content as slide 55). Don't measure! Feel.

  57. R.I.P. Heroku Free Dyno (same content as slide 55).

  58. R.I.P. Heroku Free Dyno (same content as slide 55). Normally this is a situation where I would use dRuby.

  59. System overview.

    Diagram labels: Heroku App, Web Browser, deck similarity, Heroku Scheduler, Crawler, Search Engine.
  60. Data.

    Diagram labels: card normalize map (data/uniq_*.txt), new deck, metadata, known deck, S3, Heroku Scheduler, Heroku PG, My MacBook, Heroku App, deck similarity, Search, Initialize, Crawler, Web UI, @deck, @idf, @norm, @name, @id_norm, Make Vector, Update Deck, Build page.
  61. to 68. Data (same diagram labels as slide 60).

  69. if ...: if I were not using Heroku...

    Diagram labels: card normalize map (data/uniq_*.txt), new deck, metadata, known deck, S3, Heroku Scheduler, Heroku PG, My MacBook, Heroku App, deck similarity, Search, Initialize, Crawler, Web UI, @deck, @idf, @norm, @name, @id_norm, Make Vector, Update Deck, Build page.
  70. without Heroku: if I were not using Heroku, everything except backups would be in memory.

    Diagram labels: card normalize map (data/uniq_*.txt), known deck, S3, cron, My MacBook, App, deck similarity, Search, Crawler, Web UI, @deck, @idf, @norm, @name, @id_norm, Make Vector, Build page, @meta, Update Deck, dRuby.
  71. Create my own search engine. I gave a talk about building a search engine for myself!

    Timeline (Ruby and Pokémon TCG):
    1996: ruby-1.0; Pokémon Red/Blue, Pokémon TCG
    1999: ruby-1.4.0; ERB, dRuby
    2006: RubyKaigi 2006; @m_seki started Pokémon TCG
    2010: WCS Tochigi pref. winner
    2022: RubyKaigi 2022
    2023: WCS Yokohama
  72. Create your own search engine. The end.

    Timeline (Ruby and Pokémon TCG):
    1996: ruby-1.0; Pokémon Red/Blue, Pokémon TCG
    1999: ruby-1.4.0; ERB, dRuby
    2006: RubyKaigi 2006; @m_seki started Pokémon TCG
    2010: WCS Tochigi pref. winner
    2022: RubyKaigi 2022
    2023: Create your own search engine; WCS Yokohama