Slide 1

Slide 1 text

Create my own search engine. A talk about building a search system for my own use.

Slide 2

Slide 2 text

Pokémon TCG similar deck search. I built a system that searches Pokémon TCG decks!

Slide 3

Slide 3 text

Pokémon TCG similar deck search https://github.com/seki/Masaki https://hamana.herokuapp.com/ Heroku App Web Browser deck similarity Heroku Scheduler Crawler Search Engine

Slide 4

Slide 4 text

Pokémon TCG similar deck search. You can browse newly tweeted decks! You can also see the diff against similar decks!

Slide 5

Slide 5 text

Pokémon TCG similar deck search. You can find the decks you tweeted yourself.

Slide 6

Slide 6 text

Pokémon TCG similar deck search. You can also look through your own change history!

Slide 7

Slide 7 text

Pokémon TCG similar deck search. Search for decks that use the card 「セキ」.

Slide 8

Slide 8 text

Agenda About me, ruby and Pokémon TCG Pokémon TCG deck similarity whole system 8

Slide 9

Slide 9 text

Agenda About me, ruby and Pokémon TCG Pokémon TCG implementation (engine) - deck vector, cos similarity implementation (system) 9

Slide 10

Slide 10 text

About me, ruby and Pokémon TCG
Masatoshi Seki, Ruby Core Committer (dRuby, Rinda, ERB), Programmer
Winning the 2010 WCS Tochigi prefectural qualifier is the only result I can boast of.
Timeline (Ruby, Pokémon TCG):
1996 ruby-1.0⭐ Pokémon Red/Blue, Pokémon TCG⭐
1999 ruby-1.4.0 ERB, dRuby
2006 RubyKaigi 2006 @m_seki started Pokémon TCG⭐
2010 WCS Tochigi pref. winner⭐
2022 RubyKaigi 2022
2023 WCS Yokohama
りっくちゃんネル, interview episode

Slide 11

Slide 11 text

About me, ruby and Pokémon TCG. For more details about my Pokémon TCG activities, see here: りっくちゃんネル, interview episode

Slide 12

Slide 12 text

Agenda About me, ruby and Pokémon TCG Pokémon TCG deck similarity whole system 12

Slide 13

Slide 13 text

Pokémon Trainer Energy Pokémon TCG Build a deck with 60 cards

Slide 14

Slide 14 text

www.pokemon-card.com - card search: 15717 cards, card-id = 1..42091. The card-id is an integer up to 42091 with gaps in the sequence; there are 15717 kinds of cards in total.

Slide 15

Slide 15 text

www.pokemon-card.com - deck build. deck-code: kkFFfv-WaK14L-VkFkdv. A hash value? A random number?

Slide 16

Slide 16 text

deck internal: represented as tuples of card-id and number of copies ... I have seen something like this before.
[[40942, 4], [41111, 2], [40616, 2], [41486, 2], [40966, 2], [38020, 2], [39193, 4], [41490, 1], [40992, 2], [40292, 4], [38377, 4], [38128, 1], [39728, 1], [38392, 2], [41340, 2], [40998, 1], [40137, 2], [40304, 2], [40995, 2], [41295, 1], [39652, 3], [40885, 8], [37980, 2], [38002, 4]]

Slide 17

Slide 17 text

I studied with NLP textbooks! Bag-of-Cards? Bag-of-Words? Vectorized text! Is this the thing I have seen in natural language processing? 💡
[[40942, 4], [41111, 2], [40616, 2], [41486, 2], [40966, 2], [38020, 2], [39193, 4], [41490, 1], [40992, 2], [40292, 4], [38377, 4], [38128, 1], [39728, 1], [38392, 2], [41340, 2], [40998, 1], [40137, 2], [40304, 2], [40995, 2], [41295, 1], [39652, 3], [40885, 8], [37980, 2], [38002, 4]]

Slide 18

Slide 18 text

Agenda About me, ruby and Pokémon TCG Pokémon TCG deck similarity whole system. A deck looks like a vectorized document.

Slide 19

Slide 19 text

similar document search (NLP): word segmentation, vectorize, cosine similarity (opinions differ 📖)
{"I" => 2, "like" => 3, "ruby" => 12, ... }
v = [0, 0, 0, 0, 4, 2, 24, 0, 0, 1, ...]
cos = v1.dot(v2) / (v1.norm * v2.norm)
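The three steps on this slide can be tried out with Ruby's `matrix` library. This is a minimal sketch, not the talk's code: the two word-count hashes and the helper names `vectorize` and `cosine` are made up for illustration.

```ruby
require 'matrix'

# Hypothetical word counts for two tiny documents (not from the talk).
doc1 = { "I" => 2, "like" => 3, "ruby" => 12 }
doc2 = { "I" => 1, "like" => 1, "pokemon" => 5 }

# Fix one vocabulary order so both vectors share the same dimensions.
vocab = (doc1.keys | doc2.keys).sort

# Turn a word => count hash into a dense Vector over the vocabulary.
def vectorize(counts, vocab)
  Vector.elements(vocab.map { |w| counts.fetch(w, 0) })
end

# Cosine similarity, written exactly like the formula on the slide.
def cosine(v1, v2)
  v1.dot(v2) / (v1.norm * v2.norm)
end

v1 = vectorize(doc1, vocab)
v2 = vectorize(doc2, vocab)
puts cosine(v1, v2)
```

A document compared with itself scores about 1.0, and documents with no shared words score 0.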

Slide 20

Slide 20 text

deck ⊆ natural language text, card ≒ word, 60 words, unordered. Only 60 words, a vocabulary of 15717, no ordering, and no word segmentation required.

Slide 21

Slide 21 text

Vectorization is easy: no word segmentation required. TF-IDF: TF is the number of copies of the card; IDF gives infrequently used cards a higher weight. Vectorizing a deck is easier than in NLP, and TF-IDF is used for the vector components in the same way.
v = [0, 0, 0, 0, 4, 2, 24, 0, 0, 1, ...]
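As a rough illustration of how the TF-IDF components could be computed, here is a small sketch. The slides do not state the exact IDF formula, so `log(N / df) + 1` is an assumption, and the deck codes, card IDs, and the `tf_idf` helper are hypothetical.

```ruby
# Three hypothetical decks as [card_id, copies] tuples (IDs are made up).
decks = {
  "deck-a" => [[1, 4], [2, 2], [3, 1]],
  "deck-b" => [[1, 4], [3, 2]],
  "deck-c" => [[1, 3], [4, 4]],
}

# Document frequency: in how many decks does each card appear?
df = Hash.new(0)
decks.each_value { |deck| deck.each { |card_id, _| df[card_id] += 1 } }

# Assumed IDF formula: log(N / df) + 1, so a card in every deck still counts.
n = decks.size
idf = df.transform_values { |count| Math.log(n.to_f / count) + 1 }

# TF-IDF component for each card in a deck: copies (TF) times IDF.
def tf_idf(deck, idf)
  deck.map { |card_id, copies| [card_id, copies * idf[card_id]] }
end

p tf_idf(decks["deck-c"], idf)
```

Card 1 appears in every deck and gets the lowest weight, while card 4 appears in only one deck and gets the highest.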

Slide 22

Slide 22 text

Generate deck titles with IDF: sort_by -IDF. IDF is also used to describe what characterizes a deck (rarely used cards are shown first).
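A sketch of the `sort_by -IDF` idea follows. The IDF values, the English card names, the `limit` parameter, and the `deck_title` helper are all hypothetical; only the descending-IDF ordering comes from the slide.

```ruby
# Hypothetical IDF values and card names (not real data).
idf  = { 1 => 1.0, 2 => 2.1, 3 => 1.4, 4 => 2.1 }
name = { 1 => "Basic Energy", 2 => "Rare Attacker",
         3 => "Common Supporter", 4 => "Tech Card" }

# Build a short title from the cards with the highest IDF in the deck.
def deck_title(deck, idf, name, limit = 2)
  deck.map { |card_id, _copies| card_id }
      .sort_by { |card_id| [-idf[card_id], card_id] } # highest IDF first, ID breaks ties
      .first(limit)
      .map { |card_id| name[card_id] }
      .join(" / ")
end

deck = [[1, 8], [2, 2], [3, 4], [4, 1]]
puts deck_title(deck, idf, name)  # prints "Rare Attacker / Tech Card"
```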

Slide 23

Slide 23 text

Normalization: some cards mean the same thing, so they need to be normalized.

Slide 24

Slide 24 text

Normalization: some cards mean the same thing, so they need to be normalized. Foil Card, Full Art Card

Slide 25

Slide 25 text

Pokémon Trainer Energy identify the card identify by attribute identify by name

Slide 26

Slide 26 text

card-id normalize dictionary. After normalization there are 9275 kinds of cards.
Source: download card page (HTML), scraping and sort
Intermediate data (source): data/uniq_pokemon.txt, data/uniq_energy_trainer_all.txt
[[22032, "SPエネルギー"], [22064, 22032], [22261, 22032], [23027, 22032], [23054, 22032], [23253, 22032], [23268, 22032], [39130, "いちげきエネルギー"], [39242, 39130], [39556, 39130], [40329, 39130], [40879, 39130], [39200, "れんげきエネルギー"], [39245, 39200], [39580, 39200], [40333, 39200], [40483, 39200],
in-memory:
@name (normalized-card-id to name): {22032=>"SPエネルギー", 39130=>"いちげきエネルギー", 39200=>"れんげきエネルギー", ....
@id_norm (card-id to normalized-card-id): {22032=>22032, 22064=>22032, 22261=>22032, 23027=>22032, 23054=>22032, 23253=>22032, 23268=>22032, 39130=>39130, 39242=>39130, 39556=>39130, 40329=>39130, 40879=>39130, 39200=>39200, 39245=>39200, 39580=>39200, 40333=>39200, 40483=>39200, 40881=>39200, ...
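One way the two in-memory hashes could be derived from the intermediate list is sketched below. It assumes the rule suggested by the slide data: a row whose second element is a string starts a group and names it, and a row whose second element is an integer points a reprint at that group. The local names `rows`, `id_norm`, and `name` are illustrative, and how the real files are parsed is not shown in the slides.

```ruby
# A few rows copied from the slide: [id, "name"] starts a group,
# [id, canonical_id] marks another printing of that group.
rows = [
  [22032, "SPエネルギー"], [22064, 22032], [22261, 22032],
  [39130, "いちげきエネルギー"], [39242, 39130],
]

id_norm = {}  # card-id => normalized-card-id
name    = {}  # normalized-card-id => name

rows.each do |id, value|
  if value.is_a?(String)
    name[id] = value
    id_norm[id] = id   # a canonical card maps to itself
  else
    id_norm[id] = value
  end
end

p id_norm[22064]  # prints 22032
```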

Slide 27

Slide 27 text

card-id normalize dictionary. The format was chosen so that diffs are easy to read.
Source: download card page (HTML), scraping and sort
Intermediate data (source): data/uniq_pokemon.txt, data/uniq_energy_trainer_all.txt
[[22032, "SPエネルギー"], [22064, 22032], [22261, 22032], [23027, 22032], [23054, 22032], [23253, 22032], [23268, 22032], [39130, "いちげきエネルギー"], [39242, 39130], [39556, 39130], [40329, 39130], [40879, 39130], [39200, "れんげきエネルギー"], [39245, 39200], [39580, 39200], [40333, 39200], [40483, 39200],
git diff:
@@ -1,4 +1,6 @@
-[[39130, "いちげきエネルギー"],
+[[42029, "Vガードエネルギー"],
+ [42075, 42029],
+ [39130, "いちげきエネルギー"],
  [39242, 39130],
  [39556, 39130],
  [40329, 39130],
@@ -36,6 +38,8 @@
  [40323, 37978],
  [41002, "ダブルターボエネルギー"],
  [41319, 41002],
+ [42028, 41002],
+ [42074, 41002],
  [37867, "ツインエネルギー"],
  [38399, 37867],
  [38482, 37867],
@@ -557,6 +561,7 @@
  [35704, 34759],
  [36967, 34759],
  [41900, 34759],
+ [42089, 34759],
  [38476, "おとなのおねえさん"],

Slide 28

Slide 28 text

Deck vector: a 9275-dimensional vector, almost entirely zeros.
Vector[0, 0, 0, 0, 8, 0, 0, 0, ..., 0, 4, 0, ..., 0, 2, 0, ...]

Slide 29

Slide 29 text

Deck vector implementation: a sorted array of tuples (normalized-card-id, TF).
@idf : normalized-card-id → IDF
@norm : deck-code → norm
[[4, 9], [1114, 2], [2111, 2], [25254, 2], [27549, 4], [36161, 2], [36356, 1], [37497, 3], [37515, 3], [37628, 3], [37632, 4], [37976, 2], [38131, 3], [39191, 1], [39283, 3], [39285, 3], [39334, 1],
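The sparse representation could be built from a raw deck roughly as follows. This is a sketch under assumptions: the `id_norm` contents are invented, and the `deck_vector` helper and its rule of summing the copies of cards that normalize to the same ID are guesses at the intended behavior.

```ruby
# Hypothetical normalization map (card-id => normalized-card-id).
id_norm = { 101 => 100, 100 => 100, 200 => 200, 300 => 300 }

# Convert a raw deck ([card-id, copies] tuples) into a sparse vector:
# [normalized-card-id, copies] tuples sorted by ID, with reprints merged.
def deck_vector(raw_deck, id_norm)
  counts = Hash.new(0)
  raw_deck.each do |card_id, copies|
    counts[id_norm.fetch(card_id, card_id)] += copies
  end
  counts.sort  # yields [[key, value], ...] ordered by key
end

raw = [[300, 1], [101, 2], [100, 2], [200, 4]]
p deck_vector(raw, id_norm)  # prints [[100, 4], [200, 4], [300, 1]]
```

Keeping the tuples sorted by ID is what lets the inner product walk both vectors with two cursors.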

Slide 30

Slide 30 text

inner_product implementation: intersection of two sets. Move the smaller cursor; when matched, move both cursors. This is the kind of processing often seen in the merge step of merge sort.
[[4, 8],👈 [2111, 2], [27549, 4], [37497, 2], [37501, 2], [37667, 4], [37976, 2], [37978, 4], [37980, 2], [38128, 1], [38230, 4], [38232, 2], [39652, 3], [39728, 1], [39762, 2], [39763, 2], [40942, 4],
[[4, 9],👈 [1114, 2], [2111, 2], [25254, 2], [27549, 4], [36161, 2], [36356, 1], [37497, 3], [37515, 3], [37628, 3], [37632, 4], [37976, 2], [38131, 3], [39191, 1], [39283, 3], [39285, 3], [39334, 1],

Slide 31

Slide 31 text

inner_product implementation move the smaller cursor matched, move both cursor 31 [[4, 8],👈 [2111, 2], [27549, 4], [37497, 2], [37501, 2], [37667, 4], [37976, 2], [37978, 4], [37980, 2], [38128, 1], [38230, 4], [38232, 2], [39652, 3], [39728, 1], [39762, 2], [39763, 2], [40942, 4], [[4, 9],👈 [1114, 2], [2111, 2], [25254, 2], [27549, 4], [36161, 2], [36356, 1], [37497, 3], [37515, 3], [37628, 3], [37632, 4], [37976, 2], [38131, 3], [39191, 1], [39283, 3], [39285, 3], [39334, 1],

Slide 32

Slide 32 text

inner_product implementation move the smaller cursor matched, move both cursor 32 [[4, 8], [2111, 2],👈 [27549, 4], [37497, 2], [37501, 2], [37667, 4], [37976, 2], [37978, 4], [37980, 2], [38128, 1], [38230, 4], [38232, 2], [39652, 3], [39728, 1], [39762, 2], [39763, 2], [40942, 4], [[4, 9], [1114, 2],👈 [2111, 2], [25254, 2], [27549, 4], [36161, 2], [36356, 1], [37497, 3], [37515, 3], [37628, 3], [37632, 4], [37976, 2], [38131, 3], [39191, 1], [39283, 3], [39285, 3], [39334, 1],

Slide 33

Slide 33 text

inner_product implementation move the smaller cursor matched, move both cursor 33 [[4, 8], [2111, 2],👈 [27549, 4], [37497, 2], [37501, 2], [37667, 4], [37976, 2], [37978, 4], [37980, 2], [38128, 1], [38230, 4], [38232, 2], [39652, 3], [39728, 1], [39762, 2], [39763, 2], [40942, 4], [[4, 9], [1114, 2], [2111, 2],👈 [25254, 2], [27549, 4], [36161, 2], [36356, 1], [37497, 3], [37515, 3], [37628, 3], [37632, 4], [37976, 2], [38131, 3], [39191, 1], [39283, 3], [39285, 3], [39334, 1],

Slide 34

Slide 34 text

inner_product implementation move the smaller cursor matched, move both cursor 34 [[4, 8], [2111, 2], [27549, 4]👈 [37497, 2], [37501, 2], [37667, 4], [37976, 2], [37978, 4], [37980, 2], [38128, 1], [38230, 4], [38232, 2], [39652, 3], [39728, 1], [39762, 2], [39763, 2], [40942, 4], [[4, 9], [1114, 2], [2111, 2], [25254, 2]👈 [27549, 4], [36161, 2], [36356, 1], [37497, 3], [37515, 3], [37628, 3], [37632, 4], [37976, 2], [38131, 3], [39191, 1], [39283, 3], [39285, 3], [39334, 1],

Slide 35

Slide 35 text

inner_product implementation move the smaller cursor matched, move both cursor 35 [[4, 8], [2111, 2], [27549, 4]👈 [37497, 2], [37501, 2], [37667, 4], [37976, 2], [37978, 4], [37980, 2], [38128, 1], [38230, 4], [38232, 2], [39652, 3], [39728, 1], [39762, 2], [39763, 2], [40942, 4], [[4, 9], [1114, 2], [2111, 2], [25254, 2], [27549, 4]👈 [36161, 2], [36356, 1], [37497, 3], [37515, 3], [37628, 3], [37632, 4], [37976, 2], [38131, 3], [39191, 1], [39283, 3], [39285, 3], [39334, 1],

Slide 36

Slide 36 text

inner_product implementation move the smaller cursor matched, move both cursor 36 [[4, 8], [2111, 2], [27549, 4], [37497, 2]👈 [37501, 2], [37667, 4], [37976, 2], [37978, 4], [37980, 2], [38128, 1], [38230, 4], [38232, 2], [39652, 3], [39728, 1], [39762, 2], [39763, 2], [40942, 4], [[4, 9], [1114, 2], [2111, 2], [25254, 2], [27549, 4], [36161, 2]👈 [36356, 1], [37497, 3], [37515, 3], [37628, 3], [37632, 4], [37976, 2], [38131, 3], [39191, 1], [39283, 3], [39285, 3], [39334, 1],

Slide 37

Slide 37 text

inner_product implementation move the smaller cursor matched, move both cursor 37 [[4, 8], [2111, 2], [27549, 4], [37497, 2]👈 [37501, 2], [37667, 4], [37976, 2], [37978, 4], [37980, 2], [38128, 1], [38230, 4], [38232, 2], [39652, 3], [39728, 1], [39762, 2], [39763, 2], [40942, 4], [[4, 9], [1114, 2], [2111, 2], [25254, 2], [27549, 4], [36161, 2], [36356, 1]👈 [37497, 3], [37515, 3], [37628, 3], [37632, 4], [37976, 2], [38131, 3], [39191, 1], [39283, 3], [39285, 3], [39334, 1],

Slide 38

Slide 38 text

inner_product implementation move the smaller cursor matched, move both cursor 38 [[4, 8], [2111, 2], [27549, 4], [37497, 2]👈 [37501, 2], [37667, 4], [37976, 2], [37978, 4], [37980, 2], [38128, 1], [38230, 4], [38232, 2], [39652, 3], [39728, 1], [39762, 2], [39763, 2], [40942, 4], [[4, 9], [1114, 2], [2111, 2], [25254, 2], [27549, 4], [36161, 2], [36356, 1], [37497, 3]👈 [37515, 3], [37628, 3], [37632, 4], [37976, 2], [38131, 3], [39191, 1], [39283, 3], [39285, 3], [39334, 1],

Slide 39

Slide 39 text

inner_product implementation move the smaller cursor matched, move both cursor 39 [[4, 8], [2111, 2], [27549, 4], [37497, 2], [37501, 2]👈 [37667, 4], [37976, 2], [37978, 4], [37980, 2], [38128, 1], [38230, 4], [38232, 2], [39652, 3], [39728, 1], [39762, 2], [39763, 2], [40942, 4], [[4, 9], [1114, 2], [2111, 2], [25254, 2], [27549, 4], [36161, 2], [36356, 1], [37497, 3], [37515, 3]👈 [37628, 3], [37632, 4], [37976, 2], [38131, 3], [39191, 1], [39283, 3], [39285, 3], [39334, 1],

Slide 40

Slide 40 text

inner_product implementation move the smaller cursor matched, move both cursor 40 [[4, 8], [2111, 2], [27549, 4], [37497, 2], [37501, 2], [37667, 4]👈 [37976, 2], [37978, 4], [37980, 2], [38128, 1], [38230, 4], [38232, 2], [39652, 3], [39728, 1], [39762, 2], [39763, 2], [40942, 4], [[4, 9], [1114, 2], [2111, 2], [25254, 2], [27549, 4], [36161, 2], [36356, 1], [37497, 3], [37515, 3]👈 [37628, 3], [37632, 4], [37976, 2], [38131, 3], [39191, 1], [39283, 3], [39285, 3], [39334, 1],

Slide 41

Slide 41 text

inner_product implementation move the smaller cursor matched, move both cursor 41 [[4, 8], [2111, 2], [27549, 4], [37497, 2], [37501, 2], [37667, 4]👈 [37976, 2], [37978, 4], [37980, 2], [38128, 1], [38230, 4], [38232, 2], [39652, 3], [39728, 1], [39762, 2], [39763, 2], [40942, 4], [[4, 9], [1114, 2], [2111, 2], [25254, 2], [27549, 4], [36161, 2], [36356, 1], [37497, 3], [37515, 3], [37628, 3]👈 [37632, 4], [37976, 2], [38131, 3], [39191, 1], [39283, 3], [39285, 3], [39334, 1],

Slide 42

Slide 42 text

inner_product implementation move the smaller cursor matched, move both cursor 42 [[4, 8], [2111, 2], [27549, 4], [37497, 2], [37501, 2], [37667, 4]👈 [37976, 2], [37978, 4], [37980, 2], [38128, 1], [38230, 4], [38232, 2], [39652, 3], [39728, 1], [39762, 2], [39763, 2], [40942, 4], [[4, 9], [1114, 2], [2111, 2], [25254, 2], [27549, 4], [36161, 2], [36356, 1], [37497, 3], [37515, 3], [37628, 3], [37632, 4]👈 [37976, 2], [38131, 3], [39191, 1], [39283, 3], [39285, 3], [39334, 1],

Slide 43

Slide 43 text

inner_product implementation move the smaller cursor matched, move both cursor 43 [[4, 8], [2111, 2], [27549, 4], [37497, 2], [37501, 2], [37667, 4]👈 [37976, 2], [37978, 4], [37980, 2], [38128, 1], [38230, 4], [38232, 2], [39652, 3], [39728, 1], [39762, 2], [39763, 2], [40942, 4], [[4, 9], [1114, 2], [2111, 2], [25254, 2], [27549, 4], [36161, 2], [36356, 1], [37497, 3], [37515, 3], [37628, 3], [37632, 4], [37976, 2]👈 [38131, 3], [39191, 1], [39283, 3], [39285, 3], [39334, 1],

Slide 44

Slide 44 text

inner_product implementation move the smaller cursor matched, move both cursor 44 [[4, 8], [2111, 2], [27549, 4], [37497, 2], [37501, 2], [37667, 4], [37976, 2]👈 [37978, 4], [37980, 2], [38128, 1], [38230, 4], [38232, 2], [39652, 3], [39728, 1], [39762, 2], [39763, 2], [40942, 4], [[4, 9], [1114, 2], [2111, 2], [25254, 2], [27549, 4], [36161, 2], [36356, 1], [37497, 3], [37515, 3], [37628, 3], [37632, 4], [37976, 2]👈 [38131, 3], [39191, 1], [39283, 3], [39285, 3], [39334, 1],

Slide 45

Slide 45 text

inner_product implementation move the smaller cursor matched, move both cursor 45 [[4, 8], [2111, 2], [27549, 4], [37497, 2], [37501, 2], [37667, 4], [37976, 2], [37978, 4]👈 [37980, 2], [38128, 1], [38230, 4], [38232, 2], [39652, 3], [39728, 1], [39762, 2], [39763, 2], [40942, 4], [[4, 9], [1114, 2], [2111, 2], [25254, 2], [27549, 4], [36161, 2], [36356, 1], [37497, 3], [37515, 3], [37628, 3], [37632, 4], [37976, 2], [38131, 3]👈 [39191, 1], [39283, 3], [39285, 3], [39334, 1],

Slide 46

Slide 46 text

inner_product implementation: move the smaller cursor; when matched, move both cursors. End.
[[4, 8], [2111, 2], [27549, 4], [37497, 2], [37501, 2], [37667, 4], [37976, 2], [37978, 4]👈 [37980, 2], [38128, 1], [38230, 4], [38232, 2], [39652, 3], [39728, 1], [39762, 2], [39763, 2], [40942, 4],
[[4, 9], [1114, 2], [2111, 2], [25254, 2], [27549, 4], [36161, 2], [36356, 1], [37497, 3], [37515, 3], [37628, 3], [37632, 4], [37976, 2], [38131, 3], [39191, 1], [39283, 3]👈 [39285, 3], [39334, 1],

Slide 47

Slide 47 text

inner_product implementation: intersection of two sets. I explained it in my book. Come to think of it, I have explained this before.
[Figure: stepping through two posting lists. Word "def": line numbers 3 7 8 13 16. Word "initialize": line numbers 2 3 12 13. Steps shown: Start; fwd(['initialize', fname, 3]); Forward Both; fwd(['def', fname, 12]); fwd(['initialize', fname, 13])]

Slide 48

Slide 48 text

inner_product implementation: TF-IDF. TF is an Integer. Only the square of idf is ever used, so I should have memoized that instead.
idf = @idf[a[ia][0]]
s += (a[ia][1] * b[ib][1] * idf * idf)
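The two-cursor walk from the preceding slides, combined with the TF-IDF weighting above, might look like this as a whole method. It is a sketch rather than the project's actual code: `idf` is passed in as a plain hash, and treating IDs missing from it as weight 1.0 is an assumption made so the example runs without real data.

```ruby
# Dot product of two sparse vectors (arrays of [id, tf], sorted by id).
# idf maps id => IDF weight; IDs missing from idf are assumed to weigh 1.0.
def dot(a, b, idf)
  ia = ib = 0
  s = 0.0
  while ia < a.size && ib < b.size
    case a[ia][0] <=> b[ib][0]
    when -1 then ia += 1   # a's cursor is smaller: advance a
    when 1  then ib += 1   # b's cursor is smaller: advance b
    else                   # same card in both decks
      w = idf.fetch(a[ia][0], 1.0)
      s += a[ia][1] * b[ib][1] * w * w
      ia += 1
      ib += 1
    end
  end
  s
end

a = [[4, 8], [2111, 2], [27549, 4], [37497, 2]]
b = [[4, 9], [1114, 2], [2111, 2], [25254, 2], [27549, 4], [37497, 3]]
puts dot(a, b, {})  # prints 98.0
```

Each cursor only moves forward, so the cost is linear in the combined length of the two vectors, not in the 9275 dimensions.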

Slide 49

Slide 49 text

cos implementation. @norm : deck-code → norm-of-vector. Unit vectors are not used. (In fields such as image processing, vectors are often stored as unit vectors.)
def cos(a, b)
  left = @deck[a]
  right = @deck[b]
  dot(left, right) / (@norm[a] * @norm[b])
end
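How the `@norm` table might be filled is sketched below. The helper name `vec_to_norm` appears later in the slides, but its body is not shown, so this definition (square root of the sum of squared TF-IDF components) is an assumption, as are the sample deck codes and weights.

```ruby
# Norm of a sparse TF-IDF vector: sqrt of the sum of (tf * idf)^2.
def vec_to_norm(v, idf)
  Math.sqrt(v.sum { |id, tf| (tf * idf.fetch(id, 1.0))**2 })
end

idf  = { 1 => 1.0, 2 => 2.0 }                   # hypothetical weights
deck = { "code-a" => [[1, 3], [2, 2]],          # hypothetical decks
         "code-b" => [[1, 4]] }

# Compute each norm once, keyed by deck code, and reuse it in every cos call.
norm = deck.transform_values { |v| vec_to_norm(v, idf) }
p norm  # prints {"code-a"=>5.0, "code-b"=>4.0} (newer Rubies add spaces around =>)
```

Storing the norm per deck code keeps the vectors themselves as integer counts while avoiding a square root on every comparison.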

Slide 50

Slide 50 text

Basic energy card problem: a deck may contain more than 4 copies of a basic energy card, and that affects similarity.
s += (a[ia][1].clamp(..5) * b[ib][1].clamp(..5) * idf * idf)
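A quick look at what `clamp(..5)` does to the counts, using a made-up three-card deck. Passing a beginless range to `clamp` needs Ruby 2.7 or later.

```ruby
# A hypothetical deck with 8 copies of a basic energy (ID 4) and two other cards.
deck = [[4, 8], [2111, 2], [27549, 4]]

# Cap each count at 5 before it enters the product, as in the slide's formula.
capped = deck.map { |id, tf| [id, tf.clamp(..5)] }
p capped  # prints [[4, 5], [2111, 2], [27549, 4]]
```

Without the cap, two decks that both run 8 copies of the same energy would contribute 64 to the dot product from that card alone; with it, the contribution is at most 25 before weighting.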

Slide 51

Slide 51 text

Deck similarity We can calculate deck similarity 51

Slide 52

Slide 52 text

Search: max(n) by deck similarity. _ko1 taught me about max(n), which returns the top n items! Ruby really has everything.
def search(v, n=5)
  norm = vec_to_norm(v)
  return [] if norm == 0
  @deck.map do |b, deck_b|
    cos = dot(v, deck_b) / (norm * @norm[b])
    [cos, b]
  end.max(n)
end
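For readers who have not used `max(n)`: it returns the n largest elements in descending order, and arrays compare element by element, so `[cos, code]` pairs are ranked by similarity first. The scores and deck codes below are invented.

```ruby
# Hypothetical [similarity, deck-code] pairs.
scores = [[0.31, "deck-a"], [0.97, "deck-b"], [0.12, "deck-c"], [0.64, "deck-d"]]

top = scores.max(2)
p top  # prints [[0.97, "deck-b"], [0.64, "deck-d"]]
```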

Slide 53

Slide 53 text

Search by deck, card-id
Search by deck code: search(@deck[code])
Search by card-id: search([[card_id, 1]])
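Both entry points can be seen working together in a toy engine. This is a simplified sketch, not the real implementation: the class name `TinyDeckSearch`, the wrapper methods, and the sample decks are hypothetical, IDF weighting is left out, and `dot` uses a hash lookup instead of the two-cursor walk to keep it short.

```ruby
class TinyDeckSearch
  # decks: deck-code => sparse vector of [card-id, copies] tuples
  def initialize(decks)
    @deck = decks
    @norm = decks.transform_values { |v| vec_to_norm(v) }
  end

  # Return the n most similar decks as [cos, deck-code] pairs.
  def search(v, n = 5)
    norm = vec_to_norm(v)
    return [] if norm == 0
    @deck.map { |code, deck_b|
      [dot(v, deck_b) / (norm * @norm[code]), code]
    }.max(n)
  end

  def search_by_code(code, n = 5)
    search(@deck.fetch(code), n)
  end

  # A single card is treated as a one-card deck vector.
  def search_by_card(card_id, n = 5)
    search([[card_id, 1]], n)
  end

  private

  # Unweighted dot product (the real engine multiplies by IDF squared).
  def dot(a, b)
    h = b.to_h
    a.sum { |id, tf| tf * h.fetch(id, 0) }.to_f
  end

  def vec_to_norm(v)
    Math.sqrt(dot(v, v))
  end
end

engine = TinyDeckSearch.new(
  "aaa" => [[1, 4], [2, 2]],
  "bbb" => [[1, 4], [3, 2]],
  "ccc" => [[4, 4]]
)
p engine.search_by_card(3, 1)
p engine.search_by_code("aaa", 2)
```

Searching by card 3 ranks "bbb" first because it is the only deck containing that card; searching by deck code "aaa" returns "aaa" itself first, followed by "bbb", which shares card 1.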

Slide 54

Slide 54 text

Agenda About me, ruby and Pokémon TCG Pokémon TCG deck similarity whole system 54

Slide 55

Slide 55 text

R.I.P. Heroku Free Dyno. Dyno VM 512MB/1core (Free). Shut down at least once every 24 hours. Heroku Postgres: Hobby Basic. Heroku Scheduler. Heroku is used heavily in the products of @awazeki (情報デザイン)! The number one Heroku user in Yaita City, Tochigi Prefecture.

Slide 56

Slide 56 text

R.I.P. Heroku Free Dyno Dyno VM 512MB/1core (Free) Shut down at least once every 24 hours Heroku Postgres: Hobby Basic Heroku Scheduler Don't measure! Feel. 56

Slide 57

Slide 57 text

R.I.P. Heroku Free Dyno Dyno VM 512MB/1core (Free) Shut down at least once every 24 hours Heroku Postgres: Hobby Basic Heroku Scheduler 57

Slide 58

Slide 58 text

R.I.P. Heroku Free Dyno. Dyno VM 512MB/1core (Free). Shut down at least once every 24 hours. Heroku Postgres: Hobby Basic. Heroku Scheduler. This is a situation where I would normally use dRuby.

Slide 59

Slide 59 text

System overview: Heroku App, Web Browser, deck similarity, Heroku Scheduler, Crawler, Search Engine

Slide 60

Slide 60 text

Data (diagram labels): card normalize map, data/uniq_*.txt, new deck, metadata, known deck, S3, Heroku Scheduler, Heroku PG, My MacBook, Heroku App, deck similarity, Search, Initialize, Crawler, Web UI, @deck, @idf, @norm, @name, @id_norm, Make Vector, Update Deck, Build page

Slide 69

Slide 69 text

if ... If I were not using Heroku... (diagram labels): card normalize map, data/uniq_*.txt, new deck, metadata, known deck, S3, Heroku Scheduler, Heroku PG, My MacBook, Heroku App, deck similarity, Search, Initialize, Crawler, Web UI, @deck, @idf, @norm, @name, @id_norm, Make Vector, Update Deck, Build page

Slide 70

Slide 70 text

without Heroku: if I were not using Heroku, everything except the backup would be in memory. (diagram labels): card normalize map, data/uniq_*.txt, known deck, S3, cron, My MacBook, App, deck similarity, Search, Crawler, Web UI, @deck, @idf, @norm, @name, @id_norm, @meta, Make Vector, Build page, Update Deck, dRuby

Slide 71

Slide 71 text

Create my own search engine. I gave a talk about building a search engine for myself!
Timeline (Ruby, Pokémon TCG):
1996 ruby-1.0 Pokémon Red/Blue, Pokémon TCG
1999 ruby-1.4.0 ERB, dRuby
2006 RubyKaigi 2006 @m_seki started Pokémon TCG
2010 WCS Tochigi pref. winner
2022 RubyKaigi 2022
2023 WCS Yokohama

Slide 72

Slide 72 text

Create your own search engine. End.
Timeline (Ruby, Pokémon TCG):
1996 ruby-1.0 Pokémon Red/Blue, Pokémon TCG
1999 ruby-1.4.0 ERB, dRuby
2006 RubyKaigi 2006 @m_seki started Pokémon TCG
2010 WCS Tochigi pref. winner
2022 RubyKaigi 2022
2023 Create your own search engine WCS Yokohama