プログラミングLT_2019_はじめての推薦システム.pdf / introduction-to-recommender-system

Yuya Matsumura @yu-ya4 ͸͡ΊͯͷਪનγεςϜ ϓϩάϥϛϯάLT 2019 30 Apr. 2019

✓ Yuya Matsumura ✓ Software Engineer ✓ Wantedly, Inc. Recommendation
Team ✓ Interested in Information Retrieval, Machine Learning Self-Introduction @yu-ya4 @yu__ya4

ਪનγεςϜ͓ͬͯ΋͠Ζͦ͏ʂʘ(^o^)ʗ ͬͯͪΐͬͱͰ΋ڵຯΛ΋͍͚ͬͯͨͩΔͱخ͍͠Ͱ͢! ࠓ೔ͷΰʔϧ

ਪનγεςϜͱ͸ ਪનख๏ͷ෼ྨ γϯϓϧͳڠௐϑΟϧλϦϯάͷ࣮૷ ·ͱΊ Agenda

It is often necessary to make choices without sufficient personal
experience of the alternatives. In everyday life, we rely on recommendations from other people either by word of mouth, recommendation letters, movie and book reviews printed in newspapers, or general surveys such as Zagat’s restaurant guides. Recommender systems assist and augment this natural social process. ਪનγεςϜͱ͸ [Resnick+ 97] ࣗ෼ͷܦݧ͔ΒͷΈͰ͸͋·ΓΑ͘Θ͔Βͳ͍΋ͷͷத͔ΒɼͲ͏ ͯ͠΋ͲΕ͔Λબ͹ͳ͚Ε͹ͳΒͳ͍ͱ͍͏͜ͱ͸Α͋͘Δɻ͜ͷ Α͏ͳࡍΘͨͨͪ͠͸ɼޱίϛɼਪનঢ়ɼ ৽ฉͷॻධ΍өըධɼϨ ετϥϯΨΠυͳͲͷଞਓ͔ΒͷਪનʹཔΔ͜ͱΛ ೔ৗతʹߦͬͯ ͍Δɻ ਪનγεςϜͱ͸ɼ͜͏ͨࣾ͠ձͰී௨ʹߦΘΕ͍ͯΔҰ࿈ ͷߦҝΛิॿͨ͠Γɼଅਐͨ͠Γ͢Δ΋ͷͰ͋Δɻ [Resnick 97] P. Resnick and H. R. Varian. Recommender systems. Communications of the ACM, Vol. 40, No. 3, pp. 56–58, 1997.

Recommenders: Tools to help identify worthwhile stuff ਪનγεςϜͱ͸ [Konstan+ 03]
ਪનγεςϜ: ͲΕ͕Ձ஋ͷ͋Δ΋ͷ͔ಛఆ͢ΔͷΛॿ͚Δಓ۩ [Konstan 03] J. A. Konstan and J. Riedl. Recommender systems: Collaborating in commerce and communities. In Proc. of the SIGCHI Conf. on Human Factors in Computing Systems, Tutorial, 2003.

ಛఆͷϢʔβͷᅂ޷ʹ ߹ͬͨ঎඼ͷਪન ਪનγεςϜͷྫ (amazon.co.jp)

ྨࣅɾؔ࿈঎඼ ͷਪન (amazon.co.jp) ଞͷϢʔβʹΑΔධՁ ਪનγεςϜͷྫ

ਓؾ঎඼ͷਪન (amazon.co.jp) ৽ண঎඼ͷਪન ਪનγεςϜͷྫ

Recommenders: Tools to help identify worthwhile stuff ࠶ܝɿਪનγεςϜͱ͸ [Konstan+ 03]
ਪનγεςϜ: ͲΕ͕Ձ஋ͷ͋Δ΋ͷ͔ಛఆ͢ΔͷΛॿ͚Δಓ۩ [Konstan 03] J. A. Konstan and J. Riedl. Recommender systems: Collaborating in commerce and communities. In Proc. of the SIGCHI Conf. on Human Factors in Computing Systems, Tutorial, 2003.

ਪનγεςϜͷٕज़ཁૉ 㾎 ώϡʔϚϯΠϯλϑΣʔε w ΞΠςϜͷ಺༰΍ਪનʹؔ࿈ͨ͠৘ใΛఏࣔ w ਪનʹඞཁͳ৘ใΛར༻ऀ͔Βత֬ʹऩू 㾎 ػցֶशɾ౷ܭత༧ଌɾ৘ใݕࡧʢਪનΞϧΰϦζϜʣ w
ऩूͨ͠σʔλʹجͮ͘ਪન৘ใͷੜ੒ w ໨తʹԠͨ͡ਪન৘ใͷม׵ 㾎 σʔλϕʔεɾฒྻܭࢉɾωοτϫʔΫ w ਪનʹඞཁͳ্෍Λ஝ੵɾॲཧɻྲྀ௨ͤ͞Δج൫ http://www.kamishima.net/archive/recsysdoc.pdf

ਪનγεςϜͷٕज़ཁૉ 㾎 ώϡʔϚϯΠϯλϑΣʔε w ΞΠςϜͷ಺༰΍ਪનʹؔ࿈ͨ͠৘ใΛఏࣔ w ਪનʹඞཁͳ৘ใΛར༻ऀ͔Βత֬ʹऩू 㾎 ػցֶशɾ౷ܭత༧ଌɾ৘ใݕࡧʢਪનΞϧΰϦζϜʣ w
ऩूͨ͠σʔλʹجͮ͘ਪન৘ใͷੜ੒ w ໨తʹԠͨ͡ਪન৘ใͷม׵ 㾎 σʔλϕʔεɾฒྻܭࢉɾωοτϫʔΫ w ਪનʹඞཁͳ্෍Λ஝ੵɾॲཧɻྲྀ௨ͤ͞Δج൫ http://www.kamishima.net/archive/recsysdoc.pdf ͜͜ʹ͍ͭͯ࿩͠·͢"

ਪનख๏ͷେ·͔ͳ෼ྨ ڠௐϑΟϧλϦϯά ಺༰(ίϯςϯπ)ϕʔε ϑΟϧλϦϯά ϙϐϡϥϦςΟ ϋΠϒϦου MIX, ͍͍ͱ͜औΓ ଞͷϢʔβͷߦಈ(ߪങɼӾཡ etc.)ཤྺΛར༻
Ϣʔβ΍ίϯςϯπ(঎඼)ͷ ৘ใΛར༻ ϧʔϧϕʔε ਓؾϥϯΩϯά

ਪનख๏ͷେ·͔ͳ෼ྨ ڠௐϑΟϧλϦϯά ϙϐϡϥϦςΟ ϋΠϒϦου MIX, ͍͍ͱ͜औΓ ଞͷϢʔβͷߦಈ(ߪങɼӾཡ etc.)ཤྺΛར༻ ਓؾϥϯΩϯά ಺༰(ίϯςϯπ)ϕʔε
ϑΟϧλϦϯά Ϣʔβ΍ίϯςϯπ(঎඼)ͷ ৘ใΛར༻

಺༰ϕʔεϑΟϧλϦϯά ϢʔβͷϓϩϑΝΠϧͱΞΠςϜͷಛ௃ͱͷϚονϯάʹΑΓɼ Ϣʔβͷᅂ޷ʹ߹க͢ΔΞΠςϜΛਪન ϢʔβϓϩϑΝΠϧ ✖ ΞΠςϜͷಛ௃ ࿀Ѫ ϛεςϦʔ 90೥୅ͷ࡞඼ ࠷৽ͷ࡞࿀ѪՈ
୹ฤू ௕ฤ δϟϯϧ: SF ࡞ऀ: ాத ଠ࿠ ೉қ౓: ೉͍͠ Ϣʔβ͕޷Ή(աڈʹ޷Μͩ)ΞΠςϜͷಛ௃ δϟϯϧɼ࡞ऀɼՁ֨ͳͲͷΞΠςϜͷϝλσʔλ΍ɼ ղੳʹΑΓநग़ͨ͠ಛ௃ δϟϯϧ: ϛεςϦʔ ࡞ऀ: ླ໦ Ұ࿠ ೉қ౓: қ͍͠

ϢʔβϓϩϑΝΠϧͷ֫ಘํ๏ ໌ࣔత֫ಘʢExplicit Feedbackʣ Ϣʔβ͕5ஈ֊ͷई౓ͳͲͰΞΠςϜΛ௚઀ධՁ ҉໧త֫ಘʢImplicit Feedbackʣ Ϣʔβͷߦಈ͔ΒΞΠςϜ΁ͷධՁ஋Λ༧ଌ ߪೖ ঎඼ϖʔδӾཡ ͓ؾʹೖΓొ࿥

ΞΠςϜͷಛ௃ͷ֫ಘํ๏ ର৅ͱͳΔίϯςϯπʹΑ༷ͬͯʑͳ֫ಘํ๏ ੡඼ λΠτϧɼδϟϯϧɼ จࣈ਺ɼ࡞ऀɼग़൛೔ɼ…. ଐੑऔಘ Իָ Ի੠ղੳ ը૾ ը૾ղੳ

ਪનख๏ͷେ·͔ͳ෼ྨ ಺༰(ίϯςϯπ)ϕʔε ϑΟϧλϦϯά ϙϐϡϥϦςΟ ϋΠϒϦου MIX, ͍͍ͱ͜औΓ Ϣʔβ΍ίϯςϯπ(঎඼)ͷ ৘ใΛར༻ ਓؾϥϯΩϯά
ڠௐϑΟϧλϦϯά ଞͷϢʔβͷߦಈ(ߪങɼӾཡ etc.)ཤྺΛར༻

ڠௐϑΟϧλϦϯά ଞͷϢʔβͷߦಈཤྺΛར༻ͯ͠ ᅂ޷͕ࣅ͍ͯΔϢʔβ͕બΜͩ΋ͷΛਪન ʹᅂ޷͕ࣅ͍ͯΔϢʔβ ޷͖ʂʂ ݏ͍ʂʂ ਪન ✖ ⭕

ᅂ޷͕ࣅ͍ͯΔϢʔβ ਪન͍ͨ͠λʔήοτϢʔβͱࣅ͍ͯΔߦಈ (ߪೖɼϖʔδӾཡɼ͓ؾʹೖΓ etc.)ΛऔͬͨଞͷϢʔβ ✖ ⭕ ׆ಈར༻ऀ (ਪન͍ͨ͠λʔήοτ) ඪຊར༻ऀ(ଞͷϢʔβ) ⭕
⭕ ✖ ⭕ ⭕ ✖ ˚ ✖ ✖ ⭕ ࣅ͍ͯΔ ࣅ͍ͯͳ͍

ࠓճ͸͜ͷڠௐϑΟϧλϦϯάͷ؆୯ͳ࣮૷Λ঺հ͠·͢!

ϢʔβͲ͏͠ͷྨࣅ౓Λܭࢉͯ͠ɼᅂ޷ͷࣅ͍ͯΔϢʔβΛݟͭ ͚Δɻ ᅂ޷ͷࣅ͍ͯΔϢʔβͷߦಈཤྺ͔Βɼະ஌ͷΞΠςϜʹର͢Δ ༧ଌධՁ஋Λܭࢉ͢Δɻ ༧ଌධՁ஋ͷߴ͍΋ͷΛϢʔβʹਪન͢Δɻ ڠௐϑΟϧλϦϯάʢϢʔβϕʔεʣͷେ·͔ͳྲྀΕ https://github.com/yu-ya4/recommend
ʹҰԠ࣮૷Λެ։͠·ͨ͠

ڠௐϑΟϧλϦϯάʢϢʔβϕʔεʣͷେ·͔ͳྲྀΕ https://github.com/yu-ya4/recommend ʹҰԠ࣮૷Λެ։͠·ͨ͠ ϢʔβͲ͏͠ͷྨࣅ౓Λܭࢉͯ͠ɼᅂ޷ͷࣅ͍ͯΔϢʔβΛݟͭ ͚Δɻ ᅂ޷ͷࣅ͍ͯΔϢʔβͷߦಈཤྺ͔Βɼະ஌ͷΞΠςϜʹର͢Δ ༧ଌධՁ஋Λܭࢉ͢Δɻ
༧ଌධՁ஋ͷߴ͍΋ͷΛϢʔβʹਪન͢Δɻ

⭕ ⭕ ⭕ ⭕ ⭕ ⭕ ⭕ ʁ ⭕ Ξ
Π ς Ϝ Ϣʔβ 1. ϢʔβͲ͏͠ͷྨࣅ౓Λܭࢉͯ͠ɼᅂ޷ͷࣅ͍ͯΔϢʔβΛݟ͚ͭΔɻ ✖ ✖ ✖ ✖ ✖ ⭕ ⭕ ⭕ ʁ ✖ ⭕ ✖ A B C D E ứ ϋ Ϧ ϙ λ Ừ ứ ޿ ࣙ ԓ Ừ ứ ౷ ܭ ֶ ೖ ໳ Ừ ứ ࢦ ྠ ෺ ޠ Ừ ứ φ ϧ χ Ξ Ừ - - - ✖ # ਪન͞ΕͨΞΠςϜʹ͍ͭͯͷ৘ใͷΈଘࡏ͢Δɻ # ߪೖͨ͠Β 1, ߪೖ͠ͳ͔ͬͨΒ 0 data = { "A": { "φϧχΞ": 1, "ϋϦϙλ": 1, "޿ࣙԓ": 0, "ࢦྠ෺ޠ": 1, "౷ܭֶೖ໳": 0 }, "B": { "ϋϦϙλ": 0, "޿ࣙԓ": 1, "ࢦྠ෺ޠ": 0, "౷ܭֶೖ໳": 1 }, "C": { "φϧχΞ": 1, "޿ࣙԓ": 0, "ࢦྠ෺ޠ": 1 }, "D": { "φϧχΞ": 1, "ϋϦϙλ": 0, "޿ࣙԓ": 1, "౷ܭֶೖ໳": 1 }, "E": { "φϧχΞ": 0, "޿ࣙԓ": 1, "ࢦྠ෺ޠ": 1, "౷ܭֶೖ໳": 0 }, } ⭕ɿߪೖ ✖ɿਪન͞Εͨʢݟͨʣ͕ະߪೖ - ɿະਪનɻݟͨ͜ͱͳ͍ɻ

# δϟοΧʔυ܎਺Λܭࢉ def jaccard_similarity(x, y): intersection = len(x.intersection(y)) union =
len(x.union(y)) return float(intersection / union) ࠓճ͸ྨࣅ౓ई౓ͷܭࢉʹδϟοΧʔυ܎਺ʢJaccard indexʣΛར༻ →ʮ̎ͭͷू߹ʹؚ·Ε͍ͯΔཁૉͷ͏ͪڞ௨ཁૉ͕઎ΊΔׂ߹ʯΛද͢ɻ Ϣʔβ͕ߪೖͨ͠ΞΠςϜͷू߹Ͳ͏͠ͷ δϟοΧʔυ܎਺Λྨࣅ౓ʹ → ಉ͡ΞΠςϜΛߪೖ͍ͯ͠ΔϢʔβ΄Ͳྨࣅ a = set(key for key, val in data["A"].items() if val == 1) c = set(key for key, val in data["C"].items() if val == 1) >>> a {'φϧχΞ', 'ࢦྠ෺ޠ', 'ϋϦϙλ'} >>> c {'φϧχΞ', 'ࢦྠ෺ޠ'} >>> jaccard_similarity(a, c) 0.6666666666666666 1. ϢʔβͲ͏͠ͷྨࣅ౓Λܭࢉͯ͠ɼᅂ޷ͷࣅ͍ͯΔϢʔβΛݟ͚ͭΔɻ

# աڈͷߦಈཤྺʹج͍ͮͨ 2 Ϣʔβͷྨࣅ౓Λܭࢉ def get_similarity(user1, user2): # Ϣʔβͷߦಈཤྺ history1
= data[user1] history2 = data[user2] # ྆ํͱ΋ͷϢʔβʹਪન͞ΕͨΞΠςϜͷू߹ΛͱΔɻ recommended_items1 = set(history1.keys()) recommended_items2 = set(history2.keys()) recommended_both = recommended_items1.intersection(recommended_items2) #྆ํͱ΋ʹਪન͞ΕͨΞΠςϜ͕ͳ͚Ε͹ྨࣅ౓ 0 ͱ͢Δɻ if len(recommended_both) == 0: return 0.0 # ྆ํͱ΋ͷϢʔβʹਪન͞Ε͍͔ͯͯͭɼߪೖͨ͠ΞΠςϜͷू߹ΛͱΔɻ possitive_items1 = set([key for key, val in history1.items() if key in recommended_both and val == 1]) possitive_items2 = set([key for key, val in history2.items() if key in recommended_both and val == 1]) return jaccard_similarity(possitive_items1, possitive_items2) ϢʔβͲ͏͠ͷྨࣅ౓ͷܭࢉ 1. ϢʔβͲ͏͠ͷྨࣅ౓Λܭࢉͯ͠ɼᅂ޷ͷࣅ͍ͯΔϢʔβΛݟ͚ͭΔɻ

⭕ ⭕ ⭕ ⭕ ⭕ ⭕ ⭕ ʁ ⭕ Ξ
Π ς Ϝ Ϣʔβ ✖ ✖ ✖ ✖ ✖ ⭕ ⭕ ⭕ ʁ ✖ ⭕ ✖ A B C D ứ ϋ Ϧ ϙ λ Ừ ứ ޿ ࣙ ԓ Ừ ứ ౷ ܭ ֶ ೖ ໳ Ừ ứ ࢦ ྠ ෺ ޠ Ừ ứ φ ϧ χ Ξ Ừ - - - ✖ sim = get_similarity("C", "E") >>> sim 0.3333333333333333 E 1. ϢʔβͲ͏͠ͷྨࣅ౓Λܭࢉͯ͠ɼᅂ޷ͷࣅ͍ͯΔϢʔβΛݟ͚ͭΔɻ ྆ํͱ΋ͷϢʔβʹਪન͞Ε͍͔ͯͯͭɼ ߪೖ͞ΕͨΞΠςϜͷू߹Ͳ͏͠ͷδϟοΧʔυ܎਺Λྨࣅ౓ͱ͢Δ

def get_recommends(user): # ର৅Ϣʔβ͕·ͩਪન͞Ε͍ͯͳ͍ΞΠςϜͷू߹ɿਪનର৅ΞΠςϜ new_items = all_items - set(data[user].keys()) #
ਪનର৅ΞΠςϜͷ༧ଌධՁ஋ΛೖΕΔശ sum_scores = {item: 0.0 for item in new_items} sum_sim = {item: 0 for item in new_items} # # ର৅ϢʔβҎ֎ͷϢʔβ others = list(data.keys()); others.remove(user) for other in others: sim = get_similarity(user, other) for new_item in new_items: if new_item in data[other]: # "ධՁ஋" × "ྨࣅ౓" Λਪન౓ͷείΞͱͯ͠ɼશϢʔβʹ͍ͭͯ߹ܭ͢Δ sum_scores[new_item] += data[other][new_item] * sim # ΞΠςϜ͝ͱͷϢʔβͷྨࣅ౓ͷ߹ܭΛܭࢉ͓͖ͯ͠ɼ্هͷείΞΛׂΔʢՃॏฏۉͷΠϝʔδʣ sum_sim[new_item] += sim recommends = {item: sum_scores[item] / sum_sim[item] for item in new_items} recommends = sorted(recommends.items(), key=lambda x:x[1], reverse=True) return recommends 2. ᅂ޷ͷࣅ͍ͯΔϢʔβͷߦಈཤྺ͔Βɼະ஌ͷΞΠςϜʹର͢Δ༧ଌධՁ஋Λܭࢉ͢Δɻ ྨࣅ౓ͷߴ͍ϢʔβͷධՁ΄ͲॏΈ͕ߴ͘ͳΔΑ͏ʹ༧ଌධՁ஋Λܭࢉ

⭕ ⭕ ⭕ ⭕ ⭕ ⭕ ⭕ ʁ ⭕ Ξ
Π ς Ϝ Ϣʔβ ✖ ✖ ✖ ✖ ✖ ⭕ ⭕ ⭕ ʁ ✖ ⭕ ✖ A B C D E ứ ϋ Ϧ ϙ λ Ừ ứ ޿ ࣙ ԓ Ừ ứ ౷ ܭ ֶ ೖ ໳ Ừ ứ ࢦ ྠ ෺ ޠ Ừ ứ φ ϧ χ Ξ Ừ - - - ✖ sim_A = get_similarity("C", "A") sim_B = get_similarity("C", "B") sim_D = get_similarity("C", "D") sim_E = get_similarity("C", "E") >>> sim_A, sim_B, sim_D, sim_E 1.0 0.0 0.5 0.3333333333333333 sco re C = sco re A * sim A + sco re B * sim B + sco re D * sim D + sco re E * sim E simA + simB + simD + simE = 0 * 1.0 + 1 * 0.0 + 1 * 0.5 + 0 * 0.333 1.0 + 0.0 + 0.5 + 0.333 = 0.2727... recommends = get_recommends("C") >>> recommends [ ('ϋϦϙλ', 0.6666666666666666), ('౷ܭֶೖ໳', 0.27272727272727276) ] 2. ᅂ޷ͷࣅ͍ͯΔϢʔβͷߦಈཤྺ͔Βɼະ஌ͷΞΠςϜʹର͢Δ༧ଌධՁ஋Λܭࢉ͢Δɻ

⭕ ⭕ ⭕ ⭕ ⭕ ⭕ ⭕ ʁ ⭕ Ξ
Π ς Ϝ Ϣʔβ ✖ ✖ ✖ ✖ ✖ ⭕ ⭕ ⭕ ʁ ✖ ⭕ ✖ A B C D E ứ ϋ Ϧ ϙ λ Ừ ứ ޿ ࣙ ԓ Ừ ứ ౷ ܭ ֶ ೖ ໳ Ừ ứ ࢦ ྠ ෺ ޠ Ừ ứ φ ϧ χ Ξ Ừ - - - ✖ recommends = get_recommends("C") >>> recommends [ ('ϋϦϙλ', 0.6666666666666666), ('౷ܭֶೖ໳', 0.27272727272727276) ] 3. ༧ଌධՁ஋ͷߴ͍΋ͷΛϢʔβʹਪન͢Δɻ ਪન!

ਪનγεςϜͱ͸ ਪનख๏ͷ෼ྨ ڠௐϑΟϧλϦϯάͷ࢓૊Έ ·ͱΊ Agenda

·ͱΊ 㾎 ਪનγεςϜͱ͸ʁ w ʮͲΕ͕Ձ஋ͷ͋Δ΋ͷ͔ಛఆ͢ΔͷΛॿ͚Δಓ۩ʯ w ώϡʔϚϯΠϯλϑΣʔεػցֶशɾ౷ܭత༧ଌɾ৘ใݕࡧσʔλϕʔεɾ ฒྻܭࢉɾωοτϫʔΫͳͲͷٕज़ཁૉ͔Β੒Γཱͭ 㾎 ਪનख๏ͷ෼ྨ
w ϙϐϡϥϦςΟ಺༰ϕʔεڠௐϑΟϧλϦϯάϋΠϒϦου 㾎 ڠௐϑΟϧλϦϯάʢϢʔβϕʔεʣͷ࢓૊Έ w ࣅ͍ͯΔϢʔβΛݟ͚ͭΔʂ w ࣅ͍ͯΔϢʔβͷߦಈཤྺʹԠͯ͡ධՁ஋Λ༧ଌͯ͠ਪનʂ

Ref. [Resnick 97] P. Resnick and H. R. Varian. Recommender
systems. Communications of the ACM, Vol. 40, No. 3, pp. 56–58, 1997. [Konstan 03] J. A. Konstan and J. Riedl. Recommender systems: Collaborating in commerce and communities. In Proc. of the SIGCHI Conf. on Human Factors in Computing Systems, Tutorial, 2003. http://www.kamishima.net/archive/recsysdoc.pdf https://www.slideshare.net/xamat/recsys-2014-tutorial-the-recommender-problem-revisited http://blog.brainpad.co.jp/entry/2017/02/03/153000 https://www.slideshare.net/KentaOku/ss-50762836 https://qiita.com/hik0107/items/96c483afd6fb2f077985

プログラミングLT_2019_はじめての推薦システム.pdf / introducti...

プログラミングLT_2019_はじめての推薦システム.pdf / introduction-to-recommender-system

More Decks by Yuya Matsumura

Other Decks in Technology

Featured

Transcript