Slide 1

Slide 1 text

An Overview of CFML and Research Trends. Kazuki Taniguchi, CFML Study Group #1

Slide 2

Slide 2 text

Self-introduction: Kazuki Taniguchi (@kazk1018)
• Work history
  • 2014.4-2019.3: CyberAgent, Inc., AdTech Division, AI Lab
  • 2019.4-: a startup in Tokyo; freelance (AI/ML research and development)
• Research areas
  • Pattern Recognition / Image Super Resolution
  • Recommendation / Response Prediction
  • Counterfactual ML

Slide 3

Slide 3 text

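Summary
• Overview of Counterfactual Machine Learning
  • It is machine learning on data in which counterfactuals arise.
  • This talk explains that Interactive Learning and Causal Inference are sub-problems of covariate shift.
• Research trends
  • Datasets have been released through an advertising competition.
  • Workshops at major conferences are introduced.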

Slide 4

Slide 4 text

Introduction

Slide 5

Slide 5 text

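Problem setting
Consider the problem of recommending a single item to a user from several candidates and receiving feedback from that user.
[Figure: candidates Item A, Item B, Item C; Item A is selected and the user responds "Like!!"]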

Slide 6

Slide 6 text

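Problem setting
We keep a log of whether each user clicked (gave feedback on) the item recommended to them.

Past log:
Time | Item | User | Click
     | A    | X    | No
     | B    | Y    | No
     | C    | X    | Yes
     | B    | Z    | No
     | C    | Z    | No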

Slide 7

Slide 7 text

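Problem setting
For a policy that chooses an item from the user's information, we consider two tasks:
• Evaluation: given the existing policy $\pi_0$, evaluate a new policy $\pi$.
• Learning: learn a new policy $\pi$ from the existing logs.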

Slide 8

Slide 8 text

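Online Evaluation
Randomized Controlled Experiment (A/B Testing): traffic is split 50% / 50% between $\pi$ and $\pi_0$.
• Allows an accurate comparison (Gold Standard).
• Takes time before results are available.
• Carries the risk of bugs in the new policy and of degraded UX.
• Has a high development cost for deploying to production.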
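To make the 50% / 50% split concrete, here is a minimal sketch of a randomized controlled experiment in Python. The function name `run_ab_test`, the toy policies, and the click rule are assumptions made for illustration and do not come from the talk.

```python
import random


def run_ab_test(new_policy, old_policy, users, reward_fn, seed=0):
    """Randomly split traffic 50/50 and return the mean reward per arm."""
    rng = random.Random(seed)
    stats = {"new": [0.0, 0], "old": [0.0, 0]}  # [total reward, impressions]
    for user in users:
        # Each user is assigned to one arm by a fair coin flip.
        arm = "new" if rng.random() < 0.5 else "old"
        policy = new_policy if arm == "new" else old_policy
        item = policy(user)
        stats[arm][0] += reward_fn(user, item)
        stats[arm][1] += 1
    return {arm: (total / n if n else float("nan"))
            for arm, (total, n) in stats.items()}


# Toy setup (hypothetical): users only ever click item "C".
result = run_ab_test(
    new_policy=lambda user: "C",
    old_policy=lambda user: "A",
    users=["X", "Y", "Z"] * 100,
    reward_fn=lambda user, item: 1.0 if item == "C" else 0.0,
)
print(result)
```

Because assignment is random, the two arms see comparable users, which is why the comparison is trusted. The cost is that every user in the "new" arm is exposed to the untested policy.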

Slide 9

Slide 9 text

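Offline Evaluation
• Evaluate a new policy from the existing log data.
• When the new policy selects an item that was not actually shown, it cannot be evaluated.
Example: the log contains (Item A, User X, Click: No). What if $\pi(x)$ had recommended B? Whether B would have been clicked is unknown (Counterfactual).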
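The difficulty can be illustrated with a naive "replay" evaluation over the past log from the problem setting. This is only a sketch: `LogRecord`, `replay_evaluate`, and the candidate policies are hypothetical names, timestamps are omitted, and the naive estimate is in general biased because it ignores how the logging policy chose items.

```python
from typing import NamedTuple


class LogRecord(NamedTuple):
    user: str    # context
    item: str    # item shown by the logging policy
    click: bool  # observed feedback


# Past log from the problem setting (timestamps omitted).
LOGS = [
    LogRecord("X", "A", False),
    LogRecord("Y", "B", False),
    LogRecord("X", "C", True),
    LogRecord("Z", "B", False),
    LogRecord("Z", "C", False),
]


def replay_evaluate(policy, logs):
    """Score a policy only on records where it agrees with the logged item.

    Returns (click rate on matched records or None, number of matched records).
    Records where the policy picks a different item are counterfactual:
    the log says nothing about what the user would have done.
    """
    matched = [r for r in logs if policy(r.user) == r.item]
    if not matched:
        return None, 0
    return sum(r.click for r in matched) / len(matched), len(matched)


print(replay_evaluate(lambda user: "C", LOGS))  # uses 2 of the 5 records
print(replay_evaluate(lambda user: "D", LOGS))  # item never shown: no evidence
```

A policy that always recommends an item absent from the log matches no records at all, which is the counterfactual problem in its starkest form.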

Slide 10

Slide 10 text

Learning
• Learning means finding the policy that maximizes the evaluation value.
• As described above, the evaluation itself cannot be carried out correctly.

Slide 11

Slide 11 text

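Counterfactual Machine Learning
Machine learning on data in which counterfactuals arise.
[Diagram labels: Causal Inference, Interactive Learning, Counterfactual Machine Learning]
Note: this talk looks at the topic from the viewpoint of covariate shift (From A Machine Learning Perspective).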

Slide 12

Slide 12 text

Related Works

Slide 13

Slide 13 text

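Covariate shift
• Assume that $D = \{(x_i, y_i)\}_{i=1}^{n}$ is drawn i.i.d. from $p(x, y)$ and that $D' = \{x'_i\}_{i=1}^{m}$ is drawn i.i.d. from $\int p'(x, y)\,dy$.
• The setting is called covariate shift when the following conditions hold:
$$p(x) \neq p'(x), \qquad p(y|x) = p'(y|x)$$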

Slide 14

Slide 14 text

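Covariate shift
• Examples: speech recognition, image recognition, etc.
  • Speech: $x$: audio data, $y$: speaker, $p(x)$: children, $p'(x)$: adults
  • Images: $x$: image data, $y$: person, $p(x)$: indoors, $p'(x)$: outdoors
$$p(x, y) = p(y|x)\,p(x), \qquad p'(x, y) = p(y|x)\,p'(x)$$
[Diagram nodes: $x$, $y$, $y'$, $x'$]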

Slide 15

Slide 15 text

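Prediction models under covariate shift
• Using $\{(x_i, y_i)\}_{i=1}^{n}$ and $\{x'_i\}_{i=1}^{m}$, we want to learn a model $f_\theta(x)$ that predicts the output $y$ for a new input $x'$.
• With loss function $\mathrm{loss}(y, f_\theta)$, covariate shift is known to be handled by importance weighting:
$$\min_\theta \sum_{i=1}^{n} w(x_i)\,\mathrm{loss}(y_i, f_\theta(x_i)), \qquad w(x_i) = \frac{p'(x_i)}{p(x_i)}$$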
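As a minimal sketch of importance-weighted training, the code below fits a one-parameter linear model $f_\theta(x) = \theta x$ under squared loss. The Gaussian training and test densities, the function names, and the closed-form solver are all assumptions for illustration. In practice the densities are unknown and the weight has to be estimated.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weight(x, train_mean=0.0, test_mean=1.0, std=1.0):
    """w(x) = p'(x) / p(x), assuming both densities are known Gaussians."""
    return gaussian_pdf(x, test_mean, std) / gaussian_pdf(x, train_mean, std)


def fit_weighted_slope(xs, ys, weights):
    """Minimise sum_i w_i * (y_i - theta * x_i)^2 in closed form."""
    num = sum(w * x * y for x, y, w in zip(xs, ys, weights))
    den = sum(w * x * x for x, w in zip(xs, weights))
    return num / den


# Hypothetical training sample drawn where p(x) has its mass.
xs = [-1.0, -0.5, 0.5, 1.0, 2.0]
ys = [-0.8, -0.6, 0.7, 1.5, 4.1]
weights = [importance_weight(x) for x in xs]

theta_plain = fit_weighted_slope(xs, ys, [1.0] * len(xs))
theta_weighted = fit_weighted_slope(xs, ys, weights)
print(theta_plain, theta_weighted)
```

With these two Gaussians the weight simplifies to $\exp(x - 0.5)$, so samples lying where the test distribution has more mass count more in the fit.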

Slide 16

Slide 16 text

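Prediction models under covariate shift
• Proof
$$
\begin{aligned}
E_{p'}[\mathrm{loss}(y, f_\theta(x))]
&= \iint \mathrm{loss}(y, f_\theta(x))\,p'(x, y)\,dx\,dy \\
&= \iint \mathrm{loss}(y, f_\theta(x))\,p'(x)\,p'(y|x)\,dx\,dy \\
&= \iint \mathrm{loss}(y, f_\theta(x))\,p'(x)\,p'(y|x)\,\frac{p(x)}{p(x)}\,dx\,dy \\
&= \iint \mathrm{loss}(y, f_\theta(x))\,p(x)\,p(y|x)\,\frac{p'(x)}{p(x)}\,dx\,dy \\
&= \iint \mathrm{loss}(y, f_\theta(x))\,p(x, y)\,w(x)\,dx\,dy \\
&\approx \frac{1}{n}\sum_{i=1}^{n} \mathrm{loss}(y_i, f_\theta(x_i))\,w(x_i)
\end{aligned}
$$
The fourth line uses the covariate shift condition $p'(y|x) = p(y|x)$. The constant factor $1/n$ does not change the minimizer, so minimizing the last line is the same as minimizing the weighted sum $\sum_{i=1}^{n} w(x_i)\,\mathrm{loss}(y_i, f_\theta(x_i))$.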
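The identity can be checked numerically on a small discrete example where every expectation is an exact sum. All numbers below are hypothetical, and the predictor is deliberately crude. The sketch only shows that reweighting the training distribution by $w(x) = p'(x)/p(x)$ recovers the expected loss under the test distribution.

```python
# Hypothetical discrete distributions over x in {0, 1}.
P_TRAIN = {0: 0.8, 1: 0.2}        # p(x)
P_TEST = {0: 0.3, 1: 0.7}         # p'(x)
P_Y1_GIVEN_X = {0: 0.1, 1: 0.6}   # p(y=1|x), shared by both (covariate shift)


def predict(x):
    """A deliberately crude model: always predict 0."""
    return 0.0


def loss(y, prediction):
    """Squared loss."""
    return (y - prediction) ** 2


def expected_loss(p_x, weight=lambda x: 1.0):
    """Exact expectation of weight(x) * loss under p_x(x) * p(y|x)."""
    total = 0.0
    for x, px in p_x.items():
        for y in (0, 1):
            py = P_Y1_GIVEN_X[x] if y == 1 else 1.0 - P_Y1_GIVEN_X[x]
            total += weight(x) * loss(y, predict(x)) * px * py
    return total


target = expected_loss(P_TEST)                 # E_{p'}[loss]
unweighted = expected_loss(P_TRAIN)            # E_{p}[loss]
reweighted = expected_loss(P_TRAIN, weight=lambda x: P_TEST[x] / P_TRAIN[x])
print(target, unweighted, reweighted)
```

The unweighted training expectation differs from the test expectation, while the reweighted one matches it up to floating-point error.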

Slide 17

Slide 17 text

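Interactive Learning
• Context: $x \sim P(x)$
• Policy: $\pi(x)$
• Action: $a = \pi(x)$
• Reward: $\delta(x, a)$
[Diagram: the System sends action $a$ to the User; steps ① to ④; the interactions are written to a log]

Logging:
Context     | Action | Reward
Time, UserX | A      | No
Time, UserY | B      | No
Time, UserX | C      | Yes
Time, UserZ | B      | No
Time, UserY | C      | No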
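The loop of context, action, reward, and logging can be sketched as follows. The mapping of the four numbered steps onto the code comments is an assumption, as are the function name `run_logging_loop`, the toy policy, and the reward rule.

```python
import random


def run_logging_loop(contexts, policy, reward_fn, n_rounds, seed=0):
    """Simulate the interaction loop and return the resulting log."""
    rng = random.Random(seed)
    log = []
    for t in range(1, n_rounds + 1):
        x = rng.choice(contexts)   # (1) a context x ~ P(x) arrives
        a = policy(x)              # (2) the system picks the action a = pi(x)
        r = reward_fn(x, a)        # (3) the user returns the reward delta(x, a)
        log.append((t, x, a, r))   # (4) the interaction is logged
    return log


# Hypothetical deterministic policy and reward, for illustration only.
ITEM_FOR_USER = {"UserX": "A", "UserY": "B", "UserZ": "C"}


def toy_policy(x):
    return ITEM_FOR_USER[x]


def toy_reward(x, a):
    return 1 if (x, a) == ("UserZ", "C") else 0


log = run_logging_loop(list(ITEM_FOR_USER), toy_policy, toy_reward, n_rounds=5)
for record in log:
    print(record)
```

Only the reward of the action actually taken is recorded. Rewards for all other actions are never observed, which is what makes the log counterfactual.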

Slide 18

Slide 18 text

Interactive Learning

        | Recommendation            | Contextual bandit | Reinforcement Learning
Context | User and Item Information | Context           | State
Action  | Item ID                   | Arm               | Action
Reward  | Click, Purchase           | Reward            | Reward

(See Saito's talk for details.)

Slide 19

Slide 19 text

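Causal Inference
• We want to measure the causal effect of administering a drug to a patient on whether the illness is cured three days later.
[Figure: patient: $x$; treatment: $t$; whether the illness is cured: $y$; two cases: drug not administered / drug administered]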

Slide 20

Slide 20 text

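Causal Inference
• In practice, only one of the two outcomes is ever observed. This is the fundamental problem of causal inference; the unobserved outcome is the counterfactual.
[Figure: patient: $x$; treatment: $t$; whether the illness is cured: $y$; two cases: drug not administered / drug administered, one of them marked as the counterfactual]
(See Yasui's talk for details.)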

Slide 21

Slide 21 text

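Relation to covariate shift
• Interactive Learning
  • The difference is whether the process generating $D$ includes an observation step.
  • The shift arises from whether or not a sample is observed through the user's interaction.
$$p(x) = q(o|x)\,p'(x), \qquad w(x_i) = \frac{1}{q(o|x_i)}$$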
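A small simulation illustrates the weight $w(x_i) = 1/q(o|x_i)$. The observation probabilities, the binary context, and the function name `simulate` are hypothetical. The sketch also assumes that $q(o|x)$ is known and that the total number of draws, including unobserved ones, is known.

```python
import random


def simulate(n=20000, seed=0):
    """Compare a naive mean with an inverse-observation-probability estimate."""
    rng = random.Random(seed)
    q_obs = {0: 0.9, 1: 0.3}       # q(o|x): chance that a draw gets logged
    observed = []
    for _ in range(n):
        x = rng.choice([0, 1])     # x ~ p'(x), the target distribution
        if rng.random() < q_obs[x]:
            observed.append(x)     # only observed draws reach the log

    def value(x):
        """Quantity whose mean under p'(x) we want (true mean is 0.5)."""
        return float(x)

    # Naive: average over the log, which over-represents x = 0.
    naive = sum(value(x) for x in observed) / len(observed)
    # Weighted: each logged draw counts 1 / q(o|x), normalised by all n draws.
    weighted = sum(value(x) / q_obs[x] for x in observed) / n
    return naive, weighted


naive, weighted = simulate()
print(naive, weighted)
```

The naive average is pulled toward the contexts that are observed more often, while the weighted estimate stays close to the true mean of 0.5.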

Slide 22

Slide 22 text

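Relation to covariate shift
• Causal Inference
  • The difference is whether the process generating $D$ includes a treatment.
  • The shift arises from arbitrary (non-random) assignment of the treatment.
$$p(x) = q(t|x)\,P(x), \qquad p'(x) = q(\neg t|x)\,P(x), \qquad w(x) = \frac{q(\neg t|x)}{q(t|x)}$$
Here $q(t|x)$ is the propensity score.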
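The weight $w(x) = q(\neg t|x)/q(t|x)$ can be checked on a toy population. The two patient groups, their treatment probabilities, and the function name `shift_weight` are hypothetical. The distributions are left unnormalized, exactly as in the formulas above.

```python
def shift_weight(propensity):
    """w(x) = q(not t | x) / q(t | x) for a propensity q(t | x) in (0, 1)."""
    if not 0.0 < propensity < 1.0:
        raise ValueError("propensity must lie strictly between 0 and 1")
    return (1.0 - propensity) / propensity


# Hypothetical population and treatment assignment probabilities.
P_X = {"young": 0.5, "old": 0.5}        # P(x)
Q_TREAT = {"young": 0.2, "old": 0.8}    # q(t|x): older patients treated more

treated = {x: Q_TREAT[x] * P_X[x] for x in P_X}              # p(x)
untreated = {x: (1.0 - Q_TREAT[x]) * P_X[x] for x in P_X}    # p'(x)

# Reweighting the treated group by w(x) reproduces the untreated covariates.
reweighted = {x: shift_weight(Q_TREAT[x]) * treated[x] for x in P_X}
print(untreated)
print(reweighted)
```

After reweighting, the treated group has the same covariate mix as the untreated group, so differences in outcome are no longer confounded by who was chosen for treatment (under the assumption that $x$ captures everything that drove the assignment).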

Slide 23

Slide 23 text

Research Trends

Slide 24

Slide 24 text

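Criteo Ad Placement Challenge
• A machine learning competition dealing with counterfactuals.
• Criteo, the world's largest DSP, released the dataset.
• The task is to maximize clicks by predicting ad placement.
• It introduced an evaluation metric that takes counterfactuals into account.
https://www.crowdai.org/challenges/nips-17-workshop-criteo-ad-placement-challenge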

Slide 25

Slide 25 text

Workshops
• KDD
  • Causal Discovery, 2018/2019
  • Offline and Online Evaluation of Interactive Systems, 2019
• NIPS
  • From ‘What if?’ To ‘What next?’, 2017
  • Causal Learning, 2018
• RecSys
  • REVEAL, 2018/2019
• ICML
  • FAIM’18 Workshop (CausalML)

Slide 26

Slide 26 text

Summary

Slide 27

Slide 27 text

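Summary
• Overview of Counterfactual Machine Learning
  • It is machine learning on data in which counterfactuals arise.
  • We explained that Interactive Learning and Causal Inference are sub-problems of covariate shift.
• Research trends
  • Datasets have been released through an advertising competition.
  • Workshops at major conferences were introduced.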

Slide 28

Slide 28 text

References
1. Home Page of Thorsten Joachims
   http://www.cs.cornell.edu/people/tj/
2. Counterfactual Reasoning and Learning Systems: The Example of Computational Advertising, Leon Bottou, et al., JMLR 2013
   https://www.microsoft.com/en-us/research/wp-content/uploads/2013/11/bottou13a.pdf
3. 非定常環境下での学習：共変量シフト適応，クラスバランス変化適応，変化検知 (Learning in non-stationary environments: covariate shift adaptation, class-balance change adaptation, and change detection)
   http://www.ms.k.u-tokyo.ac.jp/2014/NonstationarityReview-jp.pdf
4. Offline Evaluation and Optimization for Interactive Systems
   https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tutorial.pdf
Illustrations by Irasutoya (a collection of cute free illustrations)