Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Counterfactual learning to rank: introduction
Search
Daiki Tanaka
May 02, 2020
Research
0
670
Counterfactual learning to rank: introduction
一般的なランキング学習からcounterfactual LTRへの導入
Daiki Tanaka
May 02, 2020
Tweet
Share
More Decks by Daiki Tanaka
See All by Daiki Tanaka
カーネル法概観
daikitanak
0
510
カーネル法:正定値カーネルの理論
daikitanak
0
64
[Paper reading] L-SHAPLEY AND C-SHAPLEY: EFFICIENT MODEL INTERPRETATION FOR STRUCTURED DATA
daikitanak
1
180
[Paper Reading] Attention is All You Need
daikitanak
0
110
Interpretability of Machine Learning : Paper reading (LIME)
daikitanak
0
140
[Paper reading] Local Outlier Detection With Interpretation
daikitanak
0
63
Other Decks in Research
See All in Research
数理最適化と機械学習の融合
mickey_kubo
15
8.8k
When Submarine Cables Go Dark: Examining the Web Services Resilience Amid Global Internet Disruptions
irvin
0
200
Ad-DS Paper Circle #1
ykaneko1992
0
5.5k
近似動的計画入門
mickey_kubo
4
970
チャッドローン:LLMによる画像認識を用いた自律型ドローンシステムの開発と実験 / ec75-morisaki
yumulab
1
430
Google Agent Development Kit (ADK) 入門 🚀
mickey_kubo
2
990
研究テーマのデザインと研究遂行の方法論
hisashiishihara
5
1.4k
実行環境に中立なWebAssemblyライブマイグレーション機構/techtalk-2025spring
chikuwait
0
220
言語モデルによるAI創薬の進展 / Advancements in AI-Driven Drug Discovery Using Language Models
tsurubee
2
370
Adaptive Experimental Design for Efficient Average Treatment Effect Estimation and Treatment Choice
masakat0
0
130
CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations
satai
3
210
Mechanistic Interpretability:解釈可能性研究の新たな潮流
koshiro_aoki
1
280
Featured
See All Featured
GitHub's CSS Performance
jonrohan
1031
460k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
7
700
Rails Girls Zürich Keynote
gr2m
94
14k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
281
13k
Fireside Chat
paigeccino
37
3.5k
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
124
52k
The Art of Programming - Codeland 2020
erikaheidi
54
13k
The Illustrated Children's Guide to Kubernetes
chrisshort
48
50k
Raft: Consensus for Rubyists
vanstee
140
7k
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
53
2.8k
The Pragmatic Product Professional
lauravandoore
35
6.7k
Making the Leap to Tech Lead
cromwellryan
134
9.3k
Transcript
Unbiased Learning to Rank May 7, 2020
Learning to rank ઃఆ Supervised LTR Pointwise loss Pairwise loss
Listtwise loss Counterfactual Learning to Rank Counterfactual Evaluation Inverse Propensity Scoring Propensity-weighted Learning to Rank 2
Learning to rank: ઃఆ ೖྗɿ จॻͷू߹ D ग़ྗɿ จॻͷॱҐ R
= (R1; R2; R3:::) ͨͩ͠ɺ֤จॻʹϞσϧ f„ ʹΑͬͯείΞ͕͍͍ͭͯͯ f„ (R1) – f„ (R2) – f„ (R3) ::: ͱͳ͍ͬͯΔɻ(ߴ͍είΞ͕͚ΒΕΔ΄ͲॱҐ͕ߴ͍) Learning to Rank (LTR) ͷత࠷దͳॱҐΛग़ྗ͢ΔϞσϧ f„ ͷύϥϝʔλ „ Λ σʔλ͔ΒٻΊΔ͜ͱɻ 3
Supervised LTR ڭࢣ͋Γ LTR Ͱɺ › ݕࡧΫΤϦ › จॻू߹ ›
ॱҐͷϥϕϧ ΛؚΉσʔληοτΛͬͯϞσϧύϥϝʔλΛٻΊΔɻ ڭࢣ͋Γ LTR Ͱ༻͍ΒΕΔଛࣦओʹ 3 ͭɿ › Pointwise loss › Pairwise loss › Listwise loss y (d) ʹΑͬͯɺจॻ d ͷݕࡧΫΤϦͷؔ࿈Λද͢ͱ͢Δɻ(େ͖͍΄ͲॱҐͷ্Ґʹ ͖ͯཉ͍͠) 4
Pointwise loss Pointwise loss ɺॱҐͷਪఆΛྨɾճؼͱͯ͠ղ͘ɻྫ͑ɺ௨ৗͷճؼଛࣦ (squared loss) ͱͯ͠ҎԼͷΑ͏ʹ༩͑Δɿ Lpointwise :=
1 N N X i=1 (f„ (di) ` y (di))2 Pointwise loss ͷɺϞσϧͷग़ྗΛॱҐͱͯ͠͏͜ͱΛߟྀʹೖΕ͍ͯͳ͍͜ ͱɻLTR Ͱग़ྗͱͯ͠ಘΒΕΔείΞΛฒͼସ͑ͯಘΒΕΔॱҐʹͷΈؔ৺͕͋Δɻ 5
Pairwise loss Pairwise loss Ͱɺ2 ͭͷจॻؒͷ૬ରతͳείΞͷେখΛߟྀʹ͍ΕΔɻྫ͑ɺҎԼ ͷΑ͏ͳ hinge-loss Λ༩͑Δʀ Lpairwise
:= X y(di)>y(dj) max (0; 1 ` (f„ (di) ` f„ (di))): ॱҐ͕૬ରతʹߴ͍จॻείΞ͕ߴ͘ɺॱҐ͕͍จॻείΞΛ͘͢Δؾ࣋ͪɻ Pairwise loss ͷɺશͯͷهࣄϖΞΛಉ༷ʹѻ͏͜ͱɻ࣮ͦͯ͠༻্ top100 ͱ top10 ޙऀͷํ͕ॏࢹ͞ΕΔ͜ͱɻPairwise loss Ͱ top100 ͷԼͷํͷॱҐΛվળ ͤ͞ΔͨΊʹ্ҐͷॱҐΛ٘ਜ਼ʹ͢Δ͜ͱ͕͋Γ͑ͯ͠·͏ɻ 6
Listwise loss Listwise loss ͰॱҐࢦඪΛ࠷దԽ͢Δɻ՝ɺॱҐࢦඪ͕ඍՄೳͰͳ͍͜ͱɻ ྫ͑ɺDCG ɿ DCG = N
X i=1 y (di) log2 (rank (di) + 1) Ͱ͋Δ͕ɺlog2 (rank (di) + 1) ඍෆՄೳͰ͋Δɻ ͦͷͨΊʹ֬తۙࣅΛ༻͍Δํ๏ (ListNetɺListMLE) ɺώϡʔϦεςΟοΫॱҐ ࢦඪͷόϯυΛ࠷దԽ͢Δख๏͕͋Δɻ(LambdaRankɺLambdaLoss) ྫ͑ɺ LambdaRank ͷଛࣦ DCG ͷόϯυͱͳ͍ͬͯΔɿ LLambdaRank := X y(di)>y(dj) log (1 + exp (f„ (dj) ` f„ (di))) j´DCGj 7
ҼՌධՁ తɿ৽͍͠ϥϯΩϯάؔ f„ ΛɺผͷϥϯΩϯάؔ fdeploy ͷԼͰूΊΒΕͨաڈ ͷσʔλ (ΫϦοΫσʔλͳͲ) ΛͬͯධՁ͍ͨ͠ɻ ҎԼͷ
2 ͭͷ߹ʹ͍ͭͯߟ͑Δɻ › શͯͷจॻʹ͍ͭͯਅͷؔ࿈ y (di) ͕طͰ͋Δ࣌ › y (di) Θ͔Βͳ͍͕ɺΫϦοΫใͳͲͷ҉తͳϑΟʔυόοΫͷΈར༻Մೳͳ࣌ 8
ҼՌධՁɿϥϕϧ͕طͳΒશʹධՁ͕Ͱ͖Δ શͯͷจॻʹ͍ͭͯਅͷϥϕϧ y (di) ͕طͰ͋Δ࣌ɺIR(ใݕࡧ) ࢦඪΛܭࢉͰ͖Δɿ ´ (f„; D; y)
= X di2D – (rank (di j f„; D)) ´ y (di) ͜͜Ͱɺ– ॱҐॏΈ͚ؔͰ͋ͬͯɺྫ͑ɿ APR: – (r) = r DCG: – (r) = 1 log2 (1+r) ͳͲ͕༻͍ΒΕΔɻ 9
ҼՌධՁ y (di) Θ͔Βͳ͍͕ɺΫϦοΫใͳͲͷ҉తͳϑΟʔυόοΫͷΈར༻Մೳͳ࣌ɿ › ͋Δจॻʹର͢ΔΫϦοΫɺͦͷจॻ͕ؔ࿈͍ͯ͠Δ͜ͱΛࣔ͢ɺόΠΞεɾϊΠζ ͖ͭͷࢦඪʹͳ͍ͬͯΔɻ › ΫϦοΫ͞Εͳ͔͔ͬͨΒͱ͍ͬͯͦͷจॻ͕ؔͳ͍Θ͚Ͱͳ͍ɻ(จॻ͕ؔͳ ͍ɾϢʔβ͕จॻΛ؍ଌ͍ͯ͠ͳ͍ɾϥϯμϜཁૉʹΑΔͷ)
ଟ͘ͷ؍ଌσʔλʹ͍ͭͯฏۉΛऔΕϊΠζআڈͰ͖Δͱߟ͑ΒΕΔ͕ɺόΠΞεআ ڈͰ͖ͳ͍ɻ 10
ҼՌධՁɿ؍ଌɾΫϦοΫϞσϧ Ϣʔβͷ؍ଌٴͼจॻͷؔ࿈ͷΈΛߟྀʹೖΕΔͱɺϢʔβͷΫϦοΫҎԼͷΑ͏ʹϞ σϦϯάͰ͖ͦ͏ɿ › ϥϯΩϯά R ʹ͓͍ͯจॻ di ͕؍ଌ͞ΕΔ (oi
= 1 Ͱද͢) ֬ɺ P (oi = 1 j R; di) (؍ଌ͞ΕΔ֬ؔ࿈ʹؔͳ͍ͱԾఆ͍ͯ͠Δɻ) › ؔ࿈ y (di) ͱ؍ଌ oi ͕༩͑ΒΕͨ࣌ͷɺจॻ di ͕ΫϦοΫ͞ΕΔ֬ (ci = 1 Ͱද͢) ɺ P (ci = 1 j oi; y (di)) › ΫϦοΫ؍ଌ͞Εͨจॻʹ͔͠ى͜Βͳ͍ͨΊɺϥϯΩϯά R ʹ͓͍ͯΫϦοΫ͞ ΕΔ֬ɿ P (ci = 1 ^ oi = 1 j y (di) ; R) = P (ci = 1 j oi = 1; y (di)) ´ P (oi = 1 j R; di) 11
ҼՌධՁɿ´ (f„; D; y) ͷφΠʔϒਪఆ ´ (f„; D; y) ΛφΠʔϒʹਪఆ͢ΔʹɺΫϦοΫͷใ
(ci) Λਅͷؔ࿈ϥϕϧ (y (di)) ͷΘΓʹ͑Αͯ͘ɺ ´NAIVE (f„; D; c) := X di2D – (rank (di j f„; D)) ´ ci ͱͳΔɻ ΫϦοΫʹϊΠζ͕͍ͬͯͳ͍࣌ɺͭ·Γ P (ci = 1 j oi = 1; y (di)) = y (di) Ͱ͋Δ࣌Ͱ͑͞ɺφΠʔϒਪఆ؍ଌόΠΞεΛड͚͍ͯΔɿ Eo ˆ´NAIVE (f„; D; c)˜ = Eo 2 4 X di2D – (rank (di j f„; D)) ´ ci 3 5 = Eo 2 6 4 X di:oi=1^y(di)=1 – (rank (di j f„; D)) 3 7 5 = X di:y(di)=1 P (oi = 1 j R; di)– (rank (di j f„; D)) = X di2D P (oi = 1 j R; di)– (rank (di j f„; D)) ´ y (di) 12
ҼՌධՁɿ´ (f„; D; y) ͷφΠʔϒਪఆ φΠʔϒਪఆɿ Eo ˆ´NAIVE (f„; D;
c)˜ = X di:y(di)=1 P (oi = 1 j R; di)– (rank (di j f„; D)) ͰɺͦΕͧΕͷจॻͷɺϩάऩू࣌ͷϥϯΩϯά R Ͱͷ؍ଌ֬ͰॏΈͨ͠ਪఆʹͳͬ ͯ͠·͏ɻ ϥϯΩϯάͰɺߴॱҐͷจॻ΄Ͳ؍ଌ͞Ε͍͢ɿ͜ΕΛ position bias ͱݺͿɻϩάऩ ूͷࡍʹߴॱҐʹදࣔ͞Εͨจॻਅͷؔ࿈ΑΓؔ࿈͕͋ΔɺͱόΠΞεΛड͚ͯ͠· ͏ɻ όΠΞεΛআڈ͢ΔͨΊʹɺP (oi = 1 j R; di) Λਪఆ͠ɺิਖ਼ͯ͋͛͠Εྑͦ͞͏ ! είΞʹΑΔόΠΞεআڈ 13
είΞΛ༻͍ͨόΠΞεআڈ Inverse Propensity Scoring(IPS) ʹΑͬͯόΠΞεΛআڈ͢Δɿ ´IPS (f„; D; c) :=
X di2D – (rank (di j f„; D)) P (oi = 1 j R; di) ´ ci ͜͜ͰɺP (oi = 1 j R; di) ϩάऩूதʹදࣔ͞ΕͨϥϯΩϯά R Ͱจॻ di ͕؍ଌ͞ ΕΔ֬Ͱ͋Δɻ´IPS (f„; D; c) ΫϦοΫϊΠζ͕ͳ͍߹ɺͭ·Γ P (ci = 1 j oi = 1; y (di)) = y (di) Ͱ͋Δ࣌ʹ ´ (f„; D; y) ͷෆภਪఆྔͰ͋Δɿ Eo ˆ´IPS (f„; D; c)˜ = Eo 2 4 X di2D – (rank (di j f„; D)) P (oi = 1 j R; di) ´ ci 3 5 = Eo 2 6 4 X di:oi=1^y(di)=1 – (rank (di j f„; D)) P (oi = 1 j R; di) 3 7 5 = X di:y(di)=1 P (oi = 1 j R; di) ´ – (rank (di j f„; D)) P (oi = 1 j R; di) = X di2D – (rank (di j f„; D)) ´ y (di) = ´ (f„; D; y) : 14
Propensity-weighted LTR IPS ´ (f„; D; y) ͷෆภਪఆͰ͋ͬͨɻΑͬͯɺ࠷దͳϞσϧύϥϝʔλ „
IPS Λ ࠷దԽ͢Δ͜ͱͰٻΊΔ͜ͱ͕Ͱ͖ΔɻIPS Λ࠷దԽ͢ΔࡍɺϥϯΩϯάࢦඪ – (r) ͷඍ ෆՄೳੑʹରॲ͢ΔͨΊɺ– (r) ͷ bound Λར༻͢Δɻ Propensity-weighted LTR ͷྲྀΕɿ › ΫϦοΫͷείΞΛਪఆɿ P (oi = 1 j R; di) › ෆภਪఆྔ ´IPS (f„; D; c) ͷ bound ʹ͍ͭͯඍΛܭࢉɿ „0 = r„ "– (rank (di j f„; D)) P (oi = 1 j R; di) # › ϞσϧύϥϝʔλΛߋ৽ „new „old ` „0 15
References › https://ilps.github.io/webconf2020-tutorial-unbiased-ltr/ 16