Offline A/B testing for Recommender Systems
alpicola
November 20, 2018
Transcript
Offline A/B testing for Recommender Systems
Hatena, Tanaka (alpicola) @ paper reading group, 11/19

Offline A/B testing for Recommender Systems
- A WSDM'18 paper from Criteo
- Cited in Spotify's RecSys'18 paper
- Already presented at a reading group hosted by Cookpad
- Even so, I would like to take the chance to dig into it in more depth
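
Offline A/B testing?
- Online A/B tests cost time and money
- If a comparable evaluation could be run offline, the cycle of algorithm improvement would get faster
- But how accurate is it?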
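
Research on offline evaluation from logs
- Known as counterfactual estimation or off-policy estimation
- WSDM'15 tutorial
- SIGIR'16 tutorial
- Can be used not only for evaluation but also as an objective function for learning
- This paper deals with evaluation only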
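
Contributions of the paper
- Shows that NCIS, a reward estimator used for offline A/B testing, is optimal in a certain sense
- Building on this insight, proposes two extensions of NCIS: PieceNCIS and PointNCIS
- Correlation with online A/B test results improves substantially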
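
Setting
- Top-k ranking
- Log
- Context
- Action
- Reward
- Policy: selects an action given the context
- Current policy
- Test policy: the policy we want to test
- Average treatment effect: this is what we want to estimate
- Online A/B test: logs are collected under both the current policy and the test policy, and the expected reward of each is estimated by a sample mean
- Offline A/B test: the estimate is made from logs collected under the current policy only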
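
The formulas on these slides did not survive in the transcript. As a reference point, the target quantity and the online estimate can be written as below. All symbols here are assumed notation chosen for this sketch, not taken from the slides: x is a context, a an action, r a reward, \pi_p the current policy, \pi_t the test policy, and S_p, S_t the logs collected under each.

```latex
% Average treatment effect (assumed notation)
\Delta R(\pi_p, \pi_t) = \mathbb{E}_{\pi_t}[R] - \mathbb{E}_{\pi_p}[R]

% Online A/B test: difference of the two sample means
\widehat{\Delta R}_{\text{online}}
  = \frac{1}{|S_t|} \sum_{i \in S_t} r_i
  - \frac{1}{|S_p|} \sum_{i \in S_p} r_i
```

In the offline case only S_p exists, so the first sample mean has to be replaced by an estimator that reweights the records in S_p.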

Existing methods
- Importance sampling (IS)
- Normalized importance sampling (NIS)
- Doubly robust estimator (DR)
- Capped importance sampling (CIS)
- Normalized capped importance sampling (NCIS)
These come from the statistics and Monte Carlo literature.
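
Importance sampling (IS)
- Unbiased
- High variance caused by the importance weights, which are unbounded
- When the variance is large, the current policy and the test policy cannot be compared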
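
A minimal sketch of the IS estimate, assuming each log record stores the probability of the logged action under both policies. The record layout `(p_current, p_test, reward)` and the toy numbers are hypothetical, chosen only for illustration.

```python
def is_estimate(log):
    """Importance sampling estimate of the test policy's mean reward.

    Each record is (p_current, p_test, reward): the probability of the
    logged action under the current policy, under the test policy, and
    the observed reward. This layout is an assumption of the example.
    """
    total = 0.0
    for p_current, p_test, reward in log:
        weight = p_test / p_current  # unbounded as p_current approaches 0
        total += weight * reward
    return total / len(log)


# Toy log: the third record carries a large weight (0.4 / 0.1 = 4).
log = [(0.5, 0.25, 1.0), (0.5, 0.75, 0.0), (0.1, 0.4, 1.0), (0.9, 0.6, 0.0)]
estimate = is_estimate(log)
print(estimate)
```

A single record with a small `p_current` dominates the sum, which is where the high variance comes from.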

Normalized importance sampling (NIS)
Normalizes by the sum of the importance weights instead of the sample count.
- Becomes a consistent estimator
- Variance is still large
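
A minimal sketch of the normalized variant, using the standard self-normalized form (weighted rewards divided by the sum of weights). The record layout `(p_current, p_test, reward)` is a hypothetical choice for the example.

```python
def nis_estimate(log):
    """Normalized importance sampling estimate.

    Each record is (p_current, p_test, reward); this layout is an
    assumption of the example, not something given in the slides.
    """
    weighted_sum = 0.0
    weight_sum = 0.0
    for p_current, p_test, reward in log:
        weight = p_test / p_current
        weighted_sum += weight * reward
        weight_sum += weight
    # Dividing by the weight sum keeps the estimate inside the reward range.
    return weighted_sum / weight_sum


log = [(0.5, 0.25, 1.0), (0.5, 0.75, 0.0), (0.1, 0.4, 1.0), (0.9, 0.6, 0.0)]
nis = nis_estimate(log)
print(nis)
```

Because the result is a weighted average of observed rewards, it cannot leave the range of the rewards, unlike plain IS on the same log.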

Capped importance sampling (CIS)
- Max capping: cap every weight at a maximum value
- Zero capping: discard every term whose weight is at or above the threshold
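
The two capping rules can be sketched as follows. The threshold name `c`, the record layout `(p_current, p_test, reward)`, and the toy log are assumptions made for this example.

```python
def max_cap(weight, c):
    """Max capping: weights above c are replaced by c."""
    return min(weight, c)


def zero_cap(weight, c):
    """Zero capping: terms whose weight is c or above are dropped."""
    return weight if weight < c else 0.0


def cis_estimate(log, c, cap):
    """Capped importance sampling with a pluggable capping rule."""
    total = 0.0
    for p_current, p_test, reward in log:
        total += cap(p_test / p_current, c) * reward
    return total / len(log)


log = [(0.5, 0.25, 1.0), (0.5, 0.75, 0.0), (0.1, 0.4, 1.0), (0.9, 0.6, 0.0)]
cis_max = cis_estimate(log, 2.0, max_cap)
cis_zero = cis_estimate(log, 2.0, zero_cap)
print(cis_max, cis_zero)
```

On this log both capped estimates fall below the uncapped IS value, since the one large weight is either truncated or removed.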
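
Bias of CIS
- The bias is bounded by the expected reward over the region where the weights are capped
- We want to improve the policy so that it captures high-reward regions, but doing so makes the bias larger
- No setting of the capping gives a good trade-off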
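
The bound can be worked out in a few lines for max capping. This is a standard derivation written in assumed notation, not a copy of the slide formula: W = \pi_t(A|X) / \pi_p(A|X) is the importance weight, \bar{W} = \min(W, c) the capped weight, and the reward R is taken to be non-negative. It also assumes \pi_p puts positive probability on every action that \pi_t can choose.

```latex
% Expectation of the CIS estimate, rewritten under the test policy
\mathbb{E}_{\pi_p}[\bar{W} R] = \mathbb{E}_{\pi_t}\!\left[\frac{\bar{W}}{W}\, R\right]

% Amount by which CIS underestimates the true value
\mathbb{E}_{\pi_t}[R] - \mathbb{E}_{\pi_p}[\bar{W} R]
  = \mathbb{E}_{\pi_t}\!\left[\left(1 - \frac{\bar{W}}{W}\right) R\right]
  = \mathbb{E}_{\pi_t}\!\left[\frac{W - c}{W}\, R\, \mathbf{1}\{W > c\}\right]
  \le \mathbb{E}_{\pi_t}\!\left[R\, \mathbf{1}\{W > c\}\right]
```

The last line is the reward collected by the test policy where capping is active, so a test policy that shifts mass toward high-reward actions with large weights also raises the bound.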

Normalized capped importance sampling (NCIS)
Combines the ideas of NIS and CIS.
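
A minimal sketch of NCIS with max capping: capped weights in the numerator, and the sum of the same capped weights as the normalizer. The threshold `c`, the record layout `(p_current, p_test, reward)`, and the toy log are assumptions for the example.

```python
def ncis_estimate(log, c):
    """Normalized capped importance sampling with max capping.

    Each record is (p_current, p_test, reward); this layout is an
    assumption of the example.
    """
    weighted_sum = 0.0
    weight_sum = 0.0
    for p_current, p_test, reward in log:
        capped = min(p_test / p_current, c)  # CIS idea: cap the weight
        weighted_sum += capped * reward
        weight_sum += capped                 # NIS idea: normalize by the weights
    return weighted_sum / weight_sum


log = [(0.5, 0.25, 1.0), (0.5, 0.75, 0.0), (0.1, 0.4, 1.0), (0.9, 0.6, 0.0)]
ncis = ncis_estimate(log, 2.0)
print(ncis)
```

The normalizer shrinks together with the capped weights, which partly offsets the downward bias that capping alone introduces.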
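
Relationship between NCIS and CIS
- The second term can be seen as modeling the bias that CIS had
- In the zero capping case in particular: if […] holds, the bias vanishes asymptotically
- For example, when the reward depends only weakly on the action and the context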

Bias of NCIS
- When the reward is correlated with whether capping is applied, the bias becomes large
- User type is one possible confounder (Table 1)
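
Idea of the paper
- Move the bias modeling from global to local
- NCIS applied locally with respect to the context
- Reduces the correlation between the reward and capping
- Piecewise NCIS: NCIS on each region of a partition
- Pointwise NCIS: NCIS on each element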

Piecewise NCIS (PieceNCIS)
- Consider a partition of the set of contexts
- Apply NCIS to each part of the partition
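
Example of a partition
Fix a suitable function and use it to define the parts, chosen so that within each part the reward depends only weakly on the context.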
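
A sketch of the piecewise idea, under a stated assumption: the per-part NCIS estimates are combined by weighting each part with its share of the logged records. The slide formula is not in the transcript, so this combination rule, the `part` label, and the record layout `(part, p_current, p_test, reward)` are all hypothetical.

```python
from collections import defaultdict


def piece_ncis_estimate(log, c):
    """NCIS computed separately on each part, then averaged by part size.

    Each record is (part, p_current, p_test, reward), where `part` labels
    the region of the context partition the record falls in (assumed layout).
    """
    weighted = defaultdict(float)
    weights = defaultdict(float)
    counts = defaultdict(int)
    for part, p_current, p_test, reward in log:
        capped = min(p_test / p_current, c)  # max capping
        weighted[part] += capped * reward
        weights[part] += capped
        counts[part] += 1

    n = len(log)
    estimate = 0.0
    for part, count in counts.items():
        if weights[part] > 0.0:  # a part with zero total weight adds nothing
            estimate += (count / n) * weighted[part] / weights[part]
    return estimate


log = [
    ("a", 0.5, 0.25, 1.0), ("a", 0.5, 0.75, 0.0),
    ("b", 0.1, 0.4, 1.0), ("b", 0.9, 0.6, 0.0),
]
piece = piece_ncis_estimate(log, 2.0)
print(piece)
```

Normalizing inside each part means heavy capping in one part no longer rescales the estimate for the others.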
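
Pointwise NCIS (PointNCIS)
- Partition at the level of single elements (i.e. […])
- Any one context has very few samples, so NCIS cannot be applied directly
- The normalization term for a context can be computed exactly by marginalizing over actions
- With many actions that computation is expensive
- So it is computed by sampling instead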
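
A sketch of the exact (marginalized) version for a small action set. It assumes the pointwise estimator divides each capped weight by the expected capped weight for that context under the current policy. That form, and the dictionary layout of the policies, are assumptions for this example since the slide formulas are missing.

```python
def capped_weight(p_current, p_test, c):
    """Importance weight with max capping at c."""
    return min(p_test / p_current, c)


def expected_capped_weight(context, pi_current, pi_test, c):
    """Exact normalizer for one context: sum over actions of
    pi_current(a|x) * capped weight of (a, x)."""
    return sum(
        p_cur * capped_weight(p_cur, pi_test[context][action], c)
        for action, p_cur in pi_current[context].items()
        if p_cur > 0.0
    )


def point_ncis_estimate(log, pi_current, pi_test, c):
    """log records are (context, action, reward) (assumed layout)."""
    total = 0.0
    for context, action, reward in log:
        if reward == 0.0:
            continue  # zero-reward records add nothing to the sum
        w = capped_weight(pi_current[context][action], pi_test[context][action], c)
        total += w * reward / expected_capped_weight(context, pi_current, pi_test, c)
    return total / len(log)


pi_current = {"u1": {"a": 0.5, "b": 0.5}}
pi_test = {"u1": {"a": 0.9, "b": 0.1}}
log = [("u1", "a", 1.0), ("u1", "b", 0.0)]
point = point_ncis_estimate(log, pi_current, pi_test, 1.5)
print(point)
```

The loop in `expected_capped_weight` visits every action, which is the cost that becomes prohibitive for large action sets and motivates a sampling-based normalizer.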

Midzuno-Sen method
1. Sample […]
2. Sample […] from […] until one satisfying […] is obtained
3. Sample […] from […]
4. Return […]
The value obtained this way is written […].

Pointwise NCIS (PointNCIS)
- Records in the log whose reward is zero can be ignored
- Efficient to compute when rewards are sparse
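
Experiments
- Proprietary datasets
- 39 of them, hundreds of billions of log records in total
- Click-based reward (sparse and high variance)
- Methods compared: CIS, NCIS, PieceNCIS, PointNCIS ([…])
- IS and NIS are excluded because their variance is too high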

Correlation between online and offline A/B tests

Precision and false negative rate
Treat the result as a binary prediction of "is the test policy better than the current policy?"
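
One way to compute these metrics, assuming each past A/B test is summarized by a pair of estimated effects and that a positive value means the test policy is better. Reducing the online outcome to a sign is a simplification for this sketch. The function name and the data are hypothetical.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or None when the rate is undefined."""
    return numerator / denominator if denominator else None


def ab_prediction_metrics(pairs):
    """pairs holds (offline_delta, online_delta) per A/B test.

    The offline sign is the prediction and the online sign is the truth.
    """
    tp = fp = fn = tn = 0
    for offline_delta, online_delta in pairs:
        predicted = offline_delta > 0
        actual = online_delta > 0
        if predicted and actual:
            tp += 1
        elif predicted:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    return {
        "precision": _ratio(tp, tp + fp),
        "false_negative_rate": _ratio(fn, fn + tp),
        "false_positive_rate": _ratio(fp, fp + tn),
    }


pairs = [(0.2, 0.1), (0.1, -0.3), (-0.1, 0.2), (-0.2, -0.1), (0.3, 0.4)]
metrics = ab_prediction_metrics(pairs)
print(metrics)
```

The false positive rate is included alongside the two metrics named on the slide because it uses the same confusion counts.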
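
Summary of experimental results
- CIS correlates negatively with the online results
- Its estimates came out low overall (Figure 4)
- Large improvement from CIS to NCIS
- From NCIS to PointNCIS the false positive rate drops further
- Precision does not improve much beyond NCIS
- PointNCIS is the best in both execution speed and accuracy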

Appendix
- When […] is small, […] can exceed the capping
- With max capping, a new capping […] can be chosen such that […] (Lemma A.3)