Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Enhanced EC Recommendations: Trustworthy Valida...
Search
LINE Developers Taiwan
PRO
September 23, 2024
Technology
0
66
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model
Event: iThome Hello World Dev Conference
Speaker: Dan Chen
LINE Developers Taiwan
PRO
September 23, 2024
Tweet
Share
More Decks by LINE Developers Taiwan
See All by LINE Developers Taiwan
Live Activities in LINE
line_developers_tw
PRO
0
9
Neumorphism x Liquid Glass
line_developers_tw
PRO
0
10
猜你喜歡 – 打造高度擴展的個人化電商推薦
line_developers_tw
PRO
0
22
打造新電商搜尋體驗- 搜尋意圖辨識
line_developers_tw
PRO
0
7
比價群組
line_developers_tw
PRO
0
11
從混亂到優雅,讓專案不再失控:ATDD 與 Clean Architecture 的後端實戰之路
line_developers_tw
PRO
0
10
2049智能共存:透過LINE Bot Agent迎接後人類時代
line_developers_tw
PRO
0
39
菸酒生在 LINE Taiwan 的後端雙刀流
line_developers_tw
PRO
0
1.4k
讓測試不再 BB! 從 BDD 到 CI/CD, 不靠人力也能 MVP
line_developers_tw
PRO
0
1.5k
Other Decks in Technology
See All in Technology
250905 大吉祥寺.pm 2025 前夜祭 「プログラミングに出会って20年、『今』が1番楽しい」
msykd
PRO
1
890
自作JSエンジンに推しプロポーザルを実装したい!
sajikix
1
180
Rustから学ぶ 非同期処理の仕組み
skanehira
1
140
COVESA VSSによる車両データモデルの標準化とAWS IoT FleetWiseの活用
osawa
1
280
Autonomous Database - Dedicated 技術詳細 / adb-d_technical_detail_jp
oracle4engineer
PRO
4
10k
TS-S205_昨年対比2倍以上の機能追加を実現するデータ基盤プロジェクトでのAI活用について
kaz3284
1
130
【初心者向け】ローカルLLMの色々な動かし方まとめ
aratako
7
3.4k
なぜスクラムはこうなったのか?歴史が教えてくれたこと/Shall we explore the roots of Scrum
sanogemaru
5
1.6k
💡Ruby 川辺で灯すPicoRubyからの光
bash0c7
0
110
ZOZOマッチのアーキテクチャと技術構成
zozotech
PRO
4
1.5k
AIエージェント開発用SDKとローカルLLMをLINE Botと組み合わせてみた / LINEを使ったLT大会 #14
you
PRO
0
120
EncryptedSharedPreferences が deprecated になっちゃった!どうしよう! / Oh no! EncryptedSharedPreferences has been deprecated! What should I do?
yanzm
0
320
Featured
See All Featured
Fireside Chat
paigeccino
39
3.6k
What's in a price? How to price your products and services
michaelherold
246
12k
A Tale of Four Properties
chriscoyier
160
23k
The Power of CSS Pseudo Elements
geoffreycrofte
77
6k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
15
1.6k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
367
27k
Rails Girls Zürich Keynote
gr2m
95
14k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
112
20k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
49
3k
CSS Pre-Processors: Stylus, Less & Sass
bermonpainter
358
30k
Building Better People: How to give real-time feedback that sticks.
wjessup
368
19k
The Invisible Side of Design
smashingmag
301
51k
Transcript
None
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for
Two-Tower Model EC Data Dev / Data Scientists Dan Chen
Dan LINE Taiwan EC Dev - Data Scientis Work Experience
Side Project
01 02 03 04 Evaluation Framework Offline & Online Evaluation
LLM on Recommendation What is Trustworthy 05 Q&A CONTENT
Why it’s so important 01 What is Trustworthy
Element of trustworthy 特點項目文字 特點項目 Trustworthy 特點項目文字 特點項目 特點項目文字 特點項目
Four Perspective 特點項目文字 特點項目 Trustworthy Recommendation 特點項目文字 特點項目 特點項目文字 特點項目
Data Preparation Data Representation Recommendation Generation Performance Evaluation
How to Correctly Evaluate AI 02 Evaluation Framework
Two - Stage Recommendation system Brickmaster Scalable Scenario-wise KPI -
Oriented Trustworthy
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to Correctly Evaluate AI 03 Offline & Online Evaluation
Key point to show how your algorithms can contribute to
your business Offline Evaluation
Key point to show how your algorithms can contribute to
your business Online Evaluation
Avoid pitfalls In Practice If experiment isn’t’ significant ?? Sample
ratio mismatch ?? Novelty effect ?? Key point to show how your algorithms can contribute to your business A/B test
Case – EC Shop recommendation
04 LLM On Recommendation
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Evaluate & Challenge 05 Conclusion
Conclusion Business Value OpenAI, Claude, Gemini XGBoost or OpenSource 來源:https://zh.wikipedia.org/zh-
tw/%E7%BE%8E%E5%9C%8B%E9%9A%8A%E9%95%B72%EF%BC%9A%E9%85%B7%E5%AF%9 2%E6%88%B0%E5%A3%AB 來源:https://images.app.goo.gl/HCygtJVtoPaU2KgX6
Conclusion & Challenge 1. Data Quality 2. Multiple – Metrics
evaluation 3. Conduct A/B test Experiment 4. Human Perception Evaluation Challenge
Q&A 聯絡資訊 (Linkedin – Dan Chen)
None
None