Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Enhanced EC Recommendations: Trustworthy Valida...
Search
LINE Developers Taiwan
PRO
September 23, 2024
Technology
0
59
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model
Event: iThome Hello World Dev Conference
Speaker: Dan Chen
LINE Developers Taiwan
PRO
September 23, 2024
Tweet
Share
More Decks by LINE Developers Taiwan
See All by LINE Developers Taiwan
菸酒生在 LINE Taiwan 的後端雙刀流
line_developers_tw
PRO
0
1.1k
讓測試不再 BB! 從 BDD 到 CI/CD, 不靠人力也能 MVP
line_developers_tw
PRO
0
1.1k
DB 醬,嗨!哪泥嘎斯基?
line_developers_tw
PRO
0
1.1k
比起獨自升級 我更喜歡 DevOps 文化 <3
line_developers_tw
PRO
0
1.1k
工具人的一生: 開發很多 AI 工具讓我 慵懶過一生
line_developers_tw
PRO
0
1.1k
從四件事帶你見識見識 事件驅動架構設計 (EDA)
line_developers_tw
PRO
0
1k
TODAY 看世界(?) 是我們在看扣啦!
line_developers_tw
PRO
0
1.1k
你想成為什麼樣的開發者?
line_developers_tw
PRO
0
25
研究生的 LINER生活
line_developers_tw
PRO
0
27
Other Decks in Technology
See All in Technology
PHPでWebブラウザのレンダリングエンジンを実装する
dip_tech
PRO
0
200
20250625 Snowflake Summit 2025活用事例 レポート / Nowcast Snowflake Summit 2025 Case Study Report
kkuv
1
310
【TiDB GAME DAY 2025】Shadowverse: Worlds Beyond にみる TiDB 活用術
cygames
0
1.1k
セキュリティの民主化は何故必要なのか_AWS WAF 運用の 10 の苦悩から学ぶ
yoh
1
150
「Chatwork」の認証基盤の移行とログ活用によるプロダクト改善
kubell_hr
1
160
TechLION vol.41~MySQLユーザ会のほうから来ました / techlion41_mysql
sakaik
0
180
Windows 11 で AWS Documentation MCP Server 接続実践/practical-aws-documentation-mcp-server-connection-on-windows-11
emiki
0
970
Snowflake Summit 2025全体振り返り / Snowflake Summit 2025 Overall Review
mtpooh
2
400
Lambda Web Adapterについて自分なりに理解してみた
smt7174
3
110
How Community Opened Global Doors
hiroramos4
PRO
1
120
AIの最新技術&テーマをつまんで紹介&フリートークするシリーズ #1 量子機械学習の入門
tkhresk
0
140
AWS Summit Japan 2025 Community Stage - App workflow automation by AWS Step Functions
matsuihidetoshi
1
270
Featured
See All Featured
How to Ace a Technical Interview
jacobian
277
23k
Typedesign – Prime Four
hannesfritz
42
2.7k
The Invisible Side of Design
smashingmag
299
51k
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
124
52k
For a Future-Friendly Web
brad_frost
179
9.8k
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
48
5.4k
Building an army of robots
kneath
306
45k
Fantastic passwords and where to find them - at NoRuKo
philnash
51
3.3k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
138
34k
RailsConf 2023
tenderlove
30
1.1k
Done Done
chrislema
184
16k
[Rails World 2023 - Day 1 Closing Keynote] - The Magic of Rails
eileencodes
35
2.3k
Transcript
None
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for
Two-Tower Model EC Data Dev / Data Scientists Dan Chen
Dan LINE Taiwan EC Dev - Data Scientis Work Experience
Side Project
01 02 03 04 Evaluation Framework Offline & Online Evaluation
LLM on Recommendation What is Trustworthy 05 Q&A CONTENT
Why it’s so important 01 What is Trustworthy
Element of trustworthy 特點項目文字 特點項目 Trustworthy 特點項目文字 特點項目 特點項目文字 特點項目
Four Perspective 特點項目文字 特點項目 Trustworthy Recommendation 特點項目文字 特點項目 特點項目文字 特點項目
Data Preparation Data Representation Recommendation Generation Performance Evaluation
How to Correctly Evaluate AI 02 Evaluation Framework
Two - Stage Recommendation system Brickmaster Scalable Scenario-wise KPI -
Oriented Trustworthy
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to Correctly Evaluate AI 03 Offline & Online Evaluation
Key point to show how your algorithms can contribute to
your business Offline Evaluation
Key point to show how your algorithms can contribute to
your business Online Evaluation
Avoid pitfalls In Practice If experiment isn’t’ significant ?? Sample
ratio mismatch ?? Novelty effect ?? Key point to show how your algorithms can contribute to your business A/B test
Case – EC Shop recommendation
04 LLM On Recommendation
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Evaluate & Challenge 05 Conclusion
Conclusion Business Value OpenAI, Claude, Gemini XGBoost or OpenSource 來源:https://zh.wikipedia.org/zh-
tw/%E7%BE%8E%E5%9C%8B%E9%9A%8A%E9%95%B72%EF%BC%9A%E9%85%B7%E5%AF%9 2%E6%88%B0%E5%A3%AB 來源:https://images.app.goo.gl/HCygtJVtoPaU2KgX6
Conclusion & Challenge 1. Data Quality 2. Multiple – Metrics
evaluation 3. Conduct A/B test Experiment 4. Human Perception Evaluation Challenge
Q&A 聯絡資訊 (Linkedin – Dan Chen)
None
None