LINE Developers Taiwan
September 23, 2024
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model
Event: iThome Hello World Dev Conference
Speaker: Dan Chen
Transcript
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model
EC Data Dev / Data Scientists, Dan Chen
Dan, LINE Taiwan EC Dev - Data Scientist. Work Experience
Side Project
CONTENT
01 What is Trustworthy
02 Evaluation Framework
03 Offline & Online Evaluation
04 LLM on Recommendation
05 Q&A
01 What is Trustworthy: Why it’s so important
Elements of Trustworthy
Four Perspectives of Trustworthy Recommendation
Data Preparation Data Representation Recommendation Generation Performance Evaluation
02 Evaluation Framework: How to Correctly Evaluate AI
Two-Stage Recommendation System: Brickmaster
Scalable, Scenario-wise, KPI-Oriented, Trustworthy
Evaluation Framework (1/2): How to truly and comprehensively understand performance
03 Offline & Online Evaluation: How to Correctly Evaluate AI
Offline Evaluation: Key points for showing how your algorithms can contribute to your business
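The slide does not say which offline metrics were used. As a minimal sketch, assuming top-K ranking metrics such as Recall@K and NDCG@K (an assumption, not taken from the talk), an offline evaluation of one user's recommendation list might look like the following. The function names and sample item IDs are hypothetical.

```python
import math


def recall_at_k(recommended, relevant, k):
    """Share of the relevant items that appear in the top-k recommendations.

    Assumes `recommended` is an ordered list without duplicates.
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = len(set(recommended[:k]) & relevant)
    return hits / len(relevant)


def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG@k: rewards placing relevant items near the top."""
    relevant = set(relevant)
    # Discounted gain of each hit; rank 0 gets discount 1 / log2(2) = 1.
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    # Ideal ordering puts every relevant item first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical example: shops recommended to one user vs. shops they visited.
recommended_shops = ["shop_a", "shop_b", "shop_c"]
visited_shops = {"shop_a", "shop_c", "shop_d"}
print(recall_at_k(recommended_shops, visited_shops, k=2))
print(ndcg_at_k(recommended_shops, visited_shops, k=3))
```

In practice these per-user scores would be averaged over a held-out set of users before being compared across models.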
Online Evaluation: Key points for showing how your algorithms can contribute to your business
A/B test: Key points for showing how your algorithms can contribute to your business
Avoid pitfalls in practice: What if the experiment isn’t significant? Sample ratio mismatch? Novelty effect?
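Sample ratio mismatch (SRM) is commonly checked with a chi-square goodness-of-fit test on the group sizes. The sketch below is one way to do that, not the speaker's implementation. The function name, the 50/50 expected split, and the 0.001 alert threshold are all assumptions chosen for illustration.

```python
import math


def srm_check(n_control, n_treatment, expected_ratio=0.5, alpha=0.001):
    """Chi-square goodness-of-fit test for sample ratio mismatch.

    expected_ratio is the planned share of traffic in the control group.
    Returns (chi2, p_value, mismatch_detected).
    """
    total = n_control + n_treatment
    expected_control = total * expected_ratio
    expected_treatment = total * (1.0 - expected_ratio)
    chi2 = (
        (n_control - expected_control) ** 2 / expected_control
        + (n_treatment - expected_treatment) ** 2 / expected_treatment
    )
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value, p_value < alpha


# Hypothetical user counts for a planned 50/50 split.
chi2, p_value, mismatch = srm_check(5000, 5500)
print(f"chi2={chi2:.2f}, p={p_value:.2e}, SRM detected: {mismatch}")
```

If the check flags a mismatch, the usual reading is that assignment or logging is broken and the experiment's metric comparison should not be trusted until the cause is found.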
Case – EC Shop recommendation
04 LLM On Recommendation
Recommendation with LLM
- Feature Engineering: Text embedding generation
- How to evaluate embedding (probing): RankMe / α-ReQ Metrics
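RankMe is usually defined as the exponential of the Shannon entropy of an embedding matrix's normalized singular values, so it estimates how many dimensions the embeddings effectively use. The sketch below follows that definition and takes the singular values as input to stay dependency-free. The function name and sample values are hypothetical, and in practice the singular values would come from an SVD of the (items × dimensions) embedding matrix, for example via `numpy.linalg.svd`.

```python
import math


def rankme(singular_values, eps=1e-7):
    """Effective rank: exp(entropy) of the normalized singular values.

    singular_values: non-negative singular values of the embedding matrix.
    eps: small constant added to each normalized value for numerical stability.
    """
    total = sum(singular_values)
    if total <= 0:
        raise ValueError("singular values must have a positive sum")
    probs = [s / total + eps for s in singular_values]
    entropy = -sum(p * math.log(p) for p in probs)
    return math.exp(entropy)


# Hypothetical spectra: one spread evenly, one collapsed onto a single direction.
print(rankme([1.0, 1.0, 1.0, 1.0]))  # close to 4: all dimensions are used
print(rankme([1.0, 0.0, 0.0, 0.0]))  # close to 1: embeddings have collapsed
```

A higher score suggests the text embeddings spread information across more dimensions. The score can be computed without downstream labels, which makes it convenient for comparing embedding models before training the recommender.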
05 Conclusion: Evaluate & Challenge
Conclusion: Business Value
OpenAI, Claude, Gemini
XGBoost or OpenSource
Source: https://zh.wikipedia.org/zh-tw/%E7%BE%8E%E5%9C%8B%E9%9A%8A%E9%95%B72%EF%BC%9A%E9%85%B7%E5%AF%92%E6%88%B0%E5%A3%AB
Source: https://images.app.goo.gl/HCygtJVtoPaU2KgX6
Conclusion & Challenge
1. Data Quality
2. Multiple-Metrics evaluation
3. Conduct A/B test Experiment
4. Human Perception Evaluation Challenge
Q&A
Contact information (LinkedIn: Dan Chen)