LINE Developers Taiwan
September 23, 2024
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model
Event: iThome Hello World Dev Conference
Speaker: Dan Chen
Transcript
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model
EC Data Dev / Data Scientists, Dan Chen
Dan: LINE Taiwan EC Dev, Data Scientist. Work Experience. Side Project.
CONTENT
01 What is Trustworthy
02 Evaluation Framework
03 Offline & Online Evaluation
04 LLM on Recommendation
05 Q&A
01 What is Trustworthy: Why it's so important
Elements of Trustworthy
Four Perspectives of Trustworthy Recommendation
Data Preparation, Data Representation, Recommendation Generation, Performance Evaluation
02 Evaluation Framework: How to Correctly Evaluate AI
Two-Stage Recommendation System, Brickmaster: Scalable, Scenario-wise, KPI-Oriented, Trustworthy
Evaluation Framework (1/2): How to truly and comprehensively understand performance
03 Offline & Online Evaluation: How to Correctly Evaluate AI
Offline Evaluation: Key points to show how your algorithms can contribute to your business
Online Evaluation: Key points to show how your algorithms can contribute to your business
A/B test: Key points to show how your algorithms can contribute to your business
Avoid pitfalls in practice: What if the experiment isn't significant? Sample ratio mismatch? Novelty effect?
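The slide names sample ratio mismatch (SRM) as an A/B test pitfall without showing how it was checked. One common way to detect it is a chi-square goodness-of-fit test on the assignment counts. The sketch below is an illustration under assumptions, not the speaker's method: the function name `srm_check`, the 50/50 expected split, and the 0.001 alert threshold are all hypothetical choices.

```python
import math


def srm_check(n_control, n_treatment, expected_ratio=0.5, alpha=0.001):
    """Chi-square goodness-of-fit test for sample ratio mismatch.

    expected_ratio is the planned share of traffic in the control group.
    alpha is an assumed alert threshold, not a value from the talk.
    Returns (chi2 statistic, p-value, flagged).
    """
    total = n_control + n_treatment
    expected_control = total * expected_ratio
    expected_treatment = total * (1.0 - expected_ratio)

    # Sum of (observed - expected)^2 / expected over the two groups.
    chi2 = ((n_control - expected_control) ** 2 / expected_control
            + (n_treatment - expected_treatment) ** 2 / expected_treatment)

    # With two groups there is 1 degree of freedom, and the chi-square
    # survival function reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value, p_value < alpha


# Hypothetical counts: a planned 50/50 split that came out 5000 vs 5500.
chi2, p_value, flagged = srm_check(5000, 5500)
print(f"chi2={chi2:.2f}, p={p_value:.2e}, SRM suspected: {flagged}")
```

If the check flags a mismatch, the metric comparison between groups is usually not trusted until the cause of the uneven assignment (for example logging loss or a bucketing bug) is found.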
Case – EC Shop recommendation
04 LLM On Recommendation
Recommendation with LLM
- Feature Engineering: Text embedding generation
- How to evaluate embedding (probing): RankMe / α-ReQ Metrics
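The slide names RankMe as one way to probe embedding quality but gives no formula. As commonly described, RankMe is the exponential of the Shannon entropy of the normalized singular values of the embedding matrix, which gives a smooth "effective rank". The sketch below follows that description. The function name `rankme`, the `eps` value, and the use of NumPy are assumptions for illustration, not details from the talk.

```python
import numpy as np


def rankme(embeddings, eps=1e-7):
    """Effective rank of an (n_samples, dim) embedding matrix.

    Computes exp(entropy) of the singular values after L1 normalization.
    eps is an assumed small constant that keeps log() finite when a
    singular value is zero.
    """
    # Singular values only; the singular vectors are not needed.
    singular_values = np.linalg.svd(np.asarray(embeddings, dtype=float),
                                    compute_uv=False)
    # Normalize so the values behave like a probability distribution.
    p = singular_values / np.sum(singular_values) + eps
    entropy = -np.sum(p * np.log(p))
    return float(np.exp(entropy))


# Hypothetical data: 1000 random 64-dimensional text embeddings.
rng = np.random.default_rng(0)
random_embeddings = rng.normal(size=(1000, 64))
print(f"RankMe of random embeddings: {rankme(random_embeddings):.1f}")
```

The score ranges from about 1 (all embeddings lie on one direction, a collapsed representation) up to the smaller of the sample count and the embedding dimension (variance spread evenly across directions), so it can be compared across candidate embedding models without labels.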
05 Conclusion: Evaluate & Challenge
Conclusion: Business Value. OpenAI, Claude, Gemini; XGBoost or open source.
Source: https://zh.wikipedia.org/zh-tw/%E7%BE%8E%E5%9C%8B%E9%9A%8A%E9%95%B72%EF%BC%9A%E9%85%B7%E5%AF%92%E6%88%B0%E5%A3%AB
Source: https://images.app.goo.gl/HCygtJVtoPaU2KgX6
Conclusion & Challenge
1. Data Quality
2. Multiple-Metrics Evaluation
3. Conduct A/B Test Experiment
4. Human Perception Evaluation
Q&A. Contact: LinkedIn, Dan Chen