LINE Developers Taiwan
September 23, 2024
Technology
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model
Event: iThome Hello World Dev Conference
Speaker: Dan Chen
Transcript
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model
EC Data Dev / Data Scientists, Dan Chen
Dan, LINE Taiwan EC Dev, Data Scientist
Work Experience
Side Project
CONTENT
01 What is Trustworthy
02 Evaluation Framework
03 Offline & Online Evaluation
04 LLM on Recommendation
05 Q&A
01 What is Trustworthy: Why it's so important
Elements of Trustworthy
Four Perspectives of Trustworthy Recommendation
Data Preparation, Data Representation, Recommendation Generation, Performance Evaluation
02 Evaluation Framework: How to Correctly Evaluate AI
Two-Stage Recommendation System: Brickmaster
Scalable, Scenario-wise, KPI-Oriented, Trustworthy
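The slide names a two-stage recommendation system without giving its internals. A minimal sketch of the general pattern, assuming a two-tower style dot-product retrieval stage followed by a separate ranking stage, could look like the following. The names `retrieve`, `rank`, `item_vecs`, and `ranking_score`, along with the toy vectors and scores, are hypothetical illustrations and are not taken from the talk or from Brickmaster.

```python
from typing import Dict, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(user_vec: Vector, item_vecs: Dict[str, Vector], k: int) -> List[str]:
    """Stage 1 (assumed): keep the k items whose embedding has the
    highest dot product with the user embedding."""
    by_score = sorted(
        item_vecs, key=lambda item: dot(user_vec, item_vecs[item]), reverse=True
    )
    return by_score[:k]


def rank(candidates: List[str], ranking_score: Dict[str, float]) -> List[str]:
    """Stage 2 (assumed): reorder the retrieved candidates with a finer score,
    e.g. the output of a heavier ranking model."""
    return sorted(candidates, key=lambda item: ranking_score.get(item, 0.0), reverse=True)


# Toy data, purely illustrative.
user_vec = [1.0, 0.0]
item_vecs = {"shop_a": [0.9, 0.1], "shop_b": [0.2, 0.8], "shop_c": [0.7, 0.3]}
ranking_score = {"shop_a": 0.2, "shop_b": 0.9, "shop_c": 0.6}

candidates = retrieve(user_vec, item_vecs, k=2)
final = rank(candidates, ranking_score)
print(candidates, final)
```

Splitting the work this way keeps the cheap similarity search over the full catalogue separate from the more expensive scoring of a short candidate list.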
Evaluation Framework (1/2): How to truly and comprehensively understand performance
03 Offline & Online Evaluation: How to Correctly Evaluate AI
Offline Evaluation: Key points that show how your algorithms can contribute to your business
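The transcript does not say which offline metrics the talk uses. As an illustration only, two metrics commonly reported for recommenders, Recall@K and NDCG@K with binary relevance, can be sketched as below. The function names and toy lists are assumptions, and the code assumes the recommended list contains no duplicates.

```python
import math
from typing import List, Set


def recall_at_k(recommended: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(recommended: List[str], relevant: Set[str], k: int) -> float:
    """NDCG@k with binary relevance: discounted gain of the hits in the top-k,
    divided by the gain of an ideal ordering."""
    dcg = sum(
        1.0 / math.log2(position + 2)
        for position, item in enumerate(recommended[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(position + 2) for position in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Toy example: items "a" and "d" were actually clicked or purchased.
recommended = ["a", "b", "c", "d"]
relevant = {"a", "d"}

print(recall_at_k(recommended, relevant, k=2))
print(ndcg_at_k(recommended, relevant, k=4))
```

Reporting more than one metric side by side matches the "multiple metrics" point in the talk's conclusion, since recall ignores ordering while NDCG rewards placing hits near the top.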
Online Evaluation: Key points that show how your algorithms can contribute to your business
A/B test: Key points that show how your algorithms can contribute to your business
Avoid pitfalls in practice: What if the experiment isn't significant? Sample ratio mismatch? Novelty effect?
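Two of the pitfalls listed above can be checked numerically. The sketch below is one common approach, not necessarily the one used in the talk: a chi-square goodness-of-fit test (one degree of freedom) for sample ratio mismatch, and a pooled two-proportion z-test for significance. The function names, the 50/50 default split, and the example counts are assumptions.

```python
import math


def srm_p_value(n_control: int, n_treatment: int, expected_ratio: float = 0.5) -> float:
    """P-value of a chi-square test (1 degree of freedom) that the observed
    group sizes match the planned split. A very small value suggests
    sample ratio mismatch."""
    total = n_control + n_treatment
    expected_control = total * expected_ratio
    expected_treatment = total * (1.0 - expected_ratio)
    chi2 = (
        (n_control - expected_control) ** 2 / expected_control
        + (n_treatment - expected_treatment) ** 2 / expected_treatment
    )
    # Survival function of chi-square with 1 dof: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def two_proportion_p_value(conv_c: int, n_c: int, conv_t: int, n_t: int) -> float:
    """Two-sided p-value of a pooled two-proportion z-test on conversion rates."""
    pooled = (conv_c + conv_t) / (n_c + n_t)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_c + 1.0 / n_t))
    if se == 0.0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (conv_t / n_t - conv_c / n_c) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical counts: check the traffic split first, then the metric.
print(srm_p_value(5000, 5500))
print(two_proportion_p_value(100, 1000, 150, 1000))
```

Running the split check before reading the metric is a reasonable order, because a mismatched split usually points to an assignment or logging bug that invalidates the comparison. The novelty effect is not covered here; it is typically examined by looking at how the treatment effect changes over the duration of the experiment.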
Case – EC Shop recommendation
04 LLM On Recommendation
Recommendation with LLM
- Feature Engineering: Text embedding generation
- How to evaluate embedding (probing): RankMe / α-ReQ Metrics
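Both metrics named on this slide are computed from the spectrum of the embedding matrix. A minimal sketch with NumPy follows, assuming the usual definitions: RankMe as the exponential of the entropy of the normalized singular values, and α-ReQ as the power-law decay exponent of the covariance eigenvalues. The function names are hypothetical, and the α estimate here is a plain least-squares fit in log-log space over the whole spectrum, which may differ from the fitting procedure used in the talk or in the original metric.

```python
import numpy as np


def rankme(embeddings: np.ndarray, eps: float = 1e-7) -> float:
    """Effective rank: exp of the Shannon entropy of the singular values
    after L1 normalization. Ranges from about 1 up to min(n_samples, dim)."""
    singular_values = np.linalg.svd(embeddings, compute_uv=False)
    p = singular_values / singular_values.sum() + eps
    return float(np.exp(-np.sum(p * np.log(p))))


def alpha_req(embeddings: np.ndarray) -> float:
    """Decay exponent alpha, assuming eigenvalue_i is proportional to i ** (-alpha).
    Simplified estimate: slope of a straight-line fit in log-log space."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    covariance = centered.T @ centered / (embeddings.shape[0] - 1)
    eigenvalues = np.linalg.eigvalsh(covariance)[::-1]  # descending order
    eigenvalues = eigenvalues[eigenvalues > 1e-12]      # drop numerical zeros
    ranks = np.arange(1, len(eigenvalues) + 1)
    slope, _intercept = np.polyfit(np.log(ranks), np.log(eigenvalues), 1)
    return float(-slope)


# Toy embeddings standing in for LLM text embeddings of product titles.
rng = np.random.default_rng(0)
toy_embeddings = rng.normal(size=(200, 8))

print(rankme(toy_embeddings))
print(alpha_req(toy_embeddings))
```

Neither function needs labels, which is what makes this kind of probing usable before any downstream recommendation model is trained.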
05 Conclusion: Evaluate & Challenge
Conclusion: Business Value
OpenAI, Claude, Gemini, XGBoost or Open Source
Source: https://zh.wikipedia.org/zh-tw/%E7%BE%8E%E5%9C%8B%E9%9A%8A%E9%95%B72%EF%BC%9A%E9%85%B7%E5%AF%92%E6%88%B0%E5%A3%AB
Source: https://images.app.goo.gl/HCygtJVtoPaU2KgX6
Conclusion & Challenge
1. Data Quality
2. Multiple-Metrics Evaluation
3. Conduct A/B Test Experiments
4. Human Perception Evaluation
Q&A
Contact information (LinkedIn: Dan Chen)