Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Enhanced EC Recommendations: Trustworthy Valida...
Search
LINE Developers Taiwan
PRO
September 23, 2024
Technology
0
66
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model
Event: iThome Hello World Dev Conference
Speaker: Dan Chen
LINE Developers Taiwan
PRO
September 23, 2024
Tweet
Share
More Decks by LINE Developers Taiwan
See All by LINE Developers Taiwan
從混亂到優雅,讓專案不再失控:ATDD 與 Clean Architecture 的後端實戰之路
line_developers_tw
PRO
0
7
2049智能共存:透過LINE Bot Agent迎接後人類時代
line_developers_tw
PRO
0
35
菸酒生在 LINE Taiwan 的後端雙刀流
line_developers_tw
PRO
0
1.4k
讓測試不再 BB! 從 BDD 到 CI/CD, 不靠人力也能 MVP
line_developers_tw
PRO
0
1.4k
DB 醬,嗨!哪泥嘎斯基?
line_developers_tw
PRO
0
1.4k
比起獨自升級 我更喜歡 DevOps 文化 <3
line_developers_tw
PRO
0
1.4k
工具人的一生: 開發很多 AI 工具讓我 慵懶過一生
line_developers_tw
PRO
0
1.4k
從四件事帶你見識見識 事件驅動架構設計 (EDA)
line_developers_tw
PRO
0
1.3k
TODAY 看世界(?) 是我們在看扣啦!
line_developers_tw
PRO
0
1.4k
Other Decks in Technology
See All in Technology
結局QUICで通信は速くなるの?
kota_yata
9
7.5k
LLMエージェント時代に適応した開発フロー
hiragram
1
360
マイクロモビリティシェアサービスを支える プラットフォームアーキテクチャ
grimoh
1
160
GCASアップデート(202506-202508)
techniczna
0
240
信頼できる開発プラットフォームをどう作るか?-Governance as Codeと継続的監視/フィードバックが導くPlatform Engineeringの進め方
yuriemori
1
400
Observability for LLM Application lifecycle
ivry_presentationmaterials
1
230
Rethinking Incident Response: Context-Aware AI in Practice - Incident Buddy Edition -
rrreeeyyy
0
130
AIとTDDによるNext.js「隙間ツール」開発の実践
makotot
4
130
人を動かすことについて考える
ichimichi
2
300
[CVPR2025論文読み会] Linguistics-aware Masked Image Modelingfor Self-supervised Scene Text Recognition
s_aiueo32
0
210
ZOZOTOWNフロントエンドにおけるディレクトリの分割戦略
zozotech
PRO
13
4.2k
はじめての転職講座/The Guide of First Career Change
kwappa
5
4.5k
Featured
See All Featured
Imperfection Machines: The Place of Print at Facebook
scottboms
268
13k
Adopting Sorbet at Scale
ufuk
77
9.5k
The Illustrated Children's Guide to Kubernetes
chrisshort
48
50k
Fantastic passwords and where to find them - at NoRuKo
philnash
51
3.4k
Gamification - CAS2011
davidbonilla
81
5.4k
Templates, Plugins, & Blocks: Oh My! Creating the theme that thinks of everything
marktimemedia
31
2.5k
It's Worth the Effort
3n
187
28k
GraphQLとの向き合い方2022年版
quramy
49
14k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
15
1.6k
Balancing Empowerment & Direction
lara
2
580
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
333
22k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
367
26k
Transcript
None
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for
Two-Tower Model EC Data Dev / Data Scientists Dan Chen
Dan LINE Taiwan EC Dev - Data Scientis Work Experience
Side Project
01 02 03 04 Evaluation Framework Offline & Online Evaluation
LLM on Recommendation What is Trustworthy 05 Q&A CONTENT
Why it’s so important 01 What is Trustworthy
Element of trustworthy 特點項目文字 特點項目 Trustworthy 特點項目文字 特點項目 特點項目文字 特點項目
Four Perspective 特點項目文字 特點項目 Trustworthy Recommendation 特點項目文字 特點項目 特點項目文字 特點項目
Data Preparation Data Representation Recommendation Generation Performance Evaluation
How to Correctly Evaluate AI 02 Evaluation Framework
Two - Stage Recommendation system Brickmaster Scalable Scenario-wise KPI -
Oriented Trustworthy
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to Correctly Evaluate AI 03 Offline & Online Evaluation
Key point to show how your algorithms can contribute to
your business Offline Evaluation
Key point to show how your algorithms can contribute to
your business Online Evaluation
Avoid pitfalls In Practice If experiment isn’t’ significant ?? Sample
ratio mismatch ?? Novelty effect ?? Key point to show how your algorithms can contribute to your business A/B test
Case – EC Shop recommendation
04 LLM On Recommendation
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Evaluate & Challenge 05 Conclusion
Conclusion Business Value OpenAI, Claude, Gemini XGBoost or OpenSource 來源:https://zh.wikipedia.org/zh-
tw/%E7%BE%8E%E5%9C%8B%E9%9A%8A%E9%95%B72%EF%BC%9A%E9%85%B7%E5%AF%9 2%E6%88%B0%E5%A3%AB 來源:https://images.app.goo.gl/HCygtJVtoPaU2KgX6
Conclusion & Challenge 1. Data Quality 2. Multiple – Metrics
evaluation 3. Conduct A/B test Experiment 4. Human Perception Evaluation Challenge
Q&A 聯絡資訊 (Linkedin – Dan Chen)
None
None