Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Enhanced EC Recommendations: Trustworthy Valida...
Search
LINE Developers Taiwan
PRO
September 23, 2024
Technology
0
12
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model
Event: iThome Hello World Dev Conference
Speaker: Dan Chen
LINE Developers Taiwan
PRO
September 23, 2024
Tweet
Share
More Decks by LINE Developers Taiwan
See All by LINE Developers Taiwan
如何在團隊發揮數據影響力: 以電商資料科學家為例
line_developers_tw
PRO
1
27
做Data超讚的 誰懂?
line_developers_tw
PRO
0
14
iOS Live Activity: Opportunities & Challenges
line_developers_tw
PRO
1
84
掌握 Feature Toggle 與 OpenFeature 規範
line_developers_tw
PRO
0
160
用 AI 和 LINE Bot 簡化生活:讓圖片告訴你何時該忙!-- LINE 工作坊
line_developers_tw
PRO
0
610
Scaling The E-Commerce Recommendation System
line_developers_tw
PRO
0
28
揭秘LLMOps: 讓LLM服務像火箭 般穩定高效的祕密!
line_developers_tw
PRO
0
69
ML Life Cycle for LINE SHOPPING Recommender
line_developers_tw
PRO
0
17
Review AI from LINE EC NLP
line_developers_tw
PRO
0
15
Other Decks in Technology
See All in Technology
初心者向けAWS Securityの勉強会mini Security-JAWSを9ヶ月ぐらい実施してきての近況
cmusudakeisuke
0
120
複雑なState管理からの脱却
sansantech
PRO
1
140
AGIについてChatGPTに聞いてみた
blueb
0
130
[FOSS4G 2024 Japan LT] LLMを使ってGISデータ解析を自動化したい!
nssv
1
210
BLADE: An Attempt to Automate Penetration Testing Using Autonomous AI Agents
bbrbbq
0
290
RubyのWebアプリケーションを50倍速くする方法 / How to Make a Ruby Web Application 50 Times Faster
hogelog
3
940
OCI Network Firewall 概要
oracle4engineer
PRO
0
4.1k
SREが投資するAIOps ~ペアーズにおけるLLM for Developerへの取り組み~
takumiogawa
1
100
ドメインの本質を掴む / Get the essence of the domain
sinsoku
2
150
Platform Engineering for Software Developers and Architects
syntasso
1
510
AWS Lambdaと歩んだ“サーバーレス”と今後 #lambda_10years
yoshidashingo
1
170
Terraform未経験の御様に対してどの ように導⼊を進めていったか
tkikuchi
2
430
Featured
See All Featured
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
29
2.3k
For a Future-Friendly Web
brad_frost
175
9.4k
10 Git Anti Patterns You Should be Aware of
lemiorhan
654
59k
Learning to Love Humans: Emotional Interface Design
aarron
273
40k
Fantastic passwords and where to find them - at NoRuKo
philnash
50
2.9k
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
47
5k
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
169
50k
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
PRO
10
720
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
159
15k
Gamification - CAS2011
davidbonilla
80
5k
Become a Pro
speakerdeck
PRO
25
5k
XXLCSS - How to scale CSS and keep your sanity
sugarenia
246
1.3M
Transcript
None
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for
Two-Tower Model EC Data Dev / Data Scientists Dan Chen
Dan LINE Taiwan EC Dev - Data Scientis Work Experience
Side Project
01 02 03 04 Evaluation Framework Offline & Online Evaluation
LLM on Recommendation What is Trustworthy 05 Q&A CONTENT
Why it’s so important 01 What is Trustworthy
Element of trustworthy 特點項目文字 特點項目 Trustworthy 特點項目文字 特點項目 特點項目文字 特點項目
Four Perspective 特點項目文字 特點項目 Trustworthy Recommendation 特點項目文字 特點項目 特點項目文字 特點項目
Data Preparation Data Representation Recommendation Generation Performance Evaluation
How to Correctly Evaluate AI 02 Evaluation Framework
Two - Stage Recommendation system Brickmaster Scalable Scenario-wise KPI -
Oriented Trustworthy
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to Correctly Evaluate AI 03 Offline & Online Evaluation
Key point to show how your algorithms can contribute to
your business Offline Evaluation
Key point to show how your algorithms can contribute to
your business Online Evaluation
Avoid pitfalls In Practice If experiment isn’t’ significant ?? Sample
ratio mismatch ?? Novelty effect ?? Key point to show how your algorithms can contribute to your business A/B test
Case – EC Shop recommendation
04 LLM On Recommendation
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Evaluate & Challenge 05 Conclusion
Conclusion Business Value OpenAI, Claude, Gemini XGBoost or OpenSource 來源:https://zh.wikipedia.org/zh-
tw/%E7%BE%8E%E5%9C%8B%E9%9A%8A%E9%95%B72%EF%BC%9A%E9%85%B7%E5%AF%9 2%E6%88%B0%E5%A3%AB 來源:https://images.app.goo.gl/HCygtJVtoPaU2KgX6
Conclusion & Challenge 1. Data Quality 2. Multiple – Metrics
evaluation 3. Conduct A/B test Experiment 4. Human Perception Evaluation Challenge
Q&A 聯絡資訊 (Linkedin – Dan Chen)
None
None