Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Enhanced EC Recommendations: Trustworthy Valida...
Search
LINE Developers Taiwan
PRO
September 23, 2024
Technology
0
34
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model
Event: iThome Hello World Dev Conference
Speaker: Dan Chen
LINE Developers Taiwan
PRO
September 23, 2024
Tweet
Share
More Decks by LINE Developers Taiwan
See All by LINE Developers Taiwan
LINE 實習分享 & 國際黑客松參賽分享
line_developers_tw
PRO
0
20
在 GCP 運用 Parse 全家餐管理那堆 AI 應用的資料
line_developers_tw
PRO
0
24
40歲的我會給20歲的自己,關於軟體開發的7個建議
line_developers_tw
PRO
0
7.6k
從零到一:轉碼仔的實習攻略
line_developers_tw
PRO
0
35
如何在團隊發揮數據影響力: 以電商資料科學家為例
line_developers_tw
PRO
1
47
做Data超讚的 誰懂?
line_developers_tw
PRO
0
35
iOS Live Activity: Opportunities & Challenges
line_developers_tw
PRO
1
120
掌握 Feature Toggle 與 OpenFeature 規範
line_developers_tw
PRO
0
240
用 AI 和 LINE Bot 簡化生活:讓圖片告訴你何時該忙!-- LINE 工作坊
line_developers_tw
PRO
0
770
Other Decks in Technology
See All in Technology
Goで作って学ぶWebSocket
ryuichi1208
3
1.8k
プロダクトエンジニア 360°フィードバックを実施した話
hacomono
PRO
0
100
N=1から解き明かすAWS ソリューションアーキテクトの魅力
kiiwami
0
130
AI エージェント開発を支える MaaS としての Azure AI Foundry
ryohtaka
6
590
7日間でハッキングをはじめる本をはじめてみませんか?_ITエンジニア本大賞2025
nomizone
2
1.9k
トラシューアニマルになろう ~開発者だからこそできる、安定したサービス作りの秘訣~
jacopen
2
2k
CZII - CryoET Object Identification 参加振り返り・解法共有
tattaka
0
390
依存パッケージの更新はコツコツが勝つコツ! / phpcon_nagoya2025
blue_goheimochi
2
130
2024.02.19 W&B AIエージェントLT会 / AIエージェントが業務を代行するための計画と実行 / Algomatic 宮脇
smiyawaki0820
14
3.7k
自動テストの世界に、この5年間で起きたこと
autifyhq
10
8.7k
全文検索+セマンティックランカー+LLMの自然文検索サ−ビスで得られた知見
segavvy
2
120
白金鉱業Meetup Vol.17_あるデータサイエンティストのデータマネジメントとの向き合い方
brainpadpr
6
790
Featured
See All Featured
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
29
2.4k
ReactJS: Keep Simple. Everything can be a component!
pedronauck
666
120k
The World Runs on Bad Software
bkeepers
PRO
67
11k
What’s in a name? Adding method to the madness
productmarketing
PRO
22
3.3k
Speed Design
sergeychernyshev
27
790
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
133
33k
Fight the Zombie Pattern Library - RWD Summit 2016
marcelosomers
233
17k
Making Projects Easy
brettharned
116
6k
Agile that works and the tools we love
rasmusluckow
328
21k
Keith and Marios Guide to Fast Websites
keithpitt
411
22k
A Philosophy of Restraint
colly
203
16k
Large-scale JavaScript Application Architecture
addyosmani
511
110k
Transcript
None
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for
Two-Tower Model EC Data Dev / Data Scientists Dan Chen
Dan LINE Taiwan EC Dev - Data Scientis Work Experience
Side Project
01 02 03 04 Evaluation Framework Offline & Online Evaluation
LLM on Recommendation What is Trustworthy 05 Q&A CONTENT
Why it’s so important 01 What is Trustworthy
Element of trustworthy 特點項目文字 特點項目 Trustworthy 特點項目文字 特點項目 特點項目文字 特點項目
Four Perspective 特點項目文字 特點項目 Trustworthy Recommendation 特點項目文字 特點項目 特點項目文字 特點項目
Data Preparation Data Representation Recommendation Generation Performance Evaluation
How to Correctly Evaluate AI 02 Evaluation Framework
Two - Stage Recommendation system Brickmaster Scalable Scenario-wise KPI -
Oriented Trustworthy
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to Correctly Evaluate AI 03 Offline & Online Evaluation
Key point to show how your algorithms can contribute to
your business Offline Evaluation
Key point to show how your algorithms can contribute to
your business Online Evaluation
Avoid pitfalls In Practice If experiment isn’t’ significant ?? Sample
ratio mismatch ?? Novelty effect ?? Key point to show how your algorithms can contribute to your business A/B test
Case – EC Shop recommendation
04 LLM On Recommendation
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Evaluate & Challenge 05 Conclusion
Conclusion Business Value OpenAI, Claude, Gemini XGBoost or OpenSource 來源:https://zh.wikipedia.org/zh-
tw/%E7%BE%8E%E5%9C%8B%E9%9A%8A%E9%95%B72%EF%BC%9A%E9%85%B7%E5%AF%9 2%E6%88%B0%E5%A3%AB 來源:https://images.app.goo.gl/HCygtJVtoPaU2KgX6
Conclusion & Challenge 1. Data Quality 2. Multiple – Metrics
evaluation 3. Conduct A/B test Experiment 4. Human Perception Evaluation Challenge
Q&A 聯絡資訊 (Linkedin – Dan Chen)
None
None