Upgrade to PRO for Only $50/Year—Limited-Time Offer! 🔥
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Enhanced EC Recommendations: Trustworthy Valida...
Search
LINE Developers Taiwan
PRO
September 23, 2024
Technology
0
68
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model
Event: iThome Hello World Dev Conference
Speaker: Dan Chen
LINE Developers Taiwan
PRO
September 23, 2024
Tweet
Share
More Decks by LINE Developers Taiwan
See All by LINE Developers Taiwan
NTUAI企業參訪
line_developers_tw
PRO
0
960
Data TECH FRESH企業參訪- Amber
line_developers_tw
PRO
0
1.1k
Data Team 實習分享
line_developers_tw
PRO
0
2.8k
Backend Intern之旅
line_developers_tw
PRO
0
5.5k
清大企業參訪- Ben
line_developers_tw
PRO
0
1.3k
LLM 商品規格萃取大冒險- Vila
line_developers_tw
PRO
0
1.2k
Playwright/MCP/AI -Winter
line_developers_tw
PRO
0
1.2k
LINE EC Product Catalog Development- Rei
line_developers_tw
PRO
0
1.2k
LINE 與 AI 機器人技術應用現況
line_developers_tw
PRO
0
18
Other Decks in Technology
See All in Technology
AWS re:Invent 2025で見たGrafana最新機能の紹介
hamadakoji
0
380
プロンプトやエージェントを自動的に作る方法
shibuiwilliam
10
7.7k
.NET 10の概要
tomokusaba
0
110
Edge AI Performance on Zephyr Pico vs. Pico 2
iotengineer22
0
150
AWS Security Agentの紹介/introducing-aws-security-agent
tomoki10
0
240
mairuでつくるクレデンシャルレス開発環境 / Credential-less development environment using Mailru
mirakui
5
480
エンジニアリングをやめたくないので問い続ける
estie
2
1.2k
Reinforcement Fine-tuning 基礎〜実践まで
ch6noota
0
180
Power of Kiro : あなたの㌔はパワステ搭載ですか?
r3_yamauchi
PRO
0
150
【AWS re:Invent 2025速報】AIビルダー向けアップデートをまとめて解説!
minorun365
4
520
AWSを使う上で最低限知っておきたいセキュリティ研修を社内で実施した話 ~みんなでやるセキュリティ~
maimyyym
2
1.1k
今年のデータ・ML系アップデートと気になるアプデのご紹介
nayuts
1
380
Featured
See All Featured
Exploring the Power of Turbo Streams & Action Cable | RailsConf2023
kevinliebholz
36
6.2k
[RailsConf 2023 Opening Keynote] The Magic of Rails
eileencodes
31
9.8k
Build your cross-platform service in a week with App Engine
jlugia
234
18k
Bash Introduction
62gerente
615
210k
Designing Dashboards & Data Visualisations in Web Apps
destraynor
231
54k
Measuring & Analyzing Core Web Vitals
bluesmoon
9
710
Building a Scalable Design System with Sketch
lauravandoore
463
34k
Faster Mobile Websites
deanohume
310
31k
Fireside Chat
paigeccino
41
3.7k
Code Reviewing Like a Champion
maltzj
527
40k
Producing Creativity
orderedlist
PRO
348
40k
What’s in a name? Adding method to the madness
productmarketing
PRO
24
3.8k
Transcript
None
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for
Two-Tower Model EC Data Dev / Data Scientists Dan Chen
Dan LINE Taiwan EC Dev - Data Scientis Work Experience
Side Project
01 02 03 04 Evaluation Framework Offline & Online Evaluation
LLM on Recommendation What is Trustworthy 05 Q&A CONTENT
Why it’s so important 01 What is Trustworthy
Element of trustworthy 特點項目文字 特點項目 Trustworthy 特點項目文字 特點項目 特點項目文字 特點項目
Four Perspective 特點項目文字 特點項目 Trustworthy Recommendation 特點項目文字 特點項目 特點項目文字 特點項目
Data Preparation Data Representation Recommendation Generation Performance Evaluation
How to Correctly Evaluate AI 02 Evaluation Framework
Two - Stage Recommendation system Brickmaster Scalable Scenario-wise KPI -
Oriented Trustworthy
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to Correctly Evaluate AI 03 Offline & Online Evaluation
Key point to show how your algorithms can contribute to
your business Offline Evaluation
Key point to show how your algorithms can contribute to
your business Online Evaluation
Avoid pitfalls In Practice If experiment isn’t’ significant ?? Sample
ratio mismatch ?? Novelty effect ?? Key point to show how your algorithms can contribute to your business A/B test
Case – EC Shop recommendation
04 LLM On Recommendation
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Evaluate & Challenge 05 Conclusion
Conclusion Business Value OpenAI, Claude, Gemini XGBoost or OpenSource 來源:https://zh.wikipedia.org/zh-
tw/%E7%BE%8E%E5%9C%8B%E9%9A%8A%E9%95%B72%EF%BC%9A%E9%85%B7%E5%AF%9 2%E6%88%B0%E5%A3%AB 來源:https://images.app.goo.gl/HCygtJVtoPaU2KgX6
Conclusion & Challenge 1. Data Quality 2. Multiple – Metrics
evaluation 3. Conduct A/B test Experiment 4. Human Perception Evaluation Challenge
Q&A 聯絡資訊 (Linkedin – Dan Chen)
None
None