Upgrade to PRO for Only $50/Year—Limited-Time Offer! 🔥
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Enhanced EC Recommendations: Trustworthy Valida...
Search
LINE Developers Taiwan
PRO
September 23, 2024
Technology
0
68
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model
Event: iThome Hello World Dev Conference
Speaker: Dan Chen
LINE Developers Taiwan
PRO
September 23, 2024
Tweet
Share
More Decks by LINE Developers Taiwan
See All by LINE Developers Taiwan
NTUAI企業參訪
line_developers_tw
PRO
0
1.1k
Data TECH FRESH企業參訪- Amber
line_developers_tw
PRO
0
2k
Data Team 實習分享
line_developers_tw
PRO
0
3.1k
Backend Intern之旅
line_developers_tw
PRO
0
5.8k
清大企業參訪- Ben
line_developers_tw
PRO
0
1.4k
LLM 商品規格萃取大冒險- Vila
line_developers_tw
PRO
0
1.3k
Playwright/MCP/AI -Winter
line_developers_tw
PRO
0
1.3k
LINE EC Product Catalog Development- Rei
line_developers_tw
PRO
0
1.3k
LINE 與 AI 機器人技術應用現況
line_developers_tw
PRO
0
19
Other Decks in Technology
See All in Technology
AIの長期記憶と短期記憶の違いについてAgentCoreを例に深掘ってみた
yakumo
4
400
JEDAI認定プログラム JEDAI Order 2026 エントリーのご案内 / JEDAI Order 2026 Entry
databricksjapan
0
130
会社紹介資料 / Sansan Company Profile
sansan33
PRO
11
390k
AIプラットフォームにおけるMLflowの利用について
lycorptech_jp
PRO
1
170
Jakarta Agentic AI Specification - Status and Future
reza_rahman
0
110
Snowflakeでデータ基盤を もう一度作り直すなら / rebuilding-data-platform-with-snowflake
pei0804
6
1.6k
.NET 10の概要
tomokusaba
0
110
大企業でもできる!ボトムアップで拡大させるプラットフォームの作り方
findy_eventslides
1
820
2025年 開発生産「可能」性向上報告 サイロ解消からチームが能動性を獲得するまで/ 20251216 Naoki Takahashi
shift_evolve
PRO
1
200
Lambdaの常識はどう変わる?!re:Invent 2025 before after
iwatatomoya
1
620
[デモです] NotebookLM で作ったスライドの例
kongmingstrap
0
160
SREには開発組織全体で向き合う
koh_naga
0
360
Featured
See All Featured
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
141
34k
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
PRO
196
70k
GraphQLとの向き合い方2022年版
quramy
50
14k
Git: the NoSQL Database
bkeepers
PRO
432
66k
How to Think Like a Performance Engineer
csswizardry
28
2.4k
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
35
2.3k
Designing Experiences People Love
moore
143
24k
Building Adaptive Systems
keathley
44
2.9k
The Cost Of JavaScript in 2023
addyosmani
55
9.4k
Context Engineering - Making Every Token Count
addyosmani
9
520
Measuring & Analyzing Core Web Vitals
bluesmoon
9
710
KATA
mclloyd
PRO
33
15k
Transcript
None
Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for
Two-Tower Model EC Data Dev / Data Scientists Dan Chen
Dan LINE Taiwan EC Dev - Data Scientis Work Experience
Side Project
01 02 03 04 Evaluation Framework Offline & Online Evaluation
LLM on Recommendation What is Trustworthy 05 Q&A CONTENT
Why it’s so important 01 What is Trustworthy
Element of trustworthy 特點項目文字 特點項目 Trustworthy 特點項目文字 特點項目 特點項目文字 特點項目
Four Perspective 特點項目文字 特點項目 Trustworthy Recommendation 特點項目文字 特點項目 特點項目文字 特點項目
Data Preparation Data Representation Recommendation Generation Performance Evaluation
How to Correctly Evaluate AI 02 Evaluation Framework
Two - Stage Recommendation system Brickmaster Scalable Scenario-wise KPI -
Oriented Trustworthy
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to truly comprehensive understand performance Evaluation Framework (1/2)
How to Correctly Evaluate AI 03 Offline & Online Evaluation
Key point to show how your algorithms can contribute to
your business Offline Evaluation
Key point to show how your algorithms can contribute to
your business Online Evaluation
Avoid pitfalls In Practice If experiment isn’t’ significant ?? Sample
ratio mismatch ?? Novelty effect ?? Key point to show how your algorithms can contribute to your business A/B test
Case – EC Shop recommendation
04 LLM On Recommendation
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Recommendation with LLM - Feature Engineering: Text embedding generation -
How to evaluate embedding (probing): RankMe / α-ReQ Metrincs
Evaluate & Challenge 05 Conclusion
Conclusion Business Value OpenAI, Claude, Gemini XGBoost or OpenSource 來源:https://zh.wikipedia.org/zh-
tw/%E7%BE%8E%E5%9C%8B%E9%9A%8A%E9%95%B72%EF%BC%9A%E9%85%B7%E5%AF%9 2%E6%88%B0%E5%A3%AB 來源:https://images.app.goo.gl/HCygtJVtoPaU2KgX6
Conclusion & Challenge 1. Data Quality 2. Multiple – Metrics
evaluation 3. Conduct A/B test Experiment 4. Human Perception Evaluation Challenge
Q&A 聯絡資訊 (Linkedin – Dan Chen)
None
None