Slide 1

Slide 1 text

No content

Slide 2

Slide 2 text

Enhanced EC Recommendations: Trustworthy Validation with Large Language Models for Two-Tower Model
EC Data Dev / Data Scientists
Dan Chen

Slide 3

Slide 3 text

Dan
LINE Taiwan EC Dev - Data Scientist
Work Experience
Side Project

Slide 4

Slide 4 text

CONTENT
01 What is Trustworthy
02 Evaluation Framework
03 Offline & Online Evaluation
04 LLM on Recommendation
05 Q&A

Slide 5

Slide 5 text

01 What is Trustworthy
Why it's so important

Slide 6

Slide 6 text

Elements of Trustworthiness

Slide 7

Slide 7 text

Four Perspectives of Trustworthy Recommendation
Data Preparation
Data Representation
Recommendation Generation
Performance Evaluation

Slide 8

Slide 8 text

02 Evaluation Framework
How to Correctly Evaluate AI

Slide 9

Slide 9 text

Two-Stage Recommendation System: Brickmaster
Scalable
Scenario-wise
KPI-Oriented
Trustworthy
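As a rough illustration of what "two-stage" usually means with a two-tower model, the sketch below retrieves candidates by embedding dot product and then reorders them with a second score. It is a minimal sketch under stated assumptions, not a description of Brickmaster: the function names `retrieve` and `rerank`, the random embeddings, and the random ranking scores are all hypothetical stand-ins.

```python
import numpy as np

def retrieve(user_vec, item_vecs, top_n):
    """Stage 1: score every item by dot product with the user embedding, keep top_n."""
    scores = item_vecs @ user_vec
    return np.argsort(-scores)[:top_n]

def rerank(candidates, ranking_scores, top_k):
    """Stage 2: reorder the retrieved candidates by a separate ranking score."""
    order = np.argsort(-ranking_scores[candidates])
    return candidates[order][:top_k]

# Hypothetical data: in practice these would come from trained towers and a ranking model.
rng = np.random.default_rng(0)
item_vecs = rng.normal(size=(1000, 16))   # stand-in for item-tower outputs
user_vec = rng.normal(size=16)            # stand-in for one user-tower output
ranking_scores = rng.random(1000)         # stand-in for a heavier ranking model

candidates = retrieve(user_vec, item_vecs, top_n=100)
final = rerank(candidates, ranking_scores, top_k=10)
```

The split keeps the expensive ranking step limited to a small candidate set, which is one common way such a system stays scalable.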

Slide 10

Slide 10 text

Evaluation Framework (1/2)
How to truly understand performance comprehensively

Slide 11

Slide 11 text

Evaluation Framework (1/2)
How to truly understand performance comprehensively

Slide 12

Slide 12 text

03 Offline & Online Evaluation
How to Correctly Evaluate AI

Slide 13

Slide 13 text

Offline Evaluation
Key points to show how your algorithms contribute to your business
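The slide does not list specific offline metrics, so the following is only a sketch of two that are commonly used for recommendation ranking, Recall@K and binary-relevance NDCG@K. The function names and the shop identifiers are hypothetical.

```python
import math

def recall_at_k(recommended, relevant, k):
    """Fraction of the relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = len(set(recommended[:k]) & set(relevant))
    return hits / len(relevant)

def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG@k: DCG of the ranking divided by the ideal DCG."""
    relevant = set(relevant)
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(recommended[:k]) if item in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical example: one user's ranked recommendations vs. shops they interacted with.
recommended = ["shop_a", "shop_b", "shop_c", "shop_d"]
relevant = ["shop_b", "shop_d", "shop_x"]
recall = recall_at_k(recommended, relevant, k=3)  # only shop_b is in the top 3
ndcg = ndcg_at_k(recommended, relevant, k=3)
```

Averaging such per-user values over a held-out set gives an offline number that can be compared across models before any online test.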

Slide 14

Slide 14 text

Online Evaluation
Key points to show how your algorithms contribute to your business

Slide 15

Slide 15 text

A/B Test
Key points to show how your algorithms contribute to your business
Avoid pitfalls in practice:
What if the experiment isn't significant?
Sample ratio mismatch?
Novelty effect?
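Two of the pitfalls named on this slide, significance and sample ratio mismatch, can be checked with standard tests. The sketch below uses a chi-square goodness-of-fit test (one degree of freedom) for the traffic split and a two-sided two-proportion z-test for conversion rate. The function names, the traffic counts, and the thresholds mentioned in the comments are illustrative assumptions, not values from the talk.

```python
import math

def srm_p_value(n_control, n_treatment, expected_ratio=0.5):
    """Chi-square test (1 d.o.f.) that the observed split matches the planned split."""
    total = n_control + n_treatment
    exp_control = total * expected_ratio
    exp_treatment = total * (1.0 - expected_ratio)
    chi2 = ((n_control - exp_control) ** 2 / exp_control
            + (n_treatment - exp_treatment) ** 2 / exp_treatment)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))

def two_proportion_p_value(conv_c, n_c, conv_t, n_t):
    """Two-sided z-test for a difference in conversion rate between two groups."""
    p_c, p_t = conv_c / n_c, conv_t / n_t
    pooled = (conv_c + conv_t) / (n_c + n_t)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_c + 1.0 / n_t))
    if se == 0.0:
        return 1.0
    z = (p_t - p_c) / se
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical counts: a planned 50/50 split that came out uneven.
srm_p = srm_p_value(50_000, 48_500)       # a very small p-value suggests a mismatch
lift_p = two_proportion_p_value(500, 10_000, 600, 10_000)
```

A very small `srm_p` (a strict cutoff such as 0.001 is a common convention) means the assignment itself is suspect, so `lift_p` should not be trusted until the split is explained. A novelty effect is not covered by either test; it is usually examined by looking at how the lift changes over the days of the experiment.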

Slide 16

Slide 16 text

Case – EC Shop recommendation

Slide 17

Slide 17 text

04 LLM On Recommendation

Slide 18

Slide 18 text

Recommendation with LLM
- Feature Engineering: Text embedding generation
- How to evaluate embedding (probing): RankMe / α-ReQ Metrics
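RankMe is usually described as the effective rank of an embedding matrix: the exponential of the Shannon entropy of its normalized singular values. The sketch below follows that definition with NumPy. The function name `rankme`, the epsilon value, and the random matrices are assumptions made for illustration, and the input is assumed to be a non-zero matrix of shape (samples, dimensions).

```python
import numpy as np

def rankme(embeddings, eps=1e-7):
    """Effective rank: exp of the entropy of the normalized singular values."""
    singular_values = np.linalg.svd(embeddings, compute_uv=False)
    p = singular_values / singular_values.sum() + eps
    return float(np.exp(-(p * np.log(p)).sum()))

# Hypothetical embeddings: one spread over all dimensions, one collapsed to a single direction.
rng = np.random.default_rng(0)
spread = rng.normal(size=(256, 16))
collapsed = np.outer(rng.normal(size=256), rng.normal(size=16))

spread_rank = rankme(spread)        # close to the embedding dimension
collapsed_rank = rankme(collapsed)  # close to 1
```

A value near the embedding dimension suggests the text embeddings use most of their capacity, while a value near 1 points to dimensional collapse.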

Slide 19

Slide 19 text

Recommendation with LLM
- Feature Engineering: Text embedding generation
- How to evaluate embedding (probing): RankMe / α-ReQ Metrics
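α-ReQ is generally described as the decay coefficient α of a power law fitted to the eigenvalues of the embedding covariance matrix, with λ_i roughly proportional to i^(-α). The sketch below estimates α with a plain least-squares fit in log-log space. This is a simplified estimator chosen for illustration; the function name `alpha_req`, the eigenvalue cutoff, and the synthetic data are assumptions.

```python
import numpy as np

def alpha_req(embeddings, min_eig=1e-12):
    """Estimate the power-law decay alpha of the covariance eigenspectrum."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (len(centered) - 1)
    eigvals = np.linalg.eigvalsh(cov)[::-1]   # sorted in descending order
    eigvals = eigvals[eigvals > min_eig]      # drop numerically zero eigenvalues
    ranks = np.arange(1, len(eigvals) + 1)
    # Fit log(lambda_i) = -alpha * log(i) + c by least squares.
    slope, _intercept = np.polyfit(np.log(ranks), np.log(eigvals), deg=1)
    return float(-slope)

# Hypothetical data whose true spectrum decays as 1/i, so alpha should land near 1
# (up to sampling noise).
rng = np.random.default_rng(0)
spectrum = np.arange(1, 33, dtype=float) ** -1.0
demo = rng.normal(size=(5000, 32)) * np.sqrt(spectrum)
alpha = alpha_req(demo)
```

A slowly decaying spectrum (small α) and a rapidly decaying one (large α) describe different embedding geometries, so α can be compared across candidate embedding models alongside RankMe.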

Slide 20

Slide 20 text

05 Conclusion
Evaluate & Challenge

Slide 21

Slide 21 text

Conclusion
Business Value
OpenAI, Claude, Gemini
XGBoost or Open Source
Source: https://zh.wikipedia.org/zh-tw/%E7%BE%8E%E5%9C%8B%E9%9A%8A%E9%95%B72%EF%BC%9A%E9%85%B7%E5%AF%92%E6%88%B0%E5%A3%AB
Source: https://images.app.goo.gl/HCygtJVtoPaU2KgX6

Slide 22

Slide 22 text

Conclusion & Challenge
1. Data Quality
2. Multi-Metric Evaluation
3. Conduct A/B Test Experiments
4. Human Perception Evaluation
Challenge

Slide 23

Slide 23 text

Q&A
Contact information (LinkedIn: Dan Chen)

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

No content