Slide 1

Slide 1 text

From Search Results to Insights
Learnings from Statista’s Generative AI Journey

Slide 2

Slide 2 text

Matthias Lau: #machinelearning #developer #founder
Ingo Schellhammer: #biztech #cto
Bene Stemmildt: #socio-technical arch #networker #cto #velominatus

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

#131 (majestic million)
23m (content views/month)
500k (content downloads/month)
>1m (statistics)
70k (reports)

Slide 5

Slide 5 text

66-80%

Slide 6

Slide 6 text

2/2023: Explore (1 engineer)
4/2023: Prototype
7/2023: Develop (+PM, +External)
11/2023: Invest (full team)
5/2024: Launch

Slide 7

Slide 7 text

Optimization Process

Slide 8

Slide 8 text

Answering Workflow.
Query → Retrieval → Reranking → Answering → Rating → Answer

Slide 9

Slide 9 text

Answering Workflow.
Query → Retrieval → Reranking → Answering → Rating → Answer
40 1 1

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

Answering Workflow.
Query → Retrieval → Reranking → Answering → Rating → Answer
Retrieval   Reranking   Answering   Rating
44%         5%          49%         3%
-           13%         78%         9%

Slide 12

Slide 12 text

Metrics Relevance.
Latency (< 30s), Costs (< 5c), Quality
1 2 3
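A minimal sketch of how the two numeric targets on this slide could be checked for a single run. The class, field, and function names are illustrative assumptions, as is reading the targets as per-answer values. The sample figures are the baseline numbers shown later in the deck.

```python
from dataclasses import dataclass

# Targets as stated on the slide; everything else here is an assumption.
MAX_LATENCY_S = 30.0    # "< 30s"
MAX_COST_CENTS = 5.0    # "< 5c"


@dataclass
class RunMetrics:
    latency_s: float    # time to produce an answer, in seconds
    cost_cents: float   # spend for an answer, in cents
    quality: float      # share of answers rated good, 0.0 to 1.0


def within_targets(m: RunMetrics) -> dict[str, bool]:
    """Report which of the two stated targets a run meets.

    Quality is carried along but not checked, because the slide
    gives no numeric quality target.
    """
    return {
        "latency": m.latency_s < MAX_LATENCY_S,
        "cost": m.cost_cents < MAX_COST_CENTS,
    }


baseline = RunMetrics(latency_s=27.58, cost_cents=8.1, quality=0.30)
print(within_targets(baseline))
```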

Slide 13

Slide 13 text

Reference Data.

Slide 14

Slide 14 text

Test Runner.

Slide 15

Slide 15 text

Baseline: 27.58 s latency, 8.1c cost, 30% quality

Slide 16

Slide 16 text

Many Experiments.

Slide 17

Slide 17 text

24.7 s latency (-10%), 2.8c cost (-65%), 72% quality (+140%)
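The percentages are relative changes against the baseline of 27.58, 8.1c, and 30%. A small worked check, assuming the three figures are latency in seconds, cost in cents, and quality in percent (the key names are illustrative):

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


baseline = {"latency_s": 27.58, "cost_cents": 8.1, "quality_pct": 30.0}
improved = {"latency_s": 24.7, "cost_cents": 2.8, "quality_pct": 72.0}

# Round to whole percent, as the slide does.
deltas = {k: round(percent_change(baseline[k], improved[k])) for k in baseline}
print(deltas)  # {'latency_s': -10, 'cost_cents': -65, 'quality_pct': 140}
```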

Slide 18

Slide 18 text

Recap Optimization Playbook.
1. Add Traceability
2. Define your Metrics Relevance
3. Create a Reference Dataset
4. Measure a Baseline
5. Experiment and Measure Delta
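The five steps can be sketched as a small test runner. This is an illustration under stated assumptions, not the team's actual tooling: the two pipelines are canned stand-ins, the reference entries are built from the deck's own Apple/Microsoft example, and quality is approximated by a keyword match.

```python
import time
from statistics import mean

# Step 3: reference dataset (illustrative entries).
REFERENCE = [
    {"query": "Which company had more revenue 2015 to 2020, Apple or Microsoft?",
     "expected": "Apple"},
    {"query": "Microsoft revenue in 2020", "expected": "143"},
]


def baseline_pipeline(query):
    """Stand-in for the current workflow: returns (answer, cost in cents)."""
    return "Both companies grew over the period.", 8.0


def experiment_pipeline(query):
    """Stand-in for a changed workflow under test."""
    if "Apple" in query:
        return "Apple had higher revenue than Microsoft.", 3.0
    return "Microsoft's revenue was about $143 billion in 2020.", 3.0


def run_suite(pipeline, reference):
    """Steps 1, 2 and 4: trace every case, then aggregate the three metrics."""
    traces = []
    for case in reference:
        start = time.perf_counter()
        answer, cost_cents = pipeline(case["query"])
        traces.append({
            "query": case["query"],
            "answer": answer,
            "latency_s": time.perf_counter() - start,
            "cost_cents": cost_cents,
            "correct": case["expected"].lower() in answer.lower(),
        })
    summary = {
        "latency_s": mean(t["latency_s"] for t in traces),
        "cost_cents": mean(t["cost_cents"] for t in traces),
        "quality": mean(1.0 if t["correct"] else 0.0 for t in traces),
    }
    return summary, traces


# Step 5: run an experiment and measure the delta against the baseline.
base, _ = run_suite(baseline_pipeline, REFERENCE)
trial, _ = run_suite(experiment_pipeline, REFERENCE)
delta = {name: trial[name] - base[name] for name in base}
print(delta)
```

Keeping the per-case traces next to the summary makes it possible to see which reference questions changed between two runs, not only the averages.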

Slide 19

Slide 19 text

Technical Learnings

Slide 20

Slide 20 text

Retrieval Query

Slide 21

Slide 21 text

Retrieval Query Rewrite Query

Slide 22

Slide 22 text

Retrieval Query Rewrite Query
Query: "How tall is the Eiffel Tower? It looked so high when I was there last year?"
Rewritten: "What is the height of the Eiffel Tower?"
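A minimal sketch of the rewrite step. The prompt wording, the function names, and the `llm` callable are assumptions for illustration; `fake_llm` returns a canned reply so the example runs without a model.

```python
from typing import Callable

# Hypothetical prompt; the deck does not show the real one.
REWRITE_PROMPT = (
    "Rewrite the user's message as one concise, self-contained search question. "
    "Drop anecdotes and filler. Return only the question.\n\n"
    "Message: {message}\nQuestion:"
)


def rewrite_query(message: str, llm: Callable[[str], str]) -> str:
    """Ask a language model for a clean search question."""
    rewritten = llm(REWRITE_PROMPT.format(message=message)).strip()
    # Fall back to the original text if the model returns nothing usable.
    return rewritten or message


def fake_llm(prompt: str) -> str:
    # Canned reply standing in for a real model call.
    return " What is the height of the Eiffel Tower?\n"


raw = "How tall is the Eiffel Tower? It looked so high when I was there last year?"
print(rewrite_query(raw, fake_llm))
```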

Slide 23

Slide 23 text

Retrieval Query Rewrite Query Variants Retrieval Retrieval Reranking

Slide 24

Slide 24 text

Retrieval Query Rewrite Query Variants Retrieval Retrieval Reranking
Query: "Which company had more revenue 2015 to 2020, Apple or Microsoft?"
Variants:
Apple revenue from 2015 to 2020.
Microsoft revenue from 2015 to 2020.
Microsoft and Apple revenue comparison from 2015 to 2020.
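A toy sketch of retrieving once per variant, merging the hits, and reranking against the original question. The corpus, the word-overlap scoring, and all function names are assumptions that stand in for a real search index and reranker.

```python
import re

# Illustrative corpus; a real system would query a search index instead.
CORPUS = {
    "d1": "Apple revenue 2015 to 2020",
    "d2": "Microsoft revenue 2015 to 2020",
    "d3": "Eiffel Tower height",
    "d4": "Apple and Microsoft revenue comparison 2015 to 2020",
}


def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(query: str, doc_id: str) -> int:
    """Number of distinct words shared by the query and the document."""
    return len(tokens(query) & tokens(CORPUS[doc_id]))


def retrieve(query: str, k: int = 1) -> list[str]:
    """Top-k documents by word overlap; ties are broken by document id."""
    ranked = sorted(CORPUS, key=lambda d: (-overlap(query, d), d))
    return [d for d in ranked[:k] if overlap(query, d) > 0]


def retrieve_with_variants(variants: list[str], k: int = 1) -> list[str]:
    """Run one retrieval per variant and merge the hits without duplicates."""
    merged: list[str] = []
    for variant in variants:
        for doc_id in retrieve(variant, k):
            if doc_id not in merged:
                merged.append(doc_id)
    return merged


def rerank(query: str, doc_ids: list[str]) -> list[str]:
    """Order the merged candidates by relevance to the original question."""
    return sorted(doc_ids, key=lambda d: (-overlap(query, d), d))


question = "Which company had more revenue 2015 to 2020, Apple or Microsoft?"
variants = [
    "Apple revenue from 2015 to 2020.",
    "Microsoft revenue from 2015 to 2020.",
    "Microsoft and Apple revenue comparison from 2015 to 2020.",
]
candidates = retrieve_with_variants(variants)
print(candidates)                    # ['d1', 'd2', 'd4']
print(rerank(question, candidates))  # ['d4', 'd1', 'd2']
```

In this toy setup a single retrieval for the original question returns only one document, while the three variants together surface the Apple, Microsoft, and comparison documents.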

Slide 25

Slide 25 text

Retrieval HyDE Query
HyDE (Hypothetical Document Embeddings): retrieve with an answer, not a question

Slide 26

Slide 26 text

Retrieval HyDE Query
Query: "Which company had more revenue 2015 to 2020, Apple or Microsoft?"
Hypothetical answer: "Between 2015 and 2020, Apple consistently had higher revenue than Microsoft. Apple’s revenue grew from approximately $233.7 billion in 2015 to $274.5 billion in 2020, while Microsoft’s revenue increased from about $93.6 billion in 2015 to $143 billion in 2020."
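A minimal HyDE sketch: generate a hypothetical answer first, then search with that answer instead of the question. The bag-of-words "embedding", the two-document corpus, and the `llm` callable are assumptions that replace a real embedding model, index, and LLM.

```python
import math
import re
from collections import Counter
from typing import Callable


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would use a vector model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)  # Counter yields 0 for missing words
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def hyde_retrieve(query: str, llm: Callable[[str], str],
                  corpus: dict[str, str], k: int = 1) -> list[str]:
    """Embed a generated answer and rank documents against it."""
    hypothetical = llm(f"Write a short passage that answers: {query}")
    answer_vec = embed(hypothetical)
    ranked = sorted(corpus,
                    key=lambda d: cosine(answer_vec, embed(corpus[d])),
                    reverse=True)
    return ranked[:k]


def fake_llm(prompt: str) -> str:
    # Canned reply taken from the slide; a real model call would go here.
    return ("Between 2015 and 2020, Apple consistently had higher "
            "revenue than Microsoft.")


corpus = {
    "apple_doc": "Apple annual revenue by fiscal year, 2015 to 2020",
    "eiffel_doc": "Height of the Eiffel Tower in metres",
}
question = "Which company had more revenue 2015 to 2020, Apple or Microsoft?"
print(hyde_retrieve(question, fake_llm, corpus))
```

The hypothetical answer may contain wrong figures. It is only used to find similar documents, and the final answer is still written from the retrieved sources.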

Slide 27

Slide 27 text

Query Retrieval HyDE Retrieval Rewrite Query Variants Retrieval Retrieval Reranking

Slide 28

Slide 28 text

24.7 s latency (-10%), 2.8c cost (-65%), 72% quality (+140%)

Slide 29

Slide 29 text

Recap Optimization Workflow. What about the Model?

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

No content

Slide 33

Slide 33 text

No content

Slide 34

Slide 34 text

No content

Slide 35

Slide 35 text

There’s no One-Size-Fits-All.
Query with good Retrievals: Llama 3.1 8B 👍
Query with the need for complex conclusions: Llama 3.1 8B 👎
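One possible way to act on this observation is to route each query to a model. The sketch below is an assumption, not the approach described in the talk: the keyword heuristic, the score threshold, and the `LARGE_MODEL` placeholder are all hypothetical.

```python
import re

SMALL_MODEL = "llama-3.1-8b"   # the model named on the slide
LARGE_MODEL = "larger-model"   # placeholder; the slide names no alternative

# Hypothetical hints that a query needs a comparison or a conclusion.
COMPLEX_HINTS = ("compare", "more", "versus", "vs", "why", "trend")


def needs_complex_conclusion(query: str) -> bool:
    words = set(re.findall(r"[a-z]+", query.lower()))
    return any(hint in words for hint in COMPLEX_HINTS)


def pick_model(query: str, top_retrieval_score: float,
               min_score: float = 0.7) -> str:
    """Use the small model only for simple queries with strong retrievals."""
    if top_retrieval_score >= min_score and not needs_complex_conclusion(query):
        return SMALL_MODEL
    return LARGE_MODEL


print(pick_model("What is the height of the Eiffel Tower?", 0.9))
print(pick_model(
    "Which company had more revenue 2015 to 2020, Apple or Microsoft?", 0.9))
```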

Slide 36

Slide 36 text

Business Learnings

Slide 37

Slide 37 text

From prototype…

Slide 38

Slide 38 text

to product…
Pages/session: Research AI 13, keyword search 9
Bounce rate: 16.9%
Requests: Aug 11k, Nov 29k

Slide 39

Slide 39 text

                            Latency (s)    Cost            Quality
START (Sep. ‘23)            27.6           8.1c            30%
POST HEUREKA IMPROVEMENTS   24.7 (-10%)    2.8c (-65%)     72% (+140%)
TODAY (Sep. ‘24)            14.6 (-41%)    1.6c (-43%)     72% (+/- 0%)

Slide 40

Slide 40 text

Meeting our quality ambition in a high-growth game
3.1 bn (+9%), 275 mn (-4%), 72 mn (+16%), 29 k (+11%)
Axes: Quality, Monthly visits

Slide 41

Slide 41 text

On micro level: Go upstream, where the real traffic is…

Slide 42

Slide 42 text

On macro level: Where is the hottest (LLM) party?
Customer, Research AI, API, LLM

Slide 43

Slide 43 text

On macro level: Where is the hottest (LLM) party?
Customer, Research AI, API, LLM

Slide 44

Slide 44 text

On macro level: Where is the hottest (LLM) party?
Customer, Research AI, API, LLM

Slide 45

Slide 45 text

From Project to Product 💻 ⏱ 📅 → ☁ 🔄 💎

Slide 46

Slide 46 text

No content

Slide 47

Slide 47 text

No content

Slide 48

Slide 48 text

👋