Slide 1

Slide 1 text

Empowering Customer Decisions with Elasticsearch: From Search to Answer Generation
Taisuke Hinata, Search Engineer @ Nikkei Inc.
December 3, 2024
Elastic Inc. Booth, AWS re:Invent 2024

Slide 2

Slide 2 text

Overview
● Nikkei and Our Database Business
● Elasticsearch in Our Search
● From Search to Answer Generation

Slide 3

Slide 3 text

Nikkei Inc.
● Publishing Japan’s Largest Business Newspaper
  ○ 3+ million subscribers
● Founded in 1876

Slide 4

Slide 4 text

News Database Media Business Index Development Education Broadcasting Publishing Culture Project

Slide 6

Slide 6 text

Data to Action. Helping our customers make better decisions. https://nkbb.nikkei.co.jp/about/

Slide 7

Slide 7 text

Our Database Business
● 200+ Media Companies

Slide 8

Slide 8 text

Our Database Business
● 200+ Media Companies
● All contents
● Content Platform (CPF), since 2018

Slide 9

Slide 9 text

Our Database Business
● 200+ Media Companies
● All contents
● Content Platform (CPF), since 2018
● 12 Information Search Services

Slide 10

Slide 10 text

● Business Search Service since 1984
  ○ 750+ media sources
  ○ 86M+ companies
  ○ 300K+ individuals
● Trusted by 70% of Japan’s largest publicly listed companies
● Supports decision-making

Slide 11

Slide 11 text

Search is the Key to Our Business
● Decision Making Support
● Reliable Contents
● Search Technology

Slide 12

Slide 12 text

Elasticsearch in Our Search

Slide 13

Slide 13 text

Transition of Search Engines
[Timeline figure. Years shown: 1999, 2007, 2014, 2018, 2023, May 2024. Labels shown: PanaSearch, on-premises, versions 6, 7, 8.]

Slide 14

Slide 14 text

Our System Architecture
[Diagram labels: 200+ Media Companies, ETL, Content Platform (CPF), DB, APIs, 12 Information Search Services]

Slide 15

Slide 15 text

Scale of Our Elasticsearch Cluster
● 48-node cluster in 3 AZs
  ○ Coordinating Only Nodes (3)
  ○ Data Nodes (42)
  ○ Master Nodes (3)
● 190M docs
● +15K / day
● Performance
  ○ 200 QPS, ≤200 ms

Slide 16

Slide 16 text

One of Japan’s Largest Elasticsearch Deployments
https://www.elastic.co/jp/customers/nikkei-1
https://www.acroquest.co.jp/archives/15788/

Slide 17

Slide 17 text

Powers All Our Search Features
● Query Suggest
● Search Results
● Display Body

Slide 18

Slide 18 text

Powers All Our Search Features
● Query Suggest
● Search Results
● Display Body
Keywords are suggested

Slide 19

Slide 19 text

Powers All Our Search Features
● Query Suggest
● Search Results
● Display Body
Results are aggregated by theme

Slide 20

Slide 20 text

Powers All Our Search Features
● Query Suggest
● Search Results
● Display Body
Matching text is highlighted

Slide 21

Slide 21 text

Powers All Our Search Features
● Query Suggest
● Search Results
● Display Body
Related articles are displayed together
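
The four features on slides 18 to 21 correspond to standard Elasticsearch request options: a completion suggester, a terms aggregation, highlighting, and a `more_like_this` query. The sketch below only builds the request bodies as plain dictionaries. The index name `articles` and the field names `body`, `theme`, and `suggest` are hypothetical; the slides do not show the actual mapping or which options the production system uses.

```python
def build_suggest_request(prefix: str) -> dict:
    """Keyword suggestions via a completion suggester (field name is hypothetical)."""
    return {
        "suggest": {
            "keyword_suggest": {
                "prefix": prefix,
                "completion": {"field": "suggest"},
            }
        }
    }


def build_search_request(query_text: str, size: int = 10) -> dict:
    """Full-text search with results grouped by theme and matches highlighted."""
    return {
        "size": size,
        "query": {"match": {"body": query_text}},
        # Terms aggregation: counts of matching documents per theme.
        "aggs": {"by_theme": {"terms": {"field": "theme", "size": 10}}},
        # Highlighting: returns snippets of `body` with matched terms marked.
        "highlight": {"fields": {"body": {}}},
    }


def build_related_request(doc_id: str, index: str = "articles") -> dict:
    """Related articles via more_like_this, seeded with an existing document."""
    return {
        "query": {
            "more_like_this": {
                "fields": ["body"],
                "like": [{"_index": index, "_id": doc_id}],
                "min_term_freq": 1,
            }
        }
    }
```

Each dictionary could be sent as the body of a `_search` request; no client call is shown here so the sketch stays independent of any particular cluster.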

Slide 22

Slide 22 text

From Search to Answer Generation

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

Steps to Knowledge: Traditional Search
Question → Query → Select → Read → Answer
Question: What is TOYOTA’s business situation in Poland? 🤔
Query: TOYOTA AND Poland AND expansion

Slide 25

Slide 25 text

Steps to Knowledge: Generative AI
Question → Query → Select → Read → Answer
Question: What is TOYOTA’s business situation in Poland? 🤔
Query: TOYOTA AND Poland AND expansion

Slide 26

Slide 26 text

How to Resolve
● Retrieval Augmented Generation (RAG)
[Diagram labels: Question, Ask, Retriever, Query, DB, doc, Question & Docs, Generative AI, Summarize, Answer]
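
The RAG flow on this slide (ask, retrieve documents, pass question and documents to a generative model, return the answer) can be sketched in a few lines. This is a minimal illustration, not the service's implementation: `retrieve` and `generate` are hypothetical callables standing in for the retriever and the generative model, and the prompt wording is an assumption.

```python
from typing import Callable, List


def build_prompt(question: str, docs: List[str]) -> str:
    """Combine the question and the retrieved documents into one prompt."""
    numbered = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(docs, start=1))
    return (
        "Answer the question using only the documents below.\n"
        f"Documents:\n{numbered}\n"
        f"Question: {question}"
    )


def answer_with_rag(
    question: str,
    retrieve: Callable[[str], List[str]],  # hypothetical: question -> ranked docs
    generate: Callable[[str], str],        # hypothetical: prompt -> answer text
    top_k: int = 3,
) -> str:
    """Retrieve documents for the question, then ask the model to answer from them."""
    docs = retrieve(question)[:top_k]
    return generate(build_prompt(question, docs))
```

Passing the retriever and generator in as functions keeps the flow testable without a database or a model endpoint.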

Slide 27

Slide 27 text

Elasticsearch Has the Edge in RAG
[Comparison table: Elasticsearch vs. Most Vector Databases vs. Some Vector Databases. Capabilities compared:]
● Create Vector Embeddings
● Store & Search Vector Embeddings
● Search Analytics
● Hybrid Search (text + vector)
● Ingest Tools (Web Crawler, connectors, Beats, Agent, API framework)
● Playground for testing RAG
● Autocomplete
● Choice & Flexibility of embedding models

Slide 28

Slide 28 text

Our New RAG Service with Elasticsearch

Slide 29

Slide 29 text

● Company Analysis: Gain insights into a company’s external environment and initiatives.
● Market Analysis: Analyze industries with PEST analysis and market trends.
● Economic Trends Analysis: Explain Nikkei Index trends and analyze CPI changes.
● Trend Investigation: Predict business trends and consumer behavior shifts.
● Proposal Support: Create proposal stories and business roadmaps.
● Report Creation: Generate reports based on specific questions.
Please enter your question.

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

No content

Slide 33

Slide 33 text

Current Architecture
[Diagram labels: Question, Gemini Flash, 5 Sub Queries, text-embedding-3-large, knn msearch, Chunks, Fetches 1000 docs (200 * 5), Results, Top 30 results, Article, Gemini Pro, Answer]
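
One reading of this slide is that each of the 5 sub queries is embedded and sent as a kNN search in a single multi-search request (200 hits each, 1000 in total), and the hits are then narrowed to 30. The sketch below builds such a multi-search payload and merges the responses. The index name, the `embedding` field, the `num_candidates` value, and the merge rule (deduplicate by `_id`, keep the best score) are all assumptions; the slide does not say how the top 30 are selected.

```python
from typing import Dict, List


def build_knn_msearch(
    index: str,
    query_vectors: List[List[float]],
    k: int = 200,
    num_candidates: int = 1000,  # assumed value; must be >= k
    field: str = "embedding",    # hypothetical dense_vector field
) -> List[dict]:
    """Return alternating header/body entries, one kNN search per sub query."""
    searches: List[dict] = []
    for vector in query_vectors:
        searches.append({"index": index})
        searches.append({
            "knn": {
                "field": field,
                "query_vector": vector,
                "k": k,
                "num_candidates": num_candidates,
            },
            "size": k,
            "_source": False,
        })
    return searches


def merge_top_hits(responses: List[dict], top_n: int = 30) -> List[str]:
    """Deduplicate hits across responses by _id, keeping each doc's best score."""
    best: Dict[str, float] = {}
    for response in responses:
        for hit in response["hits"]["hits"]:
            doc_id, score = hit["_id"], hit["_score"]
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_n]]
```

The list returned by `build_knn_msearch` has the header-then-body shape that the multi-search API expects; `merge_top_hits` would be applied to the `responses` array of the result.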

Slide 34

Slide 34 text

Next Steps
● Hybrid Search (Text + Vector)
  ○ Currently, combining Japanese text search with vector search reduces accuracy.
● Japanese Sparse Vector Model
  ○ Dense vectorization (1536 dim) of 190M documents requires 3.0 TB (2.35x).
  ○ Elastic provides ELSER as a sparse vector model, but it is optimized for English.
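
For context on the hybrid search item: one common way to combine a text ranking with a vector ranking is reciprocal rank fusion (RRF), which adds 1 / (rank_constant + rank) for every list a document appears in. The sketch below is a generic illustration of that technique, not the method used or planned in this talk, and it makes no claim about resolving the accuracy issue noted above. The value 60 for `rank_constant` is a commonly used setting.

```python
from typing import Dict, List


def reciprocal_rank_fusion(
    rankings: List[List[str]],
    rank_constant: int = 60,
    top_n: int = 10,
) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking using RRF."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents ranked high in any list get a larger contribution.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_n]]


# Hypothetical rankings from a text query and a vector query.
text_hits = ["a", "b", "c"]
vector_hits = ["b", "c", "d"]
fused = reciprocal_rank_fusion([text_hits, vector_hits])
```

Because RRF uses only ranks, it does not need the text and vector scores to be on the same scale.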

Slide 35

Slide 35 text

Summary
● Nikkei, originally a newspaper company, provides cross-search services across content from 200+ partner companies.
● Search is at the core of our business, and it is powered by Elasticsearch.
● In the era of generative AI, we continue to leverage Elasticsearch to accelerate customer decision-making!