Slide 1

Slide 1 text

Search Evolution
The next generation of search engines...
Alexander Reelsen | @spinscale

Slide 2

Slide 2 text

Today's goal
Learn about the trends in search engines
Understand that this market will be highly volatile in the coming years

Slide 3

Slide 3 text

Status quo

Slide 4

Slide 4 text

The power of search
Speed (Search & Suggests)
Scale (all the internet)
Relevance
Intent
Personalization

Slide 5

Slide 5 text

Evolution of Use-Cases
Text search
Enterprise search
Ecommerce search
Log search
Analytics
Dashboards
NLP
Generative/Conversational Search

Slide 6

Slide 6 text

Relevancy
SQL: Does row r match query q? Answer: yes/no
Search: How well does document d match query q? Answer: [0..∞]
Scoring based on formula: TF/IDF, BM25
Dependent on corpus
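
To make the scoring idea concrete, here is a minimal sketch of Okapi BM25 over a toy corpus. The whitespace tokenizer, the sample documents and the parameter defaults (`k1=1.2`, `b=0.75`) are assumptions for illustration, not taken from the talk or from any particular engine.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document in `docs` against `query` with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


docs = [
    "blue dress with white stripes",
    "red dress",
    "white shirt with blue buttons",
]
scores = bm25_scores("blue dress", docs)
```

The score is an unbounded non-negative number, not a yes/no answer, and it depends on the corpus: adding or removing documents changes the document frequencies and the average length, so the same document can score differently for the same query.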

Slide 7

Slide 7 text

Ranking
Recency
Rating
Popularity
Past (searches/purchases)
Individualization
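
One way such signals could be blended with the text score is sketched below. The `Hit` fields, the 30-day half-life and the weights are all hypothetical; real engines expose this differently and the right weights depend on the use case.

```python
import math
from dataclasses import dataclass


@dataclass
class Hit:
    doc_id: str
    relevance: float  # text score from the engine, e.g. BM25
    age_days: float
    rating: float     # average rating on a 0..5 scale
    purchases: int


def rank_score(hit, half_life_days=30.0):
    """Boost the text relevance with recency, rating and popularity."""
    recency = 0.5 ** (hit.age_days / half_life_days)  # 1.0 for new, decays to 0
    rating = hit.rating / 5.0                         # scaled to 0..1
    popularity = math.log1p(hit.purchases)            # dampen very popular items
    return hit.relevance * (1.0 + recency + rating + 0.1 * popularity)


hits = [
    Hit("old", relevance=2.0, age_days=365, rating=4.0, purchases=10),
    Hit("new", relevance=2.0, age_days=1, rating=4.0, purchases=10),
]
ranked = sorted(hits, key=rank_score, reverse=True)
```

With identical text relevance, the newer document ranks first because only the recency boost differs.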

Slide 8

Slide 8 text

Trends

Slide 9

Slide 9 text

Going cloud native
SaaS
Splitting storage and compute
Using blob storage, segment replication
Massive cost savings

Slide 10

Slide 10 text

Learning to rank
Scoring/Relevancy based on machine learning model
Common: Reranking after first filtering
Machine Learning models trained independently
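
The two-stage pattern can be sketched roughly as follows: a cheap first pass selects candidates, then a model rescores only those. The linear "model" and its weights are made up to stand in for something trained offline, and the feature names are hypothetical.

```python
def first_pass(query, docs, k=3):
    """Cheap candidate retrieval: count overlapping query terms."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(d["text"].lower().split())), d) for d in docs]
    scored = [(s, d) for s, d in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Weights as they might come out of an independent training run (invented here).
WEIGHTS = {"text_score": 1.0, "click_rate": 4.0, "in_stock": 0.5}


def model_score(features):
    """Stand-in for a trained ranking model: a plain weighted sum."""
    return sum(WEIGHTS[name] * value for name, value in features.items())


def rerank(query, docs, k=3):
    """Rescore only the top-k candidates of the first pass."""
    rescored = []
    for text_score, doc in first_pass(query, docs, k):
        features = {
            "text_score": float(text_score),
            "click_rate": doc["click_rate"],
            "in_stock": float(doc["in_stock"]),
        }
        rescored.append((model_score(features), doc["id"]))
    rescored.sort(reverse=True)
    return [doc_id for _, doc_id in rescored]


docs = [
    {"id": "a", "text": "blue dress with white stripes", "click_rate": 0.05, "in_stock": True},
    {"id": "b", "text": "blue summer dress", "click_rate": 0.40, "in_stock": True},
    {"id": "c", "text": "red dress", "click_rate": 0.10, "in_stock": False},
    {"id": "d", "text": "white shirt", "click_rate": 0.90, "in_stock": True},
]
result = rerank("blue dress", docs)
```

Document `d` never reaches the model because the first pass filters it out, which is what keeps the expensive scoring step small.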

Slide 11

Slide 11 text

Vector Search
Vector search engines: translate content into vectors
Qdrant, Milvus, Weaviate, Pinecone, Deeplake, nucliadb
Best model wins...
Going hybrid: Will search engines add vector support or vector engines add search support?
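
A minimal sketch of both ideas: brute-force nearest-neighbour search by cosine similarity, and a hybrid result list built with reciprocal rank fusion (one common fusion technique, swapped in here for illustration). The three-dimensional vectors are hand-written toy values, not the output of a real embedding model, and the keyword ranking is assumed to come from a separate text search.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def vector_search(query_vec, index):
    """Return all document ids, most similar first (exhaustive scan)."""
    return sorted(index, key=lambda doc_id: cosine(query_vec, index[doc_id]), reverse=True)


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked id lists; each list adds 1 / (k + rank) per document."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)


# Toy "embeddings": in practice a model would produce these.
index = {
    "dress": [0.9, 0.1, 0.0],
    "skirt": [0.8, 0.3, 0.1],
    "laptop": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.2, 0.0]

vector_ranking = vector_search(query_vec, index)
keyword_ranking = ["dress", "laptop"]  # hypothetical result of a text search
hybrid = reciprocal_rank_fusion([vector_ranking, keyword_ranking])
```

Dedicated vector engines replace the exhaustive scan with approximate nearest-neighbour indexes; the fusion step only needs the ranked lists, not the underlying scores.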

Slide 12

Slide 12 text

Don't sleep on SQL engines!
SQLite: vector extension, FTS3/4 extension
Postgres: PostgresML - full model management and querying in Postgres!
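
As a small illustration of the SQLite side, the sketch below creates an FTS4 virtual table through Python's standard `sqlite3` module and runs a full-text query. It assumes the bundled SQLite library was compiled with the FTS3/4 extension, which is common but not guaranteed; the table name and sample rows are made up.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory FTS4 table and load (title, body) rows into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE docs USING fts4(title, body)")
    conn.executemany("INSERT INTO docs (title, body) VALUES (?, ?)", rows)
    return conn


def search_titles(conn, term):
    """Return the titles of all rows whose title or body matches `term`."""
    cursor = conn.execute(
        "SELECT title FROM docs WHERE docs MATCH ? ORDER BY title", (term,)
    )
    return [row[0] for row in cursor.fetchall()]


conn = build_index([
    ("Vector search", "translate content into vectors"),
    ("Log search", "find errors in log files"),
    ("Dashboards", "charts for analytics"),
])
titles = search_titles(conn, "search")
```

Matching against the table name searches all indexed columns, so a term that only appears in a body still finds the row.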

Slide 13

Slide 13 text

Search on the edge
Distributed search across regions
Search in your browser
Search on your phone
Check out OramaSearch

Slide 14

Slide 14 text

ChatGPT

Slide 15

Slide 15 text

Generative/Conversational search
blue dress with white stripes that has been shown on the last fashion week in Milan
summarize the quarterly earnings call, focus on numbers that differ strongly from the last three quarters
Convert the following CDK snippet from Java to Python

Slide 16

Slide 16 text

Generative search - context
"blue dress with white stripes" requires image extraction
"last fashion week in milan" requires external knowledge
Your own dataset is not enough for a good search!

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

No content

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

Prompt to any

Slide 22

Slide 22 text

Stable Diffusion
futuristic skyline in neon colors with a futuristic looking tesla model 3 in the foreground

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

LLMs
Large size, trained on massive datasets
Open Source: Langchain
Prompt engineering
Classification, Question Answering, Summarization, Fill-mask, Translation
Hallucination & Model bias
Conversational memory
Learning from queries (dangerous?)
Agents for LLMs (execute a calculator, SQL query, use mechanical turk)
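
The agent idea can be sketched as a loop in which the model names a tool and the application executes it. In the sketch below the "model" is a hard-coded stub (`fake_llm`, purely hypothetical) that always requests the calculator; no real LLM or framework API is involved.

```python
import ast
import operator

# Only plain arithmetic is allowed; anything else is rejected.
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression):
    """Evaluate a simple arithmetic expression without using eval()."""
    def evaluate(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        raise ValueError("unsupported expression")
    return evaluate(ast.parse(expression, mode="eval").body)


TOOLS = {"calculator": calculator}


def fake_llm(prompt):
    """Stand-in for a real model: always asks for the calculator tool."""
    return {"tool": "calculator", "input": "17 * 3 + 4"}


def run_agent(prompt):
    """Ask the (stub) model which tool to use, then execute that tool."""
    action = fake_llm(prompt)
    tool = TOOLS[action["tool"]]
    return tool(action["input"])


answer = run_agent("What is 17 * 3 + 4?")
```

Restricting the calculator to a whitelist of operations matters because the tool input comes from model output and should be treated as untrusted.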

Slide 25

Slide 25 text

Voice based search
Cars
Mobile

Slide 26

Slide 26 text

Summary

Slide 27

Slide 27 text

Summary
Search becomes hybrid: Will the existing search engines adapt?
Search customization is expensive - A brief history of code search at GitHub
The search engine becomes a commodity
Rent your industry-specific LLM!
Privacy LLMs might be a thing
Expect a lot of movement, lots of "AI integrations" and even more hot air...

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

Thank you!
Q & A
Alexander Reelsen | @spinscale