Slide 1

Slide 1 text

Unleashing the Power of Gen AI with Enterprise Data: The Role of MySQL HeatWave's Vector Store
Olivier Dasini, Cloud Solutions Architect @ Oracle MySQL
[email protected]
Blogs: www.dasini.net/blog/en, www.dasini.net/blog/fr
LinkedIn: www.linkedin.com/in/olivier-dasini
Slides: https://speakerdeck.com/freshdaz
Oracle Dev Days - 16/05/2024

Slide 2

Slide 2 text

Copyright © 2024, Oracle and/or its affiliates
Me, Myself & I
Olivier DASINI
• MySQL Geek
• Enjoying databases for 20+ years
• Addicted to MySQL for 15+ years
• MySQL Writer, Blogger and Speaker
• Also formerly: DBA, Consultant, Architect, Trainer, ...
• Cloud Solutions Architect at Oracle MySQL
• Stay up to date!
• Blog: www.dasini.net/blog/en
• LinkedIn: www.linkedin.com/in/olivier-dasini/
• Slides: https://speakerdeck.com/freshdaz

Slide 3

Slide 3 text

Agenda
1. MySQL HeatWave Overview
2. Generative AI
3. Vector Store
4. Use Cases

Slide 4

Slide 4 text

MySQL HeatWave

Slide 5

Slide 5 text

The MySQL universe - The view from the moon…
Community, Enterprise, Cloud Services (HeatWave)
MySQL Community
• MySQL Server
• MySQL Client, Workbench
• MySQL Shell
• MySQL GR plugin & InnoDB Cluster & Router
• MySQL Operator for Kubernetes
• MySQL Connector (C API, Java, Node.js, others)
• MySQL Support for MS VS Code (Preview)
• …
MySQL Commercial/Enterprise
• MySQL Community +
• MySQL Enterprise Backup
• MySQL Enterprise Monitor
• MySQL Enterprise Authentication
• MySQL Enterprise Audit
• MySQL Enterprise TDE
• MySQL Enterprise Masking
• MySQL Enterprise Firewall
• MySQL Technical Support
• …
MySQL Cluster NDB
• MySQL NDB Storage Engine
• MySQL NDB Operator for Kubernetes
MySQL Cluster CGE
• MySQL Cluster NDB + MySQL Enterprise + MySQL Cluster Manager
MySQL HeatWave (Cloud Services)
• MySQL HeatWave Databases Services (for OLTP)
• MySQL HeatWave Analytics (Data Warehouse)
• MySQL HeatWave Lakehouse
• MySQL HeatWave AutoML (for Machine Learning)
• MySQL HeatWave on AWS
https://www.mysql.com/products

Slide 6

Slide 6 text

MySQL HeatWave
Transactions, real-time analytics across data warehouse & data lake, & machine learning in 1 database service
Process ALL workloads with MySQL HeatWave
• Scales from 16 GB to 512 TB
• Available for $70/month
Diagram labels: Data Sources (Enterprise Apps, Web/Social, Log files, IoT); Database exports; Streaming data; Object Store; MySQL storage; MySQL HeatWave (OLTP, Analytics, In-database ML, Autopilot); Queries; Results; Social, eCommerce, IoT, gaming, fintech apps; Analytics & ML tools

Slide 7

Slide 7 text

Best performance in the industry for query and load at the lowest price
TPC-DS 100TB                      MySQL HeatWave  Snowflake 3XLarge  Redshift 10 ra3.16xlarge  BigQuery 3200 slots  Databricks 2XLarge
Hourly cost ($)                   56.43           128                86.06                     74.56                103.39
Load time (hrs)                   1.21            3.3                7.74                      3.63                 7.46
HeatWave load advantage           -               2.7x               6.4x                      3x                   6.1x
Total time (seconds)              3,719           5,379              5,108                     11,694               13,704
Price-perf ($)                    58              191                122                       242                  394
HeatWave price-perf advantage     -               3.3x               2.1x                      4.1x                 6.8x
Benchmark queries are derived from the TPC-DS benchmarks, but results are not comparable to published TPC-DS benchmark results since these do not comply with the TPC-DS specifications.
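
Editor's note: the slide does not state how the Price-Perf figures are derived. They appear to be the hourly cost multiplied by the total query time expressed in hours. The sketch below tests that inferred formula against the published numbers; the formula itself is an assumption, not something the slide confirms.

```python
# Inferred formula (an assumption, not stated on the slide):
#   price_perf = hourly_cost_usd * total_time_seconds / 3600
BENCHMARK = {
    # system: (hourly cost in $, total query time in seconds), from the slide
    "MySQL HeatWave": (56.43, 3719),
    "Snowflake 3XLarge": (128.0, 5379),
    "Redshift 10 ra3.16xlarge": (86.06, 5108),
    "BigQuery 3200 slots": (74.56, 11694),
    "Databricks 2XLarge": (103.39, 13704),
}


def price_perf(hourly_cost: float, total_seconds: int) -> float:
    """Dollar cost of running the whole query set once."""
    return hourly_cost * total_seconds / 3600


costs = {name: price_perf(*row) for name, row in BENCHMARK.items()}
baseline = costs["MySQL HeatWave"]

for name, cost in costs.items():
    print(f"{name:26s} ${cost:7.2f}  {cost / baseline:.2f}x the HeatWave cost")
```

Rounded to whole dollars, these values match the Price-Perf row (58, 191, 122, 242, 394). The computed ratios land within about 0.1 of the published advantage figures; the rounding used on the slide is not stated.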

Slide 8

Slide 8 text

In Database Machine Learning
MySQL HeatWave AutoML
• Training, inference and explanations inside database
• Training is fully automated
• Explainable
• Fast
• Secure
• Scales with size of cluster
• No additional cost
AutoML pipeline (diagram): Dataset → Data preprocessing → Algorithm selection → Adaptive sampling → Feature selection → Hyper-parameter tuning → Tuned model → Model explainer, Prediction explainer

Slide 9

Slide 9 text

MySQL HeatWave AutoML use cases
Model types: Classification, Anomaly Detection, Timeseries Forecasting, Recommender System, Regression
Example use cases (in slide order; the grouping by model type was lost in extraction):
• Player churn prediction
• Classify warranty claims
• Detect anomalies in supplies
• Predict assembly line jam
• Defective part identification
• Identify game hackers
• Predict when failure will occur
• IoT digital twin failure prediction
• Predict air pollution
• Return on advertising spend prediction
• Utilization demand forecasting
• Identify similar users
• Recommend movies to viewers
• Suggest substitute products
• Recommend new products
• Loan default prediction
• Demand forecasting
• Predict flight delay
• Loan amount prediction
• Rainfall amount prediction

Slide 10

Slide 10 text

MySQL HeatWave AutoML uses a set of SQL routines
Machine Learning with MySQL HeatWave is so simple
● You only need to use a limited set of SQL routines:
✔ ML_TRAIN: Trains a machine learning model for a given training dataset
✔ ML_PREDICT_ROW: Makes predictions for one or more rows of data
✔ ML_PREDICT_TABLE: Makes predictions for a table of data
✔ ML_EXPLAIN_ROW: Explains predictions for one or more rows of data
✔ ML_EXPLAIN_TABLE: Explains predictions for a table of data
✔ ML_SCORE: Computes the quality of a model
✔ ML_MODEL_LOAD: Loads a machine learning model for predictions and explanations
✔ ML_MODEL_UNLOAD: Unloads a machine learning model
● In addition, with MySQL HeatWave ML, there is no need to move or reformat your data
● Data and machine learning models never leave the MySQL Database Service, which saves you time and effort while keeping your data and models secure
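
Editor's note: a minimal sketch of how the routines listed above might be chained for one train / load / predict / score cycle. It only builds the SQL text in Python and does not connect to a server. The schema `demo`, the table names, and the `churned` column are hypothetical. The argument order reflects the editor's reading of the HeatWave AutoML reference and should be checked against the documentation for your HeatWave version.

```python
# Hypothetical names: the schema, tables, and target column are placeholders.
# Argument lists are an assumption to verify against the AutoML reference.


def automl_workflow(schema: str, train: str, test: str, target: str) -> list[str]:
    """Return SQL statements for a train / load / predict / score cycle."""
    return [
        # Train a model and keep its handle in a session variable.
        f"CALL sys.ML_TRAIN('{schema}.{train}', '{target}', "
        "JSON_OBJECT('task', 'classification'), @model);",
        # Load the model before predicting or scoring.
        "CALL sys.ML_MODEL_LOAD(@model, NULL);",
        # Predict for every row of the test table into an output table.
        f"CALL sys.ML_PREDICT_TABLE('{schema}.{test}', @model, "
        f"'{schema}.{test}_predictions', NULL);",
        # Compute a quality metric on labelled data.
        f"CALL sys.ML_SCORE('{schema}.{test}', '{target}', @model, "
        "'accuracy', @score, NULL);",
        "SELECT @score;",
        # Unload the model once it is no longer needed.
        "CALL sys.ML_MODEL_UNLOAD(@model);",
    ]


statements = automl_workflow("demo", "players_train", "players_test", "churned")
for statement in statements:
    print(statement)
```

The statements would be run in order from any MySQL client connected to a HeatWave-enabled DB System.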

Slide 11

Slide 11 text

Machine learning with HeatWave is fast
• 25x faster than Redshift
• 1% of the cost of Redshift

Slide 12

Slide 12 text

End to End Support for MySQL HeatWave
From data sources to HeatWave; tooling integration and visualization
Diagram labels: Social, ECommerce, FinTech, SaaS; Database; Database Exports; MySQL HeatWave (InnoDB, HeatWave, OLTP, OLAP, ML, Autopilot, Lakehouse); Analytics tools; Machine Learning Tools

Slide 13

Slide 13 text

Lexicon

Slide 14

Slide 14 text

Some definitions
● Generative AI (GenAI)
✔ An artificial intelligence capable of generating text, images, videos, or other data using generative models, often in response to prompts.
● Large Language Model (LLM)
✔ A computational model notable for its ability to achieve general-purpose language generation and other natural language processing tasks such as classification. LLMs can be used for text generation, a form of generative AI, by taking an input text and repeatedly predicting the next token or word.
● Retrieval-Augmented Generation (RAG)
✔ A two-phase process involving document retrieval and answer formulation by an LLM. The initial phase utilizes dense embeddings to retrieve documents. This retrieval can be based on a variety of database formats depending on the use case, such as a vector database...
● Vector Store (Vector Database)
✔ A database that can store vectors (fixed-length lists of numbers) along with other data items. You can search the database with a query vector to retrieve the closest matching database records.
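
Editor's note: the Vector Store definition above can be illustrated with a brute-force similarity search. This is a toy sketch under stated assumptions: the records, their three-dimensional vectors, and the choice of cosine similarity are invented for illustration and say nothing about how HeatWave stores or searches vectors internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical records: each row keeps a vector next to its other data items.
STORE = [
    {"id": 1, "text": "Refund policy for annual plans", "vector": [0.9, 0.1, 0.0]},
    {"id": 2, "text": "How to reset a password", "vector": [0.1, 0.9, 0.1]},
    {"id": 3, "text": "Cancelling a subscription", "vector": [0.8, 0.2, 0.1]},
]


def search(query_vector, k=2):
    """Return the k records whose vectors are closest to the query vector."""
    ranked = sorted(
        STORE,
        key=lambda row: cosine_similarity(query_vector, row["vector"]),
        reverse=True,
    )
    return ranked[:k]


results = search([1.0, 0.0, 0.0])
for row in results:
    print(row["id"], row["text"])
```

A full scan like this is only practical for small collections; it is shown because it makes the "closest matching records" idea concrete.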

Slide 15

Slide 15 text

Generative AI

Slide 16

Slide 16 text

Generative AI in HeatWave enables new use cases
Content generation & summarization
• Generate insights from enterprise documents
• Generate blogs from PDF instruction manuals
• Summarize logs for root cause analysis
Retrieval Augmented Generation (RAG)
• Search on public and private enterprise data
• Search on unstructured data in vector store
Natural language interaction
• Natural language interaction with proprietary unstructured data
• Personalized content retrieval and response back in natural language

Slide 17

Slide 17 text

Vector store provides context to LLM for more relevant results
Users can interact with MySQL HeatWave in natural language
https://www.oracle.com/customers/estuda-tecnologias-educacionais/

Slide 18

Slide 18 text

Synergy of Generative AI and AutoML: a differentiator in HeatWave
Multiple advantages of combining HeatWave AutoML with Generative AI:
• More accurate LLM results by filtering irrelevant data
• Faster LLM inference due to smaller search space
HeatWave AutoML is advantageous in structured data analysis and detection of numerical patterns
Generative AI provides a natural language interface into data patterns that traditional ML uncovers and unstructured data that is retrieved from Vector Store
Diagram labels: HeatWave, InnoDB, HeatWave LLM, Vector Store, Natural language interaction with data, Traditional ML

Slide 19

Slide 19 text

Vector Store

Slide 20

Slide 20 text

Embeddings created for unstructured data in vector store
• Automatically generate embedding for text from multiple file formats – PDF, DOCX, HTML, PPTX, TXT
• Different ML models used for different data modalities
Diagram: Unstructured data → Parse (Text, Table, Image) → Generate vector embedding → Vector Store, holding vector embeddings (e.g. [1.0, 2.0, … ]) and metadata (e.g. {key1: val1, … })

Slide 21

Slide 21 text

Ingesting documents into HeatWave Vector Store with Lakehouse
Pipeline: Document Discovery → Parsing → Embedding Generation → Inserting into Vector Store
• Discover and list unstructured data in customer buckets
• Ingest these data files concurrently and in a load-balanced fashion across nodes
• Parse data in file formats like PDF, DOCX, HTML, PPTX, etc. leveraging existing Oracle technology
• Segment parsed text and generate embeddings for segments in parallel across nodes
• Insert embeddings along with metadata into Vector Store – a HeatWave Lakehouse table
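
Editor's note: a single-process sketch of the segment / embed / insert steps described above. Everything here is a stand-in: parsing is assumed to have already produced plain text, `toy_embedding` is a word-hashing placeholder rather than a real embedding model, the segment size and overlap are arbitrary, and the "table" is a Python list. HeatWave performs these steps in parallel across nodes with its own models.

```python
import hashlib


def segment(text, max_words=40, overlap=10):
    """Split text into word windows that overlap by `overlap` words."""
    words = text.split()
    step = max_words - overlap
    segments = []
    for start in range(0, len(words), step):
        segments.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return segments


def toy_embedding(text, dims=8):
    """Placeholder for an embedding model: counts words per hash bucket."""
    vector = [0.0] * dims
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vector[digest[0] % dims] += 1.0
    return vector


def ingest(documents):
    """documents maps a file name to its already-parsed text."""
    table = []
    for name, text in documents.items():
        for index, chunk in enumerate(segment(text)):
            table.append({
                "document": name,            # metadata kept with the vector
                "segment_number": index,
                "segment": chunk,
                "embedding": toy_embedding(chunk),
            })
    return table


# Hypothetical input: one 100-word document.
docs = {"contract.pdf": " ".join(f"word{i}" for i in range(100))}
rows = ingest(docs)
print(len(rows), "segments inserted")
```

Overlapping windows are one common way to avoid cutting a sentence's context at a segment boundary; the slides do not say which segmentation strategy HeatWave uses.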

Slide 22

Slide 22 text

Native Vector Processing in MySQL HeatWave
Vector Datatype
• MySQL & HeatWave support a new Vector data type
• In-memory hybrid-columnar storage format for vector columns
Vector Processing
• Leverage SIMD instructions for vector processing
• Processes at near memory bandwidth
Data Management
• End to end data management including embedding generation
• Integrated with features like in-bound replication

Slide 23

Slide 23 text

Vector Store can be used by RAG or SQL queries
Diagram labels:
• Recommender System: Restaurant suggestion
• HeatWave AutoML
• Vector Store ⨝ MySQL Tables
• Retrieval Agent, Augmented prompt, LLM
• Top suggested dishes from top recommended restaurants
• SQL Queries with analytics and vector operations
• Query Results: Results using a variety of business and user data
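
Editor's note: a sketch of the "augmented prompt" step, assuming the retrieval side has already returned menu segments and a recommender has already produced a set of restaurant IDs. The restaurant IDs, menu texts, and prompt wording are all invented for illustration, and the LLM call itself is left out.

```python
def build_augmented_prompt(question, segments, max_segments=3):
    """Prepend retrieved segments to the user's question as numbered context."""
    context = "\n".join(
        f"[{i + 1}] {text}" for i, text in enumerate(segments[:max_segments])
    )
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )


# Hypothetical recommender output: restaurants suggested for this user.
recommended_restaurants = {"r1", "r3"}

# Hypothetical vector-store hits: menu segments with restaurant metadata.
menu_segments = [
    {"restaurant_id": "r1", "text": "Tofu Curry: silken tofu in a mild sauce."},
    {"restaurant_id": "r2", "text": "Lamb Kebab: grilled over charcoal."},
    {"restaurant_id": "r3", "text": "Peas Curry: green peas with cumin."},
]

# The join: keep only segments that belong to a recommended restaurant.
filtered = [
    row["text"]
    for row in menu_segments
    if row["restaurant_id"] in recommended_restaurants
]

prompt = build_augmented_prompt("Which vegetarian dish should I order?", filtered)
print(prompt)
```

Filtering before prompting is what keeps the context small, which is the "smaller search space" advantage the deck attributes to combining AutoML with Generative AI.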

Slide 24

Slide 24 text

Use Cases

Slide 25

Slide 25 text

Use Case 1: Employee Assistant
Employee Assistant: Improve employee productivity
Diagram labels: Natural language; User’s PDF Contracts in Object store; Vector store ingest; Retrieval Augmented Generation

Slide 26

Slide 26 text

Synergy of Generative AI and AutoML in HeatWave
A differentiator in HeatWave
Advantages:
• More accurate LLM results by filtering irrelevant data
• Faster LLM inference due to smaller search space
Diagram labels: HeatWave, InnoDB, HeatWave LLM, Vector Store, Natural language interaction with data, Traditional ML

Slide 27

Slide 27 text

Use Case 2: Report Generation (Cohere Command Model)
Incident Report - Content generation/summary
Report Generation: Summarization of anomalous logs
Ask the question: ‘What is the main problem in the following collection of logs. Provide a two-sentence summary.’
Response: “The main problem in this collection of logs is that the memory usage of a particular process (with ID 8145) is consistently exceeding the defined memory threshold. This issue is repeatedly highlighted in the logs, and SIGTERM signals are sent to the main thread to terminate the process when it exceeds the memory threshold. Would you like me to help you with anything else regarding this collection of logs?”

Slide 28

Slide 28 text

Use Case 2: Report Generation
Incident Report - Content generation/summary
Report Generation: HW AutoML + Summarization of anomalous logs
• Continuously ingest unstructured text logs
• HeatWave AutoML, Unsupervised Anomaly Detection: Detect anomalous sequences of logs
• HeatWave Generative AI: Summarize incidents from sequences of logs (Augmented prompt, LLM)
• Generate incident summaries: Produce incident reports in natural language for an operator
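
Editor's note: a sketch of the two stages above, detection followed by prompt construction. The detector is a deliberately simple stand-in (it flags log lines whose message template is rare) and is not the unsupervised algorithm HeatWave AutoML uses. The log lines and the 20% threshold are invented; the question text is the one quoted on the previous slide.

```python
import re
from collections import Counter

QUESTION = (
    "What is the main problem in the following collection of logs. "
    "Provide a two-sentence summary."
)


def template(line):
    """Collapse numbers so lines that differ only in IDs share a template."""
    return re.sub(r"\d+", "#", line)


def anomalous_lines(lines, max_share=0.2):
    """Stand-in detector: keep lines whose template is at most max_share of all lines."""
    counts = Counter(template(line) for line in lines)
    return [
        line for line in lines
        if counts[template(line)] / len(lines) <= max_share
    ]


def incident_prompt(lines):
    """Build the summarization prompt from the flagged log lines."""
    return QUESTION + "\n" + "\n".join(lines)


# Hypothetical log stream.
logs = [
    "request 101 served in 12 ms",
    "request 102 served in 15 ms",
    "request 103 served in 11 ms",
    "request 104 served in 14 ms",
    "process 8145 exceeded memory threshold, sending SIGTERM",
    "request 105 served in 13 ms",
]

flagged = anomalous_lines(logs)
prompt = incident_prompt(flagged)
print(prompt)
```

Only the flagged lines reach the LLM, so the prompt stays short even when the raw log volume is large.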

Slide 29

Slide 29 text

Use case 3: Personalization - Recommend dishes based on preferences
Online Food Delivery App - RAG

Slide 30

Slide 30 text

Use case 3: Personalization - Recommend dishes based on preferences
Online Food Delivery App - RAG
Personalized Menu: HW AutoML + Retrieval Augmented Generation
Recommend, Retrieve, and Generate descriptions of dishes based on user preference
Restaurant menu: “Tofu Curry”, “Tofu Biryani”, “Peas Curry”

Slide 31

Slide 31 text

Summary
Real-time analytics and GenAI with MySQL HeatWave
• HeatWave enables processing data in object store or MySQL database
• Best performance and price performance in the industry
• Single service for machine learning, GenAI, analytics, OLTP
• GenAI enables new applications
• Vector processing enables querying unstructured content
• Available from $70/month

Slide 32

Slide 32 text

Follow us on Social Media
“Data is the Oxygen of Business”

Slide 33

Slide 33 text

Get $300 in credits and try free for 30 days
• Get started with MySQL HeatWave: oracle.com/mysql/free
• Learn more about MySQL HeatWave: oracle.com/mysql
• Request a guided workshop: Ask your account manager

Slide 34

Slide 34 text

Thank you! Q&A
Olivier Dasini, Cloud Solutions Architect @ Oracle MySQL
[email protected]
Blogs: www.dasini.net/blog/en, www.dasini.net/blog/fr
LinkedIn: www.linkedin.com/in/olivier-dasini
Twitter: @freshdaz
