Unleashing the Power of Gen AI with Enterprise Data: The Role of MySQL HeatWave's Vector Store

Unleash the Power of AI for Your Business with MySQL HeatWave
Uncover deeper insights and unlock the potential of large language models (LLMs) with your proprietary data.

MySQL HeatWave's latest advancements empower you to:
- Fuel highly accurate AI applications: Leverage LLMs trained on your specific data for superior results compared to models trained on public data alone.
- Ask questions in plain English: Interact with your data naturally through intuitive natural language search.
- Effortlessly find what you need: Efficiently search documents across various file formats within HeatWave Lakehouse.
Learn how MySQL HeatWave can revolutionize your AI journey.

Olivier DASINI

June 05, 2024

Transcript

  1. Unleashing the Power of Gen AI with Enterprise Data: The Role of MySQL HeatWave's Vector Store

    Olivier Dasini, Cloud Solutions Architect @ Oracle MySQL
    [email protected]
    Blogs: www.dasini.net/blog/en, www.dasini.net/blog/fr
    LinkedIn: www.linkedin.com/in/olivier-dasini
    Slides: https://speakerdeck.com/freshdaz
    Oracle Dev Days - 16/05/2024
  2. Me, Myself & I

    Copyright © 2024, Oracle and/or its affiliates
    Olivier DASINI
    • MySQL Geek
    • Enjoying databases for 20+ years
    • Addicted to MySQL for 15+ years
    • MySQL Writer, Blogger and Speaker
    • Also former: DBA, Consultant, Architect, Trainer, ...
    • Cloud Solutions Architect at Oracle MySQL
    • Stay up to date!
      • Blog: www.dasini.net/blog/en
      • LinkedIn: www.linkedin.com/in/olivier-dasini/
      • Slides: https://speakerdeck.com/freshdaz
  3. Agenda

    1. MySQL HeatWave Overview
    2. Generative AI
    3. Vector Store
    4. Use Cases
  4. MySQL HeatWave
  5. The MySQL universe - The view from the moon…

    Community, Enterprise, Cloud Services (HeatWave)
    • MySQL Commercial/Enterprise: MySQL Community + MySQL Enterprise Backup, MySQL Enterprise Monitor, MySQL Enterprise Authentication, MySQL Enterprise Audit, MySQL Enterprise TDE, MySQL Enterprise Masking, MySQL Enterprise Firewall, MySQL Technical Support, …
    • MySQL Cluster CGE: MySQL Cluster NDB + MySQL Enterprise + MySQL Cluster Manager
    • MySQL Community: MySQL Server; MySQL Client, Workbench; MySQL Shell; MySQL GR plugin & InnoDB Cluster & Router; MySQL Operator for Kubernetes; MySQL Connector (C API, Java, Node.js, others); MySQL Support for MS VS Code (Preview); …
    • MySQL Cluster NDB: MySQL NDB Storage Engine; MySQL NDB Operator for Kubernetes
    • MySQL HeatWave (Cloud Services): MySQL HeatWave Databases Services (for OLTP); MySQL HeatWave Analytics (Data Warehouse); MySQL HeatWave Lakehouse; MySQL HeatWave AutoML (for Machine Learning); MySQL HeatWave on AWS
    https://www.mysql.com/products
  6. MySQL HeatWave

    Transactions, real-time analytics across data warehouse & data lake, & machine learning in 1 database service.
    Process ALL workloads with MySQL HeatWave.
    • Data sources: enterprise apps, web/social, log files, IoT
    • Diagram labels: OLTP, queries, results, object store, database exports, streaming data, MySQL storage, MySQL HeatWave Analytics, in-database ML, Autopilot
    • Social, eCommerce, IoT, gaming, fintech apps; analytics & ML tools
    • Scales from 16 GB to 512 TB
    • Available for $70/month
  7. TPC-DS 100TB

    Best performance in the industry for query and load at the lowest price.

    | | MySQL HeatWave | Snowflake 3XLarge | Redshift 10 ra3.16xlarge | BigQuery 3200 slots | Databricks 2XLarge |
    |---|---|---|---|---|---|
    | Hourly cost ($) | 56.43 | 128 | 86.06 | 74.56 | 103.39 |
    | Load time (hrs) | 1.21 | 3.3 | 7.74 | 3.63 | 7.46 |
    | HeatWave load advantage | | 2.7x | 6.4x | 3x | 6.1x |
    | Total time (seconds) | 3,719 | 5,379 | 5,108 | 11,694 | 13,704 |
    | Price-perf ($) | 58 | 191 | 122 | 242 | 394 |
    | HeatWave price-perf advantage | | 3.3x | 2.1x | 4.1x | 6.8x |

    Benchmark queries are derived from the TPC-DS benchmarks, but results are not comparable to published TPC-DS benchmark results since these do not comply with the TPC-DS specifications.
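
The slide does not say how the price-perf row is derived. A small sketch, assuming it is simply the hourly cost multiplied by the total query time in hours (an inference from the published figures, not something the deck states), reproduces that row to the nearest dollar. The advantage ratios computed this way land close to the slide's values, with small differences that look like rounding or truncation.

```python
# Figures copied from the benchmark slide. The formula below is an inference
# from those figures, not a definition given in the deck.
SYSTEMS = {
    "MySQL HeatWave": {"hourly_cost": 56.43, "total_seconds": 3719},
    "Snowflake 3XLarge": {"hourly_cost": 128.0, "total_seconds": 5379},
    "Redshift 10 ra3.16xlarge": {"hourly_cost": 86.06, "total_seconds": 5108},
    "BigQuery 3200 slots": {"hourly_cost": 74.56, "total_seconds": 11694},
    "Databricks 2XLarge": {"hourly_cost": 103.39, "total_seconds": 13704},
}


def price_perf(hourly_cost, total_seconds):
    """Dollar cost of running the whole query set once (assumed formula)."""
    return hourly_cost * total_seconds / 3600


def advantage(baseline, other):
    """How many times more expensive `other` is than `baseline` per run."""
    return price_perf(**SYSTEMS[other]) / price_perf(**SYSTEMS[baseline])


if __name__ == "__main__":
    for name, figures in SYSTEMS.items():
        print(f"{name}: ${price_perf(**figures):.0f} per run")
```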
  8. In Database Machine Learning

    • Training, inference and explanations inside database
    • Training is fully automated
    • Explainable
    • Fast
    • Secure
    • Scales with size of cluster
    • No additional cost
    MySQL HeatWave AutoML: Dataset → Data preprocessing → Algorithm selection → Adaptive sampling → Feature selection → Hyper-parameter tuning → Tuned model; Model explainer, Prediction explainer
  9. MySQL HeatWave AutoML use cases

    • Classification and Regression: Player churn prediction, Classify warranty claims, Loan default prediction, Demand forecasting, Predict flight delay, Loan amount prediction, Rain fall amount prediction
    • Anomaly Detection: Detect anomalies in supplies, Predict assembly line jam, Defective part identification, Identify game hackers
    • Timeseries Forecasting: Predict when failure will occur, IoT digital twin failure prediction, Predict air pollution, Return on advertising spend prediction, Utilization demand forecasting
    • Recommender System: Identify similar users, Recommend movies to viewers, Suggest substitute products, Recommend new products
  10. MySQL HeatWave AutoML uses a set of SQL routines

    Machine Learning with MySQL HeatWave is so simple
    • You only need to use a limited set of SQL routines:
      ✔ ML_TRAIN: Trains a machine learning model for a given training dataset
      ✔ ML_PREDICT_ROW: Makes predictions for one or more rows of data
      ✔ ML_PREDICT_TABLE: Makes predictions for a table of data
      ✔ ML_EXPLAIN_ROW: Explains predictions for one or more rows of data
      ✔ ML_EXPLAIN_TABLE: Explains predictions for a table of data
      ✔ ML_SCORE: Computes the quality of a model
      ✔ ML_MODEL_LOAD: Loads a machine learning model for predictions and explanations
      ✔ ML_MODEL_UNLOAD: Unloads a machine learning model
    • In addition, with MySQL HeatWave ML, there is no need to move or reformat your data
    • Data and machine learning models never leave the MySQL Database Service, which saves you time and effort while keeping your data and models secure
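
To give a feel for how these routines chain together, here is a minimal Python sketch that only assembles the SQL text for one train, load, predict, score and unload cycle. The argument order follows my reading of the HeatWave AutoML documentation and should be checked against the release you run; the helper name, the schema and table names, the `_predictions` suffix and the `balanced_accuracy` metric choice are illustrative assumptions, not part of the deck.

```python
def automl_workflow(schema, train_table, test_table, target, task="classification"):
    """Return SQL statements for one AutoML cycle (hypothetical helper).

    Identifiers are interpolated directly, so pass trusted names only.
    The routine signatures are assumptions to verify against the docs.
    """
    train = f"{schema}.{train_table}"
    test = f"{schema}.{test_table}"
    return [
        # Train a model and store its handle in a session variable.
        f"CALL sys.ML_TRAIN('{train}', '{target}', "
        f"JSON_OBJECT('task', '{task}'), @model);",
        # Load the model before predicting, explaining or scoring.
        "CALL sys.ML_MODEL_LOAD(@model, NULL);",
        # Write predictions for a whole table into an output table.
        f"CALL sys.ML_PREDICT_TABLE('{test}', @model, '{test}_predictions', NULL);",
        # Compute a quality metric into @score.
        f"CALL sys.ML_SCORE('{test}', '{target}', @model, "
        "'balanced_accuracy', @score, NULL);",
        # Release the model when done.
        "CALL sys.ML_MODEL_UNLOAD(@model);",
    ]


if __name__ == "__main__":
    for statement in automl_workflow("demo", "churn_train", "churn_test", "churned"):
        print(statement)
```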
  11. Machine learning with HeatWave is fast

    • 25x faster than Redshift
    • 1% of the cost of Redshift
  12. End to End Support for MySQL HeatWave

    From data sources to HeatWave; tooling integration and visualization.
    Diagram labels: Social, ECommerce, FinTech, SaaS, InnoDB, HeatWave, OLTP, OLAP, ML Tools, Machine Learning, Autopilot, Lakehouse, Database Exports, MySQL HeatWave, Analytics tools, Database
  13. Some definitions

    • Generative AI (GenAI)
      ✔ An artificial intelligence capable of generating text, images, videos, or other data using generative models, often in response to prompts.
    • Large Language Model (LLM)
      ✔ A computational model notable for its ability to achieve general-purpose language generation and other natural language processing tasks such as classification. LLMs can be used for text generation, a form of generative AI, by taking an input text and repeatedly predicting the next token or word.
    • Retrieval-Augmented Generation (RAG)
      ✔ A two-phase process involving document retrieval and answer formulation by an LLM. The initial phase utilizes dense embeddings to retrieve documents. This retrieval can be based on a variety of database formats depending on the use case, such as a vector database...
    • Vector Store (Vector Database)
      ✔ A database that can store vectors (fixed-length lists of numbers) along with other data items. You can search the database with a query vector to retrieve the closest matching database records.
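
The vector store definition can be made concrete with a toy in-memory version. This is a sketch only: it uses cosine similarity as the notion of "closest" and a linear scan, both of which are assumptions for illustration; the deck does not say which distance metric or index HeatWave uses, and `ToyVectorStore` is a hypothetical name.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Stores (vector, payload) rows and returns the payloads closest to a query."""

    def __init__(self):
        self._rows = []

    def add(self, vector, payload):
        self._rows.append((list(vector), payload))

    def search(self, query, k=1):
        # Linear scan: rank every stored row by similarity to the query vector.
        ranked = sorted(
            self._rows,
            key=lambda row: cosine_similarity(query, row[0]),
            reverse=True,
        )
        return [payload for _, payload in ranked[:k]]


if __name__ == "__main__":
    store = ToyVectorStore()
    store.add([1.0, 0.0], "contract clause")
    store.add([0.0, 1.0], "restaurant menu")
    print(store.search([0.9, 0.1], k=1))
```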
  14. Generative AI
  15. Generative AI in HeatWave enables new use cases

    • Content generation & summarization
      • Generate insights from enterprise documents
      • Generate blogs from PDF instruction manuals
      • Summarize logs for root cause analysis
    • Retrieval Augmented Generation (RAG)
      • Search on public and private enterprise data
      • Search on unstructured data in vector store
    • Natural language interaction
      • Natural language interaction with proprietary unstructured data
      • Personalized content retrieval and response back in natural language
  16. Vector store provides context to LLM for more relevant results

    Users can interact with MySQL HeatWave in natural language.
    https://www.oracle.com/customers/estuda-tecnologias-educacionais/
  17. Synergy of Generative AI and AutoML: a differentiator in HeatWave

    Multiple advantages of combining HeatWave AutoML with Generative AI:
    • More accurate LLM results by filtering irrelevant data
    • Faster LLM inference due to smaller search space
    HeatWave AutoML is advantageous in structured data analysis and detection of numerical patterns.
    Generative AI provides a natural language interface into data patterns that traditional ML uncovers and unstructured data that is retrieved from Vector Store.
    Diagram labels: HeatWave, InnoDB, LLM, Vector Store, natural language interaction with data, traditional ML
  18. Vector Store
  19. Embeddings created for unstructured data in vector store

    • Automatically generate embedding for text from multiple file formats: PDF, DOCX, HTML, PPTX, TXT
    • Different ML models used for different data modalities
    Diagram: Unstructured data → Parse (Text, Table, Image) → Generate vector embedding → Vector Store holding vector embeddings such as [1.0, 2.0, …] and metadata such as {key1: val1, …}
  20. Ingesting documents into HeatWave Vector Store with Lakehouse

    Stages: Document Discovery → Parsing → Embedding Generation → Inserting into Vector Store
    • Discover and list unstructured data in customer buckets
    • Ingest these data files concurrently and in a load balanced fashion across nodes
    • Parse data in file formats like PDF, DOCX, HTML, PPTX, etc. leveraging existing Oracle technology
    • Segment parsed text and generate embeddings for segments in parallel across nodes
    • Insert embeddings along with metadata into Vector Store, a HeatWave Lakehouse table
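
The segment, embed and insert steps can be sketched in a few lines of Python. Everything here is a stand-in: the fixed word-count segmentation, the hashed bag-of-words `toy_embedding` (a real deployment would call an embedding model) and the row layout are assumptions for illustration, and the sketch runs sequentially rather than in parallel across nodes.

```python
import zlib


def segment(text, max_words=50):
    """Split parsed text into consecutive segments of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embedding(text, dims=8):
    """Hashed bag-of-words vector; a placeholder for a real embedding model."""
    vector = [0.0] * dims
    for word in text.lower().split():
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        vector[zlib.crc32(word.encode("utf-8")) % dims] += 1.0
    return vector


def ingest(documents, max_words=50):
    """Turn {document name: parsed text} into rows of embedding plus metadata."""
    rows = []
    for name, text in documents.items():
        for index, chunk in enumerate(segment(text, max_words)):
            rows.append({
                "document": name,
                "segment": index,
                "text": chunk,
                "embedding": toy_embedding(chunk),
            })
    return rows


if __name__ == "__main__":
    for row in ingest({"manual.txt": "press the red button to stop the line"}, max_words=4):
        print(row["document"], row["segment"], row["text"])
```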
  21. Native Vector Processing in MySQL HeatWave

    • Vector Datatype
      • MySQL & HeatWave support a new Vector data type
      • In-memory hybrid-columnar storage format for vector columns
    • Vector Processing
      • Leverage SIMD instructions for vector processing
      • Processes at near memory bandwidth
    • Data Management
      • End to end data management including embedding generation
      • Integrated with features like in-bound replication
  22. Vector Store can be used by RAG or SQL queries

    • RAG: Recommender System (HeatWave AutoML), Vector Store, Retrieval Agent, augmented prompt, LLM. Example: top suggested dishes from top recommended restaurants.
    • SQL queries with analytics and vector operations: Vector Store ⨝ MySQL Tables, query results. Example: restaurant suggestion; results using a variety of business and user data.
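
The RAG path on this slide comes down to retrieving relevant segments and placing them in the prompt ahead of the question. A minimal sketch follows, with the retriever and the LLM passed in as plain callables; the prompt wording and both function names are assumptions for illustration, not HeatWave's actual prompt or API.

```python
def build_augmented_prompt(question, segments, max_segments=3):
    """Prepend retrieved segments to the question as context for the LLM."""
    context = "\n".join(f"- {segment}" for segment in segments[:max_segments])
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


def answer(question, retrieve, generate):
    """Retrieve context for the question, then ask the model with the augmented prompt.

    retrieve: callable taking the question and returning a list of text segments.
    generate: callable taking a prompt string and returning the model's reply.
    """
    segments = retrieve(question)
    return generate(build_augmented_prompt(question, segments))


if __name__ == "__main__":
    menu = ["Tofu Curry", "Tofu Biryani", "Peas Curry"]
    # Stub retriever and model so the sketch runs without any external service.
    print(answer("Which dishes contain tofu?", lambda q: menu, lambda prompt: prompt))
```

Keeping retrieval and generation as injected callables means the same flow works whether the segments come from a vector search, a SQL join, or a recommender's shortlist.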
  23. Use Cases
  24. Use Case 1: Employee Assistant

    Employee Assistant: Improve employee productivity
    Diagram labels: natural language; user's PDF contracts in Object store; vector store ingest; Retrieval Augmented Generation
  25. Synergy of Generative AI and AutoML in HeatWave

    A differentiator in HeatWave. Advantages:
    • More accurate LLM results by filtering irrelevant data
    • Faster LLM inference due to smaller search space
    Diagram labels: HeatWave, InnoDB, LLM, Vector Store, natural language interaction with data, traditional ML
  26. Use Case 2: Report Generation (Cohere Command Model)

    Incident Report - Content generation/summary
    Report Generation: Summarization of anomalous logs
    Ask the question: "What is the main problem in the following collection of logs. Provide a two-sentence summary."
    Response: "The main problem in this collection of logs is that the memory usage of a particular process (with ID 8145) is consistently exceeding the defined memory threshold. This issue is repeatedly highlighted in the logs, and SIGTERM signals are sent to the main thread to terminate the process when it exceeds the memory threshold. Would you like me to help you with anything else regarding this collection of logs?"
  27. Use Case 2: Report Generation

    Incident Report - Content generation/summary
    Report Generation: HW AutoML + Summarization of anomalous logs
    • Continuously ingest unstructured text logs
    • Detect anomalous sequences of logs (HeatWave AutoML, unsupervised anomaly detection)
    • Summarize incidents from sequences of logs and generate incident summaries (HeatWave Generative AI: augmented prompt, LLM)
    • Produce incident reports in natural language for an operator
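
The two-stage idea, flag unusual logs first and summarize only those, can be sketched as follows. The rarity filter is a deliberately crude stand-in for HeatWave AutoML's unsupervised anomaly detection (it just keeps lines that make up a small share of the batch), and the function names and `max_share` threshold are assumptions; the question text is the one shown on the Cohere slide.

```python
from collections import Counter

QUESTION = (
    "What is the main problem in the following collection of logs. "
    "Provide a two-sentence summary."
)


def rare_lines(log_lines, max_share=0.1):
    """Keep lines whose share of the batch is at most max_share (toy anomaly filter)."""
    counts = Counter(log_lines)
    total = len(log_lines)
    return [line for line in log_lines if counts[line] / total <= max_share]


def incident_prompt(log_lines, max_share=0.1):
    """Build the summarization prompt from flagged lines, or None if nothing is flagged."""
    flagged = rare_lines(log_lines, max_share)
    if not flagged:
        return None
    return QUESTION + "\n" + "\n".join(flagged)


if __name__ == "__main__":
    logs = ["heartbeat ok"] * 19 + ["memory threshold exceeded by process 8145"]
    print(incident_prompt(logs))
```

Sending only the flagged lines keeps the prompt short, which is the "smaller search space" benefit the deck attributes to combining AutoML with the LLM.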
  28. Use Case 3: Personalization - Recommend dishes based on preferences

    Online Food Delivery App - RAG
  29. Use Case 3: Personalization - Recommend dishes based on preferences

    Online Food Delivery App - RAG
    Personalized Menu: HW AutoML + Retrieval Augmented Generation
    Recommend, Retrieve, and Generate descriptions of dishes based on user preference.
    Restaurant menu: "Tofu Curry", "Tofu Biryani", "Peas Curry"
  30. Summary

    Real-time analytics and GenAI with MySQL HeatWave
    • HeatWave enables processing data in object store or MySQL database
    • Best performance and price performance in the industry
    • Single service for machine learning, GenAI, analytics, OLTP
    • GenAI enables new applications
    • Vector processing enables querying unstructured content
    • Available from $70/month
  31. Follow us on Social Media

    "Data is the Oxygen of Business"
  32. Get $300 in credits and try free for 30 days

    • Get started with MySQL HeatWave: oracle.com/mysql/free
    • Learn more about MySQL HeatWave: oracle.com/mysql
    • Request a guided workshop: ask your account manager
  33. Thank you! Q&A

    Olivier Dasini, Cloud Solutions Architect @ Oracle MySQL
    [email protected]
    Blogs: www.dasini.net/blog/en, www.dasini.net/blog/fr
    LinkedIn: www.linkedin.com/in/olivier-dasini
    Twitter: @freshdaz