Slide 1

Slide 1 text

MySQL HeatWave Lakehouse
Modernize MySQL & non-MySQL workloads with MySQL HeatWave
Olivier Dasini, MySQL Cloud Principal Solutions Architect EMEA
[email protected]
Blogs: www.dasini.net/blog/en, www.dasini.net/blog/fr
LinkedIn: www.linkedin.com/in/olivier-dasini
August 2023

Slide 2

Slide 2 text

Me, Myself & I
Olivier DASINI
• MySQL Geek
• Addicted to MySQL for 15+ years
• Playing with databases for 20+ years
• MySQL Writer, Blogger and Speaker
• Also: DBA, Consultant, Architect, Trainer, ...
• MySQL Cloud Principal Solutions Architect EMEA at Oracle
• Stay up to date!
• Blog: www.dasini.net/blog/en
• LinkedIn: www.linkedin.com/in/olivier-dasini/
• Twitter: @freshdaz
Copyright © 2023, Oracle and/or its affiliates. All rights reserved.

Slide 3

Slide 3 text

Oracle Cloud Infrastructure Global Locations
45 regions in 23 countries, including Paris & Marseille; 12 Azure Interconnect Regions
MySQL HeatWave Database Service is part of all of them
Also: Cloud@Customer & EU Sovereign Cloud
100% renewable energy by 2025
https://www.oracle.com/cloud/public-cloud-regions (August 2023)

Slide 4

Slide 4 text

Oracle Cloud Infrastructure Europe Locations
MySQL HeatWave Database Service is part of all of them
https://www.oracle.com/cloud/public-cloud-regions (August 2023)

Slide 5

Slide 5 text

MySQL HeatWave – optimized for analytics, machine learning and OLTP

Slide 6

Slide 6 text

Existing applications work without changes
Oracle Analytics Cloud is integrated with MySQL HeatWave

Slide 7

Slide 7 text

Already best performance in industry for data warehouse
TPC-H 10TB
• 4.2x faster than Redshift (10 nodes of ra3.4xlarge)
• 3.3x faster than Snowflake (X-Large cluster)
• 5.6x faster than BigQuery (800 slots)
• 7.4x faster than Databricks (Large cluster)
Benchmark queries are derived from the TPC-H benchmarks, but results are not comparable to published TPC-H benchmark results since these do not comply with the TPC-H specifications.
https://www.oracle.com/mysql/heatwave/performance/#heatwave-on-oci

Slide 8

Slide 8 text

Already lowest cost in industry for data warehouse
TPC-H 10TB price performance comparison
• 13x better than Redshift (3 year reserved, paid upfront)
• 28x better than Snowflake (Standard Edition)
• 28x better than BigQuery (1 year reserved)
• 62x better than Databricks (1 year reserved)
Benchmark queries are derived from the TPC-H benchmarks, but results are not comparable to published TPC-H benchmark results since these do not comply with the TPC-H specifications.
https://www.oracle.com/mysql/heatwave/performance/#heatwave-on-oci

Slide 9

Slide 9 text

Fully automated in-database machine learning
Training, inference, explanation with HeatWave AutoML
Model types: Classification, Regression, Recommender System, Anomaly Detection, Time-series forecasting
Example use cases: classify warranty claims, identify similar users, recommend movies, loan default prediction, predict flight delay, rainfall prediction, predict advertising spend ROI, demand forecasting, detect anomalous credit card spend, identify game hacker
• In-database
• Secure
• Fully automated
• 25x faster than Redshift ML
• Explainable
• No additional cost

Slide 10

Slide 10 text

MySQL HeatWave is available in multiple clouds
Optimized for price performance in each cloud

Slide 11

Slide 11 text

MySQL HeatWave on AWS
Create & manage a MySQL DB System with a HeatWave Cluster to use with AWS applications
• MySQL HeatWave runs natively on AWS, optimized for AWS infrastructure
• Data doesn’t leave AWS – saves egress cost, and avoids compliance approvals
• Lowest latency access to MySQL HeatWave
• Tight integration with the AWS ecosystem – S3, CloudWatch, PrivateLink
• Easier migration from other databases (e.g., Amazon Aurora, Redshift, Snowflake)
OCI and AWS Regions – August 2023
https://dev.mysql.com/doc/heatwave-aws/en

Slide 12

Slide 12 text

MySQL HeatWave on Azure
Connecting to MySQL HeatWave on OCI from Azure VNET
• Familiar Azure-native user experience
• Automated identity, networking, and monitoring integration
• Private interconnect and networking with < 2 ms latency
• Use Microsoft Azure services with MySQL HeatWave
• Collaborative support
https://www.oracle.com/cloud/azure/oracle-database-for-azure

Slide 13

Slide 13 text

MySQL HeatWave customer momentum
Data warehouse, machine learning and OLTP workloads
https://www.oracle.com/customers/?product=mpd-cld-infra:db-services:mysql-heatwave

Slide 14

Slide 14 text

Massive amount of data stored outside databases
• Databases are systems of record
• Files are the repository for other types of data (e.g., IoT, web content, log files)
• 99.5% of collected data remains unused
[Diagram labels: Object Store, Devices, Sensors, Social, Events]

Slide 15

Slide 15 text

Announced General Availability of MySQL HeatWave Lakehouse

Slide 16

Slide 16 text

HeatWave Lakehouse can query object store and MySQL database
Data stays in object store, processed by HeatWave
[Diagram labels: Query, InnoDB, Object Store, AWS Aurora export, Redshift export, OLTP, Analytics, Machine Learning, Lakehouse, MySQL Autopilot]

Slide 17

Slide 17 text

HeatWave Lakehouse
Main benefits
1. Query data in object storage
• Querying in HeatWave
• Scale to 512 nodes, 512 TB
• CSV, Parquet, Aurora & Redshift exports
• Fastest load, best price-performance
2. Use standard MySQL syntax
• 100% compatible with MySQL syntax
• Use MySQL Autopilot to auto-infer schema, estimate capacity and load times, and generate load scripts
3. Combine OLTP data with object store data
• Treat data lake data as tables
• Use select, join, aggregations, filters, etc. to combine data in OLTP tables with data lake tables

Slide 18

Slide 18 text

HeatWave scales out
Flexible, fast, and highly scalable
Scale to any cluster size
• Scale to any size up to 512 nodes
• Scale up or scale down
Real-time scaling
• System is always available for all operations
• Data in the cluster is always balanced
Highly scalable
• Query performance scales very well with cluster size
• Load performance scales very well with cluster size
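As a back-of-the-envelope illustration of the scale-out limits, the sketch below turns the two figures quoted in this deck (512 nodes, 512 TB) into a rough node count for a given data size. It assumes capacity scales linearly at about 1 TB per node, which is only an inference from those two maxima; it is not how MySQL Autopilot estimates cluster size, and `estimate_nodes` is a hypothetical helper name.

```python
import math

MAX_NODES = 512      # from the deck: "Scale to 512 nodes"
MAX_DATA_TB = 512    # from the deck: "512 TB"
# Assumption: capacity grows linearly with node count (about 1 TB per node).
TB_PER_NODE = MAX_DATA_TB / MAX_NODES


def estimate_nodes(data_tb: float) -> int:
    """Rough node count for data_tb terabytes under the linear assumption."""
    if data_tb <= 0:
        raise ValueError("data size must be positive")
    nodes = math.ceil(data_tb / TB_PER_NODE)
    if nodes > MAX_NODES:
        raise ValueError(f"{data_tb} TB exceeds the {MAX_DATA_TB} TB maximum")
    return nodes


for size_tb in (0.5, 10, 500):
    print(f"{size_tb} TB -> about {estimate_nodes(size_tb)} node(s)")
```

The linear rule agrees with the 10 TB TPC-H runs reported later in the deck, which used 10 HeatWave nodes, but real sizing depends on the data and should come from Autopilot's own estimate.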

Slide 19

Slide 19 text

Three simple steps to query data in object store
Fully compatible MySQL syntax generated by Autopilot, no human required
1. Run MySQL Autopilot on data in object store
mysql> CALL sys.heatwave_load(,);
OUTPUT: DDLs automatically generated
2. Execute DDLs generated by Autopilot
mysql> CREATE TABLE `cust1DB`.`Sensor` (date DATE, degree INT)
    -> ENGINE=Lakehouse SECONDARY_ENGINE=Rapid
    -> ENGINE_ATTRIBUTE = '{"file":[{"prefix":"sensor1-April", "par":""}]}';
mysql> ALTER TABLE `cust1DB`.`Sensor` SECONDARY_LOAD;
3. Query across file and table
mysql> SELECT count(*) FROM Sensor, SALES WHERE Sensor.degree > 30 AND Sensor.date = SALES.date;
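The DDL in step 2 embeds a JSON document in ENGINE_ATTRIBUTE, and quotes copied from a slide often turn into typographic quotes that are valid in neither SQL nor JSON. The Python sketch below assembles the same two statements with `json.dumps`, so the attribute always uses plain double quotes. The function name `build_lakehouse_ddl` and the placeholder PAR value are assumptions for illustration (the slide leaves "par" empty), and the sketch does no SQL escaping, so it is only suitable for trusted identifiers.

```python
import json


def build_lakehouse_ddl(schema, table, columns, prefix, par_url):
    """Return (create_stmt, load_stmt) in the shape shown on the slide.

    Hypothetical helper: it only formats strings and never contacts a database.
    """
    # json.dumps emits plain ASCII double quotes, never typographic ones.
    engine_attribute = json.dumps({"file": [{"prefix": prefix, "par": par_url}]})
    column_list = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    create_stmt = (
        f"CREATE TABLE `{schema}`.`{table}` ({column_list}) "
        "ENGINE=Lakehouse SECONDARY_ENGINE=Rapid "
        f"ENGINE_ATTRIBUTE='{engine_attribute}';"
    )
    load_stmt = f"ALTER TABLE `{schema}`.`{table}` SECONDARY_LOAD;"
    return create_stmt, load_stmt


create_stmt, load_stmt = build_lakehouse_ddl(
    "cust1DB",
    "Sensor",
    [("date", "DATE"), ("degree", "INT")],
    prefix="sensor1-April",
    par_url="<pre-authenticated-request-URL>",  # placeholder, not a real value
)
print(create_stmt)
print(load_stmt)
```

In practice Autopilot generates these statements, so a helper like this is mainly useful for checking or templating scripts by hand.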

Slide 20

Slide 20 text

MySQL Autopilot: machine learning-powered automation
Workload-aware automation for analytics, OLTP and object store

Slide 21

Slide 21 text

Auto provisioning with MySQL HeatWave Lakehouse
How to determine the right cluster size required for processing data in object store?

Slide 22

Slide 22 text

Auto schema inference with HeatWave Lakehouse
…Even for files that don’t have metadata!

Slide 23

Slide 23 text

Inferring schema mapping from file metadata has limitations
Adaptive data sampling in MySQL Autopilot is fast and accurate
[Diagram: adaptive data sampling]
MySQL Autopilot performance on 1 node:
TPCH LINEITEM size    Autopilot time (sec)
1 TB                  8
20 TB                 13
75 TB                 15
200 TB                25
300 TB                40
400 TB                47
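To make the scaling in these measurements concrete, the short Python sketch below compares how much the data grows against how much the Autopilot time grows. The numbers are copied from the slide; the variable names are illustrative only.

```python
# (LINEITEM size in TB, Autopilot time in seconds), as listed on the slide
measurements = [(1, 8), (20, 13), (75, 15), (200, 25), (300, 40), (400, 47)]

base_tb, base_sec = measurements[0]
for tb, sec in measurements:
    print(
        f"{tb:>4} TB: {sec:>3} s | data x{tb / base_tb:.0f}, "
        f"time x{sec / base_sec:.2f}, {sec / tb:.3f} s per TB"
    )

largest_tb, largest_sec = measurements[-1]
data_factor = largest_tb / base_tb     # growth in data volume
time_factor = largest_sec / base_sec   # growth in Autopilot time
print(f"Data grew {data_factor:.0f}x while time grew {time_factor:.1f}x")
```

Across the listed range the data volume grows 400x while the reported time grows by less than 6x, which is what a sampling-based approach would be expected to show.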

Slide 24

Slide 24 text

Load performance
MySQL HeatWave is faster to load data and less expensive
500 TB TPCH           Annual Cost   Pricing Term       Load Time (hrs)
HeatWave Lakehouse    $1,742,036    PAYG               4.43
Snowflake             $2,300,160    Standard Edition   9.04 (2x)
Redshift              $1,544,268    1 year upfront     40.86 (9.2x)
Databricks            $1,822,817    1 year reserved    25.42 (5.7x)
Google BigQuery       $1,446,900    1 year reserved    38.2 (8.6x)
https://www.oracle.com/mysql/heatwave/performance/#heatwave-lakehouse

Slide 25

Slide 25 text

Load & Query performance comparison – best in the industry
MySQL HeatWave is faster to load & query data and still less expensive
500 TB TPCH           Annual Cost   Pricing Term       Load Time (hrs)        Query Time (sec)
HeatWave Lakehouse    $1,742,036    PAYG               4.43                   2,150
Snowflake             $2,300,160    Standard Edition   9.04 (2x slower)       39,040 (18x slower)
Redshift              $1,544,268    1 year upfront     40.86 (9.2x slower)    32,715 (15x slower)
Databricks            $1,822,817    1 year reserved    25.42 (5.7x slower)    37,729 (17x slower)
Google BigQuery       $1,446,900    1 year reserved    38.2 (8.6x slower)     76,180 (35x slower)
https://www.oracle.com/mysql/heatwave/performance/#heatwave-lakehouse
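The "x slower" multipliers in this comparison are each system's time divided by the HeatWave Lakehouse time. The Python sketch below recomputes them from the figures on the slide; the dictionary layout and the `slowdown` helper are illustrative names, not part of any Oracle tooling.

```python
# Load time (hours) and query time (seconds) for 500 TB TPCH, from the slide
results = {
    "HeatWave Lakehouse": {"load_hrs": 4.43, "query_sec": 2150},
    "Snowflake": {"load_hrs": 9.04, "query_sec": 39040},
    "Redshift": {"load_hrs": 40.86, "query_sec": 32715},
    "Databricks": {"load_hrs": 25.42, "query_sec": 37729},
    "Google BigQuery": {"load_hrs": 38.2, "query_sec": 76180},
}


def slowdown(metric, baseline="HeatWave Lakehouse"):
    """Ratio of each system's metric to the baseline's (higher means slower)."""
    base = results[baseline][metric]
    return {
        name: row[metric] / base
        for name, row in results.items()
        if name != baseline
    }


load_ratio = slowdown("load_hrs")
query_ratio = slowdown("query_sec")
for name in load_ratio:
    print(f"{name}: load {load_ratio[name]:.2f}x, query {query_ratio[name]:.2f}x")
```

The recomputed ratios line up with the slide's multipliers when those are read as truncated rather than rounded values (for example, Databricks query time comes out at roughly 17.5x and is shown as 17x).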

Slide 26

Slide 26 text

MySQL Autopilot boosts query performance of HeatWave Lakehouse
Optimizer learns and improves query plan based on previous queries executed
[Diagram: query plans for Query #1 (joins over A, B, C) and Query #2 (join and union over A, B, D), linked by Autopilot statistics]

Slide 27

Slide 27 text

Same query performance whether data is inside MySQL or in object store
Provides flexibility to develop applications on object store without any performance or cost impact
10 TB TPC-H query performance, query time (seconds):
HeatWave              14.2
HeatWave Lakehouse    14.2
Snowflake             47
Redshift              59
Google BigQuery       79
Databricks            105
Configuration: 10 HeatWave nodes; X-Large cluster for Snowflake; 10 nodes of ra3.4xlarge for Redshift; 800 slots for Google BigQuery; Large cluster for Databricks
*Benchmark queries are derived from the TPC-H benchmark, but results are not comparable to published TPC-H benchmark results since these do not comply with the TPC-H specifications.
https://www.oracle.com/mysql/heatwave/performance/#heatwave-lakehouse

Slide 28

Slide 28 text

Same price-performance whether data is inside MySQL or in object store
Provides flexibility to develop applications on object store without any performance or cost impact
10 TB TPC-H price-performance (cents):
HeatWave              1.5
HeatWave Lakehouse    1.5
Snowflake             41.9
Redshift              20.2
Google BigQuery       41.4
Databricks            92.5
• Configuration: 10 HeatWave nodes; X-Large cluster for Snowflake; 10 nodes of ra3.4xlarge for Redshift; 800 slots for Google BigQuery; Large cluster for Databricks
• Pricing: Standard Edition price for Snowflake; 3 year upfront price for Redshift; 1 year reserved price for Google BigQuery and Databricks
https://www.oracle.com/mysql/heatwave/performance/#heatwave-lakehouse
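The headline multipliers quoted earlier in the deck for 10 TB TPC-H (4.2x, 3.3x, 5.6x, 7.4x faster; 13x, 28x, 28x, 62x better price-performance) can be cross-checked against the chart values on the two 10 TB slides. The Python sketch below does that arithmetic; all numbers come from the slides, while the variable names and the rounding choices are assumptions of this sketch.

```python
# Chart values from the 10 TB TPC-H slides
query_time_sec = {"HeatWave": 14.2, "Redshift": 59, "Snowflake": 47,
                  "Google BigQuery": 79, "Databricks": 105}
price_perf_cents = {"HeatWave": 1.5, "Redshift": 20.2, "Snowflake": 41.9,
                    "Google BigQuery": 41.4, "Databricks": 92.5}


def multiples(values, baseline="HeatWave"):
    """Each competitor's value divided by the baseline value."""
    base = values[baseline]
    return {name: v / base for name, v in values.items() if name != baseline}


# Assumption: speed multipliers are quoted to one decimal, price ones to integers.
speed = {name: round(m, 1) for name, m in multiples(query_time_sec).items()}
price = {name: round(m) for name, m in multiples(price_perf_cents).items()}

for name in speed:
    print(f"{name}: {speed[name]}x slower, {price[name]}x worse price-performance")
```

Under those rounding assumptions the chart values reproduce the quoted multipliers, which suggests the headline figures and the charts come from the same measurements.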

Slide 29

Slide 29 text

HeatWave Lakehouse is integrated with MySQL Shell for VS Code

Slide 30

Slide 30 text

Summary
Functionality available with MySQL HeatWave in all OCI regions
1. Designed to process non-MySQL workloads
2. Best query performance and load performance for data warehouse
3. Query data in object store and OLTP data in MySQL database
4. Data in object store remains in object store
5. MySQL Autopilot for automating data management
6. HeatWave scales to 512 HeatWave nodes and 1/2 petabyte of data

Slide 31

Slide 31 text

For more information: https://www.oracle.com/heatwave/#lakehouse

Slide 32

Slide 32 text

Get $300 in credits and try free for 30 days
• Get started with MySQL HeatWave: oracle.com/mysql/free
• Learn more about MySQL HeatWave: oracle.com/mysql
• Request a guided workshop: ask your account manager

Slide 33

Slide 33 text

Follow us on Social Media
“Data is the Oxygen of Business”

Slide 34

Slide 34 text

Thank you! Q&A
Olivier Dasini, MySQL Cloud Principal Solutions Architect EMEA
[email protected]
Blogs: www.dasini.net/blog/en, www.dasini.net/blog/fr
LinkedIn: www.linkedin.com/in/olivier-dasini
Twitter: @freshdaz

Slide 36

Slide 36 text

“HeatWave Lakehouse scales out very well for loading data from object storage and for running queries on object store… This scale out characteristic of HeatWave Lakehouse for data management is key to efficiently process very large amounts of data.” Henry Tullis Leader, Cloud Infrastructure and Engineering Deloitte Consulting

Slide 37

Slide 37 text

What are industry analysts saying about MySQL HeatWave Lakehouse?
“MySQL HeatWave demonstrates that Lakehouse performance can be identical to transaction query performance—unheard of and even unthinkable.”
“For HeatWave Lakehouse to deliver record performance for both loading data and querying data is an unprecedented innovation in cloud data services.”
“The ability of HeatWave to load and query data on such a massive number of nodes in parallel is the first in the industry.”
“MySQL HeatWave Lakehouse is not your typical analytical database architecture, and its design engineering will continue to push the competitive market forward.”
“Data lakehouses are meant to bridge the gap between data warehouses and data lakes... MySQL HeatWave Lakehouse takes that a step further by making cloud object storage a first-class citizen.”
“Simply put: MySQL HeatWave Lakehouse enables you to stay ahead of the competition by taking swift action on meaningful business insights.”
“Organizations looking for the best value in the cloud data lakehouse landscape must seriously consider MySQL HeatWave Lakehouse.”
“MySQL HeatWave Lakehouse takes customers to a new level of capabilities”
“MySQL HeatWave Lakehouse can simplify the life of data management professionals and should improve the customer experience.”
“In the era of AI, the ability to process data is the absolute demarcation between companies that are going to get productivity and outcomes and those that won't…”
“The performance against the big names is pretty incredible you know…when you talk about a highly specialized accelerated workload, this is a tremendously powerful use case...”