Upgrade to Pro — share decks privately, control downloads, hide ads and more …

MySQL HeatWave Lakehouse short overview

MySQL HeatWave Lakehouse short overview

Big Data & AI Paris 2023

Unlock the power of Big Data analytics with MySQL HeatWave Lakehouse!

Olivier DASINI

October 03, 2023
Tweet

More Decks by Olivier DASINI

Other Decks in Technology

Transcript

  1. MySQL HeatWave Lakehouse
    Olivier Dasini
    MySQL Cloud Principal Solutions Architect EMEA
    [email protected]
    Blogs : www.dasini.net/blog/en
    : www.dasini.net/blog/fr
    Linkedin: www.linkedin.com/in/olivier-dasini
    Déverrouillez le pouvoir de l'analyse Big Data avec MySQL HeatWave Lakehouse !
    Big Data & AI Paris - September 2023

    View full-size slide

  2. Copyright © 2023, Oracle and/or its affiliates. All rights reserved.
    Agenda
    1. MySQL HeatWave Overview
    2. MySQL HeatWave Lakehouse
    3. Demo
    4. Summary
    2

    View full-size slide

  3. Copyright © 2023, Oracle and/or its affiliates. All rights reserved.
    3
    MySQL HeatWave
    Overview

    View full-size slide

  4. Copyright © 2023, Oracle and/or its affiliates. All rights reserved.
    4
    … the MySQL Cloud services made by the MySQL Team
    The MySQL HeatWave “Family”
    “MySQL HeatWave”
    MySQL HeatWave for OLTP
    a.k.a MDS: (OLTP)
    General Purpose
    Optimized for OLTP
    MySQL HeatWave Lakehouse
    (Lakehouse)
    MySQL HeatWave
    (Analytics / ML)
    OLTP + Analytics (OLAP)
    + Machine Learning
    InnoDB
    Lakehouse
    InnoDB RAPID InnoDB RAPID

    View full-size slide

  5. Copyright © 2023, Oracle and/or its affiliates. All rights reserved.
    5
    MySQL HeatWave - OLTP + OLAP + ML
    Example : Loan Approval

    View full-size slide

  6. Copyright © 2023, Oracle and/or its affiliates. All rights reserved.
    6
    MySQL HeatWave Lakehouse
    Process and query hundreds of terabytes of data
    in the object store

    View full-size slide

  7. Copyright © 2023, Oracle and/or its affiliates. All rights reserved.
    7
    MySQL HeatWave Lakehouse
    • Query data across MySQL, the
    object store, or both—using
    standard MySQL commands
    • Up to 400 TB of data—the HeatWave
    cluster scales to 512 nodes
    • Querying the database is as fast as
    querying the object store
    NEW
    Process and query 100’s of TB of data in the object store — in a variety of file formats

    View full-size slide

  8. Copyright © 2023, Oracle and/or its affiliates. All rights reserved.
    8
    HeatWave Lakehouse
    Query data in object storage
    • Querying in HeatWave
    • Scale to 512 nodes, 512 TB
    • CSV, Parquet, Aurora &
    Redshift exports
    • Fastest load,
    Best price-performance
    Use standard MySQL syntax Combine OLTP data with
    object store data
    • 100% compatible with
    MySQL syntax
    • Use MySQL Autopilot to
    auto-infer schema, estimate
    capacity, load times, and
    generate load scripts
    • Treat data lake data as tables
    • Use select, join, aggregations,
    filters, etc… to combine data
    in OLTP tables with data lake
    tables
    Main benefits
    1 2 3

    View full-size slide

  9. Copyright © 2023, Oracle and/or its affiliates. All rights reserved.
    9
    InnoDB
    mysql> CREATE TABLE Sensor (date DATE, degree INT) ENGINE=DATALAKE
    ENGINE_ATTRIBUTE = ‘{“dialect”:{“format”:”csv”}, “file”:
    [{“prefix”:”temp_sensor_1.csv”}]}’ SECONDARY_ENGINE=RAPID;
    mysql> ALTER TABLE Sensor SECONDARY_LOAD;
    mysql> SELECT count(*) FROM Sensor, SALES WHERE Sensor.degrees >30
    and Sensor.date=SALES.date;
    Object Store
    Devices Social Voice
    Sensors Events
    Process and query 100’s of TB of data in the object store — in a variety of file formats
    MySQL HeatWave Lakehouse

    View full-size slide

  10. Copyright © 2023, Oracle and/or its affiliates. All rights reserved.
    10
    MySQL HeatWave can process data from multiple data sources
    e.g. Oracle Golden
    Gate, ...
    OCI Object Storage
    AWS Aurora
    AWS Redshift
    Data can be in a file or other databases → No requirement to have data in MySQL
    https://www.mysql.com/products/mysqlheatwave/lakehouse

    View full-size slide

  11. Copyright © 2023, Oracle and/or its affiliates. All rights reserved.
    11
    Load performance
    MySQL HeatWave is faster to load data and less expensive
    500 TB TPCH
    HeatWave
    Lakehouse Snowflake Redshift Databricks
    Google
    BigQuery
    Annual Cost $1,742,036 $2,300,160 $1,544,268 $1,822,817 $1,446,900
    Pricing Term PAYG Standard Edition 1 year upfront 1 year reserved 1 year reserved
    Load Time
    (hrs) 4.43
    9.04
    (2x)
    40.86
    (9.2x)
    25.42
    (5.7x)
    38.2
    (8.6x)
    https://www.oracle.com/mysql/heatwave/performance/#heatwave-lakehouse

    View full-size slide

  12. Copyright © 2023, Oracle and/or its affiliates. All rights reserved.
    12
    Load & Query performance comparison – best in the industry
    500 TB TPCH
    HeatWave
    Lakehouse Snowflake Redshift Databricks
    Google
    BigQuery
    Annual Cost $1,742,036 $2,300,160 $1,544,268 $1,822,817 $1,446,900
    Pricing Term PAYG Standard Edition 1 year upfront 1 year reserved 1 year reserved
    Load Time (hrs) 4.43
    9.04
    (2x slower)
    40.86
    (9.2x slower)
    25.42
    (5.7x slower)
    38.2
    (8.6x slower)
    Query Time 2,150 sec
    39,040 sec
    (18x slower)
    32,715 sec
    (15x slower)
    37,729 sec
    (17x slower)
    76,180 sec
    (35x slower)
    MySQL HeatWave is faster to load & query data and still less expensive
    https://www.oracle.com/mysql/heatwave/performance/#heatwave-lakehouse

    View full-size slide

  13. Copyright © 2023, Oracle and/or its affiliates. All rights reserved.
    13
    Demo
    xxxxxxxx

    View full-size slide

  14. Copyright © 2023, Oracle and/or its affiliates. All rights reserved.
    14
    Fully compatible MySQL syntax generated by Autopilot, no human required
    Three simple steps to query data in object store
    1. Run MySQL Autopilot on data in object store
    mysql> CALL sys.heatwave_load(,);
    OUTPUT: DDLs automatically generated
    2. Execute DDLs generated by Autopilot
    mysql> CREATE TABLE `cust1DB`.`Sensor` (date DATE, degree INT)
    -> ENGINE=Lakehouse SECONDARY_ENGINE=Rapid
    -> ENGINE_ATTRIBUTE = ‘{“file”:[{“prefix”:”sensor1-April”, “par”:””}]}’;
    mysql> ALTER TABLE `cust1DB`.`Sensor` SECONDARY_LOAD;
    3. Query across file and table
    mysql> SELECT count(*) FROM Sensor, SALES WHERE Sensor.degrees > 30 AND Sensor.date = SALES.date;

    View full-size slide

  15. Copyright © 2023, Oracle and/or its affiliates. All rights reserved.
    15
    Summary
    Modernize MySQL & non-MySQL workloads with
    MySQL HeatWave Lakehouse

    View full-size slide

  16. Copyright © 2023, Oracle and/or its affiliates. All rights reserved.
    16
    1. Designed to process non-MySQL workloads
    2. Best query performance and load performance for data warehouse
    3. Query data in object store and OLTP data in MySQL database
    4. Data in object store remains in object store
    5. MySQL Autopilot for automating data management
    6. HeatWave scales to 512 HeatWave nodes and 1/2 Petabyte data
    Process and query hundreds of terabytes of data in the object store
    MySQL HeatWave Lakehouse

    View full-size slide

  17. Get $300 in credits and try free for 30 days
    Get started with
    MySQL HeatWave
    oracle.com/mysql/free
    Learn more about MySQL HeatWave
    oracle.com/mysql
    Request a guided workshop
    Ask your account manager
    17 Copyright © 2023, Oracle and/or its affiliates

    View full-size slide

  18. Follow us on Social Media
    “Data is the Oxygen of Business”
    18 Copyright © 2023, Oracle and/or its affiliates

    View full-size slide

  19. Merci!
    Q&R
    Olivier Dasini
    MySQL Cloud Principal Solutions Architect EMEA
    [email protected]
    Blogs : www.dasini.net/blog/en
    : www.dasini.net/blog/fr
    Linkedin: www.linkedin.com/in/olivier-dasini
    Twitter : @freshdaz

    View full-size slide

  20. Copyright © 2023, Oracle and/or its affiliates. All rights reserved.
    21
    “Oracle announced MySQL HeatWave with Autopilot last August, which may very well have been the
    single greatest innovation in open source cloud databases in the last 20 years to that point. Now Oracle
    has gone beyond its original unifying of OLTP and OLAP in HeatWave, with MySQL HeatWave ML. Oracle
    is bringing all of the machine learning processing and models inside the database, so that customers not
    only avoid managing ML databases apart from the core database, but also eliminate the hassles of ETL,
    gaining speed, accuracy, and cost-effectiveness in the bargain.”
    “This latest announcement from Oracle is the third major release of MySQL HeatWave in just over 12
    months. Oracle has delivered more cloud database innovations during that timeframe than most cloud
    database vendors have delivered in the last decade. Not only does the in-database HeatWave ML make
    Redshift ML look like yesterday’s tech in terms of engineering, performance and cost, but the latest
    MySQL HeatWave TPC-DS benchmarks demonstrate that Amazon Redshift, Snowflake, Azure Synapse
    and Google BigQuery are all slower and more expensive. It’s rather clear who’s innovating in cloud
    databases and who’s being complacent.”
    Feedback from analysts

    View full-size slide

  21. “HeatWave Lakehouse scales out very well for loading data
    from object storage and for running queries on object store…
    This scale out characteristic of HeatWave Lakehouse for data
    management is key to efficiently process very large amounts of
    data.”
    Henry Tullis
    Leader, Cloud Infrastructure and Engineering
    Deloitte Consulting

    View full-size slide

  22. What are industry analysts saying about MySQL HeatWave
    Lakehouse?
    “MySQL HeatWave demonstrates that
    Lakehouse performance can be
    identical to transaction query
    performance—unheard of and even
    unthinkable.”
    “For HeatWave Lakehouse to deliver record
    performance for both loading data and
    querying data is an unprecedented innovation
    in cloud data services.”
    “The ability of HeatWave to load and query data on such
    a massive number of nodes in parallel is the first in the
    industry.”
    “MySQL HeatWave Lakehouse is not your
    typical analytical database architecture, and
    its design engineering will continue to
    push the competitive market forward.”
    “Data lakehouses are meant to bridge the gap between
    data warehouses and data lakes... MySQL HeatWave
    Lakehouse takes that a step further by making cloud
    object storage a first-class citizen.”
    “Simply put: MySQL HeatWave Lakehouse enables you
    to stay ahead of the competition by taking swift action
    on meaningful business insights.”
    “Organizations looking for the best value in the cloud
    data lakehouse landscape must seriously consider
    MySQL HeatWave Lakehouse.”
    “MySQL HeatWave Lakehouse takes customers
    to a new level of capabilities”
    “MySQL HeatWave Lakehouse can simplify
    the life of data management professionals
    and should improve the customer experience.”
    “In the era of
    AI, the ability
    to process
    data is the
    absolute
    demarcation
    between
    companies
    that are going
    to get
    productivity
    and outcomes
    and those that
    won't…”
    “The performance against the big names is pretty
    incredible you know…when you talk about a highly
    specialized accelerated workload, this is a
    tremendously powerful use case...”

    View full-size slide