OML usage highlight: Live Demo of Oracle Stream Analytics with OML AutoML UI and OML Services plus Data Mesh

In this weekly Office Hours session for Oracle Machine Learning on Autonomous Database, Hadi Javaherian, Senior AppDev and Integration Platform Specialist, explained the benefits of integrating Oracle Stream Analytics with Oracle Machine Learning Services. He described at a high level how this integration relates to the concept of Data Mesh, and showed how a user can create models with OML AutoML UI and deploy them in seconds to OML Services, where they become immediately available to Oracle Stream Analytics for real-time scoring.

The Oracle Machine Learning product family helps data scientists, analysts, developers, and IT achieve data science project goals faster while taking full advantage of the Oracle platform.

Oracle Machine Learning Notebooks offers an easy-to-use, interactive, multi-user, collaborative interface based on Apache Zeppelin notebook technology, and supports SQL, PL/SQL, Python and Markdown interpreters. It is available on all Autonomous Database versions and tiers, including the always-free editions.

OML includes AutoML, which automates algorithm selection, feature selection and model tuning, in addition to a specialized AutoML UI exclusive to the Autonomous Database.

OML Services is also included in Autonomous Database, where you can deploy and manage native in-database OML models as well as ONNX ML models (for classification and regression) built using third-party engines, and can also invoke cognitive text analytics.
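
As a rough illustration of the real-time scoring path described above, the sketch below builds (without sending) an HTTP request that asks a deployed model to score one record. The `/omlmod/v1/deployment/<model URI>/score` path and the `inputRecords` payload reflect the OML Services REST interface as best recalled here and should be checked against the official documentation; the host, model URI, token and field names are hypothetical.

```python
import json
import urllib.request


def build_score_request(base_url, model_uri, token, records):
    """Build (but do not send) a POST request that scores records with a deployed model."""
    # Assumed endpoint layout; verify against the OML Services REST documentation.
    url = f"{base_url.rstrip('/')}/omlmod/v1/deployment/{model_uri}/score"
    # Assumed payload shape: a list of records under "inputRecords".
    body = json.dumps({"inputRecords": records}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = build_score_request(
    "https://example-oml-host.invalid",  # hypothetical host
    "machine_failure",                   # hypothetical model URI
    "TOKEN",                             # placeholder access token
    [{"TEMPERATURE": 71.5, "VIBRATION": 0.02}],  # hypothetical input columns
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a live service.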

Marcos Arancibia

September 21, 2021

Transcript

  1. OML usage highlight:
    Live Demo of Oracle Stream Analytics with OML AutoML UI
    and OML Services
    OML AskTOM Office Hours
    Hadi Javaherian
    Solution Engineer, App Dev & Integration
    Supported by Marcos Arancibia, Mark Hornick
    and Sherry LaMonica
    Product Management, Oracle Machine Learning
    Move the Algorithms; Not the Data!
    Copyright © 2021, Oracle and/or its affiliates.
    This Session will
    be Recorded

  2. Topics for today
    • Upcoming Sessions
    • Live Demo of Oracle Stream Analytics with OML AutoML UI and OML Services
    • Q&A

  3. Upcoming Sessions
    • November 9 2021, 08:00 AM Pacific - OML Usage Highlight: Leveraging OML algorithms in Retail Science platform
    • November 2 2021, 08:00 AM Pacific - Weekly Office Hours: OML on Autonomous Database - Ask & Learn
    • October 12 2021, 08:00 AM Pacific - OML feature highlight: Time Series analysis with Oracle Machine Learning
    • October 5 2021, 08:00 AM Pacific - OML4Py: Using third-party Python packages from Python, SQL and REST
    • September 28 2021, 08:00 AM Pacific - Weekly Office Hours: OML on Autonomous Database - Ask & Learn (OML4Py updated Hands-on-Lab)

  4. GGSA and AutoML UI
    Low Code and No Code
    Data Mesh in Practice
    Hadi Javaherian
    Solution Engineer, App Dev & Integration
    September 2021

  5. Safe Harbor
    The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, timing, and pricing of any features or functionality described for Oracle’s products may change and remains at the sole discretion of Oracle Corporation.
    Statements in this presentation relating to Oracle’s future plans, expectations, beliefs, intentions and prospects are “forward-looking statements” and are subject to material risks and uncertainties. A detailed discussion of these factors and other risks that affect our business is contained in Oracle’s Securities and Exchange Commission (SEC) filings, including our most recent reports on Form 10-K and Form 10-Q under the heading “Risk Factors.” These filings are available on the SEC’s website or on Oracle’s website at http://www.oracle.com/investor. All information in this presentation is current as of September 2019 and Oracle undertakes no duty to update any statement in light of new information or future events.

  6. Agenda
    1. Data Product
    2. Data Mesh & Principles
    3. GoldenGate Stream Analytics
    4. Stream Analytics – AutoML UI Integration
    5. Demo
    6. Q/A

  7. Data Fabric | Stream Processing | Data Mesh
    Oracle is a Leader in this Space

  8. DATA CAPITAL

  9. Attributes of a Trusted Data Mesh
    • Value-focused, Data Product Thinking
    • Decentralized, Multi-Cloud Mesh
    • Enterprise Data Ledgers
    • Trusted, Polyglot Streams
    Upgrade legacy enterprise data architecture, monolithic integration tools and outmoded batch processes
    [Diagram labels: Business Needs, User Needs, Data Attributes, Data Product; events, transactions, Apps / IoT, stream processing, any pipes, ACID, trusted data]

  10. Data Mesh Attribute: DATA PRODUCTS
    (Source: Enterprise Data Mesh: Solutions, Use Cases and Case Studies, page 10)
    Products of any kind, from raw commodities to items at your local store are produced as assets of value, intended to be consumed and with a specific ‘job to be done.’
    Data products can take a variety of forms, depending on the business domain or problem to be solved, and may include:
    • Analytics – historic/real-time reports & dashboards
    • Data Sets – data collections in different shapes/formats
    • Models – domain objects, data models, ML features
    • Algorithms – ML models, scoring, business rules
    • Data Services & APIs – docs, payloads, topics, REST APIs…
    A data product is created for consumers, requiring tracking of additional attributes such as:
    • Stakeholder Map – who creates and consumes this product?
    • Packaging, Documentation – how is it consumed?
    • Purpose & Value – implicit/explicit value? depreciation?
    • Quality, Consistency – KPIs and SLAs of usage?
    • Provenance, Lifecycle & Governance – trust & explainability?
    [Diagram labels: Data Products, Data Assets, Business Data, Digital Noise; axis: higher value & more controls]

  11. A Decentralized Mesh
    [Diagram: a Data Mesh with a Self-Service GUI spanning Edge (Edge Gateways), Multi-Cloud and Enterprise environments. Node labels: Applications, Stream Analytics, Exadata Cloud@Customer, SaaS, Data Lake, IaaS, Analytics. Flow labels: capture, dist., ingest, load, replicat, join, filter, λ]

  12. Data Mesh Use Case: STREAMING DATA PIPELINES
    [Diagram labels: Data / Event Services, SQL Access, Notebooks / ML, Marts, Data Warehouse, Data Visualization, Raw Data Zone, Curated Data, Prepared Data, Master Data, DATA MESH]
    Once ingested into the analytic data stores, there is usually a need for ‘data pipelines’ to prepare and transform the data across different data stages or data zones. This is a process of data refinement often needed for the downstream analytic data products.
    A Data Mesh can provide an independently governed data pipeline layer that works with the analytic data stores, providing the following core services:
    • Self-service data discovery and data preparation
    • Governance of data resources across domains
    • Data transformation into required data product formats (e.g., streaming ETL)
    • Data verification, by policy, to assure consistency
    These data pipelines should be capable of working across different physical data stores (such as marts, warehouses, lakes, etc.) or as a “pushdown data stream” within analytic data platforms that support streaming data, such as Apache Spark and other data lake house technologies.
    Figure 1: a data mesh can create, execute and govern streaming pipelines within a Data Lake
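
Two of the core services listed on this slide, transformation into a required format (streaming ETL) and verification by policy, can be sketched in a few lines. This is a minimal, hypothetical illustration in plain Python, not how GoldenGate Stream Analytics implements them; the policy format, field names and the `streaming_etl` helper are all assumptions made for the example.

```python
def verify(event, policy):
    """Return a list of policy violations for one event (empty list means it passes)."""
    problems = []
    for field, expected_type in policy.items():
        if field not in event:
            problems.append(f"missing {field}")
        elif not isinstance(event[field], expected_type):
            problems.append(f"{field} is not {expected_type.__name__}")
    return problems


def streaming_etl(events, policy):
    """Lazily transform raw events one at a time and tag each as curated or rejected."""
    for raw in events:
        # Transform step: normalize field names to lower case.
        event = {key.lower(): value for key, value in raw.items()}
        problems = verify(event, policy)
        if problems:
            yield ("rejected", event, problems)
        else:
            yield ("curated", event, [])


# Hypothetical policy: required fields and their expected Python types.
POLICY = {"machine_id": str, "temperature": float}

raw_events = [
    {"MACHINE_ID": "m-1", "TEMPERATURE": 71.5},
    {"MACHINE_ID": "m-2", "TEMPERATURE": "hot"},
    {"TEMPERATURE": 60.0},
]
results = list(streaming_etl(raw_events, POLICY))
for status, event, problems in results:
    print(status, event, problems)
```

Because `streaming_etl` is a generator, each event is processed as it arrives rather than after a batch has accumulated, which is the property that distinguishes streaming ETL from the batch processes the deck contrasts it with.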

  13. GoldenGate Stream Analytics
    • Ingest Events: 100+ supported sources from OCI-GoldenGate, OCI-Streaming, Oracle Integration Cloud and Oracle IoT Cloud
    • Select Processing Patterns: Rich set of pre-built patterns will dramatically improve developer efficiency and time-to-value
    • Build Event Pipelines: Easily leverage geo-fencing, machine-learning, and other reference data within the stream
    • Serve Data Downstream: Data can be delivered out to Kafka, databases, or easily staged for external ETL jobs
    [Diagram labels: OCI Streaming; OCI Object storage, OCI Streaming Service, Oracle ATP, Oracle Autonomous DW, Oracle Database]

  14. Tangible, Trusted Data Products:
    • Curated Change Streams
    • No Code Data Pipelines
    • Production ML Scoring / Predictions
    • Real-time Dashboards / Alerting
    • Time Series & Spatial Analytics
    • Streaming Data Services
    [Diagram labels: DB or App Events, json, avro, AutoML]

  15. AutoML UI Experiment: OML Simplified
    Pipeline: DB Table → Auto Algorithm Selection → Adaptive Sampling → Auto Feature Selection → Auto Model Tuning → ML Model
    Auto Algorithm Selection
    • Identify in-DB algorithm with better Model Quality
    • Find best algorithm faster than exhaustive search
    Adaptive Sampling
    • Identify right sample size for training data
    • Adjust sample for unbalanced data
    Auto Feature Selection
    • De-noise data
    • Reduce features by identifying most predictive
    • Improve accuracy and performance
    Auto Model Tuning
    • Improves model accuracy
    • Automated tuning of hyperparameters
    • Avoid manual or exhaustive search techniques
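
To make one of the bullets on this slide concrete, "adjust sample for unbalanced data", the sketch below balances a training set by downsampling every class to the size of the rarest one. It is a generic technique shown purely for illustration; the slide does not say how OML AutoML's Adaptive Sampling works internally, and the `balance_by_downsampling` helper and the `failed` label are hypothetical.

```python
import random
from collections import defaultdict


def balance_by_downsampling(rows, label_key, seed=0):
    """Downsample every class to the size of the rarest class."""
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)
    target = min(len(group) for group in by_class.values())
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced


# Hypothetical data: 90 healthy machines and 10 failures.
rows = [{"id": i, "failed": 0} for i in range(90)]
rows += [{"id": i, "failed": 1} for i in range(90, 100)]

sample = balance_by_downsampling(rows, "failed")
print(len(sample), sum(row["failed"] for row in sample))
```

Downsampling discards majority-class rows, so it trades some information for a balanced label distribution; other approaches (class weights, oversampling) make a different trade.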

  16. GGSA and AutoML UI Integration
    [Diagram labels, grouped by pipeline]
    Training Pipeline: Initial Training Dataset, Enriched Training Dataset, Database, AutoML UI (Create Experiment), Generated ML Model, Deploy ML Model, OML Services
    Scoring Pipeline: Real-time Data Feed, Data Pre-Processing, Oracle Machine Learning Service, Post ML Scoring Processing Stages, Processed Data Stream to Target
    Other labels: Stream (between stages), Stream Send to Autonomous Target
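
The scoring pipeline in this diagram (data feed, pre-processing, ML scoring, post-scoring processing, target) can be sketched as a chain of Python generators. This is a hypothetical illustration of the stage ordering only: `stub_scorer` stands in for a call to a deployed model, and the field names and the 0.5 threshold are assumptions, not values from the demo.

```python
def pre_process(events):
    """Drop incomplete readings and normalize field types before scoring."""
    for event in events:
        if event.get("temperature") is None:
            continue
        yield {
            "machine_id": event["machine_id"],
            "temperature": float(event["temperature"]),
        }


def score(events, scorer):
    """Attach a failure probability to each event using the supplied scorer."""
    for event in events:
        yield {**event, "failure_probability": scorer(event)}


def post_process(events, threshold=0.5):
    """Turn the raw score into an action flag for the downstream target."""
    for event in events:
        flagged = event["failure_probability"] >= threshold
        yield {**event, "needs_maintenance": flagged}


def stub_scorer(event):
    """Hypothetical stand-in for a request to a deployed model."""
    return 0.9 if event["temperature"] > 80.0 else 0.1


feed = [
    {"machine_id": "m-1", "temperature": "85.2"},
    {"machine_id": "m-2", "temperature": None},
    {"machine_id": "m-3", "temperature": 70},
]
scored = list(post_process(score(pre_process(feed), stub_scorer)))
for event in scored:
    print(event)
```

Keeping the scorer as a parameter mirrors the separation in the diagram: the model is trained and deployed by one pipeline, and the scoring pipeline only needs a way to call it.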

  17. Data Mesh with Stream Analytics and AutoML UI
    [Diagram labels: Edge, Object Storage, Data Mesh Self-Service GUI, Upload, ingest, load, dist., capture, replicat, join, Data Lake, Analytics, Maintenance, AutoML UI, Enriched Data, Training Pipeline, Scoring Pipeline, Maintenance Record, Machine Details, Events, Maintenance Data, Send Maintenance Today, Data Ledger, Data Consumer]
    Technologies listed:
    • Apache Druid
    • Apache HDFS
    • Apache Hive
    • Apache Ignite
    • Apache Kafka
    • AWS S3
    • Azure Data Lake Gen2
    • Block Storage
    • Confluent Kafka
    • Elasticsearch
    • MongoDB
    • OCI Object storage
    • OCI Streaming Service
    • Oracle ATP
    • Oracle Autonomous DW
    • Oracle Database

  18. DEMO

  19. Q & A

  20. Thank You
    Questions…

  21. Case Study: SAILGP – STREAMING ANALYTICS
    SailGP runs one of the most exciting race venues in the world, with high tech and high-speed sail boats. Live race data and analytics are provided within milliseconds using data mesh tech.
    Distributed edge technology links race boat, support boat and race helicopter data into streaming pipelines.
    Telemetry data is streamed into nearby clouds for real-time ETL, analytics and ingest to cloud data warehouse.
    Data mesh tech uses GoldenGate and Kafka (Oracle Streaming).
    Stream analytics are used in real-time on race day to assist with support crews and broadcast networks.
    GoldenGate Use Case: stream analytics, real-time event correlation, ETL, analysis and ingest to DW
    People, Process and Methods:
    • Data Product Focus: ✓
    Technical Architecture Attributes:
    • Distributed Architecture: ✓
    • Event Driven Ledgers: ✓
    • ACID Support: ingest to DW
    • Stream Oriented: ✓
    • Analytic Data Focus: ✓
    • Operational Data Focus: ✓
    • Physical & Logical Mesh: physical data

  22. Data Mesh: CASE STUDY CRITERIA
    [Comparison table. Columns: Data Mesh, Data Integration, Meta-Catalog, Microservices, Messaging, Data Lake, Distributed DW. Rows under "People, Process and Methods": Data Product Focus. Rows under "Technical Architecture Attributes": Distributed Architecture, Event Driven Ledgers, ACID Support, Stream Oriented, Analytic Data Focus, Operational Data Focus, Physical & Logical Mesh. The rating symbols in the cells did not survive text extraction.]
    There is no single ‘perfect’ example of a Data Mesh.
    Other software development and data architecture patterns, or technology categories exist and there remains substantial overlap among the most common concepts like Data Fabrics, Microservices Service Mesh, and Data Lake Houses.
    For this document, we are considering Data Mesh as a type of Data Fabric. Case Studies should have ‘significant solution focus’ using technology with the following attributes:
    • Data Products – driving cultural and process changes that affect cross-organizational data domains, and institutionalize strong management practices around data assets
    • Distributed Architecture – decentralized, microservices-based software architecture patterns
    • Event Driven Ledgers – durable running log of events to drive cross-domain integrations
    • ACID Support – for polyglot streams, empowering correct and trusted data transactions
    • Stream-Oriented – data processing on ‘data in motion’ to drive solution outcomes
    • Analytic Data Focus – data pipelines or data products in the analytics domain (e.g., OLAP)
    • Operational Data Focus – solution focus on operational data outcomes (e.g., OLTP)
    • Physical & Logical Mesh – data is both physically and logically ‘meshed’ together
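
The "Event Driven Ledgers" attribute, a running log of events that several domains read independently, can be illustrated with a toy append-only log in which each consumer keeps its own read offset. This is a hypothetical, in-memory sketch: it is not durable, and it is not how GoldenGate or Kafka implement their logs; the `EventLedger` class and the event fields are invented for the example.

```python
class EventLedger:
    """Append-only in-memory log; each consumer tracks its own read position."""

    def __init__(self):
        self._events = []
        self._offsets = {}

    def append(self, event):
        """Record an event and return its position in the log."""
        self._events.append(event)
        return len(self._events) - 1

    def poll(self, consumer):
        """Return events this consumer has not yet seen and advance its offset."""
        start = self._offsets.get(consumer, 0)
        batch = self._events[start:]
        self._offsets[consumer] = len(self._events)
        return batch


ledger = EventLedger()
ledger.append({"type": "order_created", "id": 1})
ledger.append({"type": "order_paid", "id": 1})

billing_first = ledger.poll("billing")      # sees both events so far
ledger.append({"type": "order_shipped", "id": 1})
billing_second = ledger.poll("billing")     # sees only the new event
analytics_all = ledger.poll("analytics")    # a new consumer replays everything
print(len(billing_first), len(billing_second), len(analytics_all))
```

Events are never updated or removed, so a consumer added later can still replay the full history, which is what lets one log drive integrations across domains.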

  23. Transaction Outbox for the Whole Enterprise
    [Diagram: Replication of Real-time Ledger (Transactions, Events) through GoldenGate Stream Analytics (ETL & ML). Other labels: Source of Truth, DB2/z, Apps, DBMS, Cloud, Relational, Non-Relational, Object Storage, Big Data, NoSQL, Streams, Data Products]

  24. Trusted Data Mesh: Outcomes & Benefits
    Agile, DataOps and CI/CD for Data
    • Mesh comes to data services
    Distributed (Big) Data Lake
    • Fast data from anywhere
    Polyglot Data Streams
    • Trusted, consistent data
    • All payloads and serializations
    Uninterrupted Continuity
    • >99.999% up-time SLAs
    Multi-Cloud Data Liquidity
    • Data capital is fast data
    LoB Powered Data Products
    • Modern self-service data fabric
    Edge, Location-based Data Services
    • Correlate IRL device events
    Predictive Analytics
    • Data monetization, new ‘data services’ for sale
    Benefits
    • Data-Driven Business Transformation
    • IT Modernization
    • Reduce Costs for Data Operations
    • Faster Business Innovation Cycles

  25. What Next?
    Ask Oracle for Tech Paper or a demo!
    • Oracle #1 in Data Fabric Strategy: https://blogs.oracle.com/dataintegration/oracle_forresterwave_datafabric_2020?xd_co_f=66bcf41f-e285-4ccc-a5b5-1c790cab0db0
    • GoldenGate YouTube | Data Mesh: https://www.youtube.com/playlist?list=PLbqmhpwYrlZJ-583p3KQGDAd6038i1ywe
    • Free Trial of GoldenGate Streaming: https://cloudmarketplace.oracle.com/marketplace/en_US/listing/70961838
    • Customer Success
