PM Update: What's New on Oracle Machine Learning on Autonomous

In this weekly Office Hours session for Oracle Machine Learning on Autonomous Database, the OML team gave an update on new OML features: OML Notebooks updates, an updated structure for OML and REST URLs, OML Services support for ONNX clustering models, Italian language support in OML Services cognitive text, and new filters for the modelType and shared attributes. They also showed a new ESA Wiki model for Database 19c that can be used on premises and in the cloud.

The Oracle Machine Learning product family helps data scientists, analysts, developers, and IT achieve data science project goals faster while taking full advantage of the Oracle platform.

Oracle Machine Learning Notebooks offers an easy-to-use, interactive, multi-user, collaborative interface based on Apache Zeppelin notebook technology, and supports SQL, PL/SQL, Python and Markdown interpreters. It is available on all Autonomous Database versions and tiers, including the always-free editions.

OML includes AutoML, which automates algorithm selection, feature selection and model tuning, as well as a specialized AutoML UI exclusive to the Autonomous Database.

OML Services is also included in Autonomous Database, where you can deploy and manage native in-database OML models as well as ONNX ML models (for classification and regression) built using third-party engines, and can also invoke cognitive text analytics.

Marcos Arancibia

December 07, 2021

Transcript

  1. OML Product Management Update:
    What’s New in Oracle Machine Learning
    OML AskTOM Office Hours
    Marcos Arancibia and Sherry LaMonica
    Product Management, Oracle Machine Learning
    Move the Algorithms; Not the Data!
    Copyright © 2021, Oracle and/or its affiliates.
    This Session will be Recorded

  2. What’s New in Oracle Machine Learning?
    • OML Notebook updates
    • Updated structure for OML and REST URLs
    • OML Services
      – Clustering for ONNX models
      – Cognitive Text: New support for Italian language
      – New filters for modelType and shared attributes
    • New ESA Wiki model for Database 19c

  3. OML Notebooks

  4. OML Notebooks
    New additions and updates
    • Upgraded to Zeppelin 0.9
    • Import Jupyter notebooks (*.ipynb)
    • 79 template example notebooks
    • 40 OML4Py notebooks
    • 39 OML4SQL notebooks (4 are 21c-only)
    • 15 new notebooks
    • 3 OML XGBoost + MSET templates for 21c
    • Data loading from GitHub mechanism highlighted (OML Run-me-first)
    • Included details of available algorithm settings options in applicable notebooks
    • Added demos to OML4Py notebooks that use OML4SQL to score data and display prediction details

  5. OML and RESTful URLs

  6. OML4Py and OML Services REST APIs
    New URL structure
    • Base URL now includes tenancy ID and database name, and works for OML Notebooks too
    • Token request no longer needs the Tenancy OCID or Database name in the path
    • Root domain is now oraclecloudapps.com
    CURRENT
    https://adb.us-sanjose-1.oraclecloud.com/tenant/ocid1.tenancy.oc1..aaaaa…/database/omldb
    – Datacenter region: us-sanjose-1
    – Root domain: oraclecloud.com
    – Tenancy OCID: ocid1.tenancy.oc1..aaaaa…
    – Database name: omldb
    – A section of this URL is required for Token acquisition
    NEW
    https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com
    – Tenancy ID (not OCID nor name): qtraya2braestch
    – Database name: omldb
    – Datacenter region: us-sanjose-1
    – Root domain: oraclecloudapps.com
    – Same for Token acquisition
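
The new URL structure can be sketched in Python as follows. This is only an illustration of the pattern shown on the slide: the function names are hypothetical, lower-casing the database name is an assumption based on the slide's example, and splitting on the first hyphen assumes the tenancy ID itself contains no hyphen.

```python
from urllib.parse import urlsplit

ROOT_DOMAIN = "oraclecloudapps.com"  # new root domain named on the slide


def oml_base_url(tenancy_id: str, database: str, region: str) -> str:
    """Assemble a new-style base URL: https://<tenancy id>-<database>.adb.<region>.<root domain>."""
    # Assumption: the database name appears in lower case, as in the slide's example.
    return f"https://{tenancy_id}-{database.lower()}.adb.{region}.{ROOT_DOMAIN}"


def split_oml_base_url(url: str) -> dict:
    """Recover tenancy ID, database name and region from a new-style URL."""
    host = urlsplit(url).hostname or ""
    prefix, sep, rest = host.partition(".adb.")
    suffix = "." + ROOT_DOMAIN
    if not sep or not rest.endswith(suffix):
        raise ValueError(f"not a new-style OML URL: {url!r}")
    # Assumption: the tenancy ID has no hyphen, so the first hyphen separates it
    # from the database name.
    tenancy_id, _, database = prefix.partition("-")
    return {
        "tenancy_id": tenancy_id,
        "database": database,
        "region": rest[: -len(suffix)],
    }


if __name__ == "__main__":
    print(oml_base_url("qtraya2braestch", "OMLDB", "us-sanjose-1"))
```

The same base URL then serves as the prefix for the /oml, /omlusers, /omlmod and /ords paths.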

  7. REST API Authentication
    Standard call for all OML REST API token endpoints
    New style URL
    omlserver/omlusers/api/oauth2/v1/token
    • omlserver = OML cloud service location URL for Autonomous Database, for example:
      https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com
    Old style URL
    omlserver/omlusers/tenants/tenant/databases/database/api/oauth2/v1/token
    • omlserver = OML cloud service location URL for Autonomous Database, for example:
      https://adb.us-sanjose-1.oraclecloud.com
    • tenant = Oracle Autonomous Database Tenancy OCID, in the form of:
      OCID1.TENANCY.OC1..AAAAAAAAFCUE4……
    • database = Oracle Autonomous Database database name, for example: OMLDB
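
To make the difference between the two token endpoints concrete, here is a minimal Python sketch that assembles both. The function names are hypothetical; only the path templates come from the slide.

```python
def new_style_token_url(omlserver: str) -> str:
    """New style: omlserver/omlusers/api/oauth2/v1/token."""
    return f"{omlserver.rstrip('/')}/omlusers/api/oauth2/v1/token"


def old_style_token_url(omlserver: str, tenant: str, database: str) -> str:
    """Old style: the tenancy OCID and database name are part of the path."""
    return (
        f"{omlserver.rstrip('/')}/omlusers/tenants/{tenant}"
        f"/databases/{database}/api/oauth2/v1/token"
    )


if __name__ == "__main__":
    print(new_style_token_url(
        "https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com"
    ))
```

With the new style, the tenancy and database are already encoded in the host name, so the path is the same for every instance.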

  8. Oracle Machine Learning RESTful URLs
    Where can I find the URLs that correspond to my tenancy?
    Location of REST URLs
    From your Oracle Autonomous Database instance:
    1. Click Service Console
    2. Click Development
    3. Scroll down to Oracle Machine Learning RESTful Services and copy the URL
    https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/omlusers/
    https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/oml/
    https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/omlmod/
    https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/ords/

  9. Token acquisition
    Initial call to get a token and be able to access OML REST endpoints
    To request a token for accessing OML REST API endpoints, you need a valid username and password for your Oracle Autonomous Database, with the proper OML Developer grants from the OML Administrator.
    For the following REST call, we will consider:
    omlserver=https://<tenancy id>-<database>.adb.<region>.oraclecloudapps.com
    $ curl -X POST \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    -d '{"grant_type":"password", "username": "YourOMLuser", "password": "YourOMLpass"}' \
    "${omlserver}/omlusers/api/oauth2/v1/token"
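
The same token request can be sketched with Python's standard library. The helper names are hypothetical, and the code returns the parsed JSON response as-is because the slide does not show the response fields.

```python
import json
import urllib.request


def build_token_request(omlserver: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the new-style token endpoint."""
    body = json.dumps(
        {"grant_type": "password", "username": username, "password": password}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{omlserver.rstrip('/')}/omlusers/api/oauth2/v1/token",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def fetch_token_response(request: urllib.request.Request) -> dict:
    """Send the request and return the decoded JSON body (field names not assumed here)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder values; replace with your own OML URL and credentials before sending.
    req = build_token_request("https://example.invalid", "YourOMLuser", "YourOMLpass")
    print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes it easy to inspect the URL and payload before any credentials leave the machine.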

  10. Send a Request – OML Services
    Open API description
    Call to get the Open API description for the current OML Services
    To review the Open API specification for the OML Services REST endpoints, you need to pass a valid token.
    For the following REST call, we will consider:
    OML_URL="${omlserver}/omlmod", and remember to provide the full token after "Bearer"
    $ curl --location --request GET "${OML_URL}/v1/api" \
    --header 'Authorization: Bearer eyJhbGciOiJSUzI1NiJ9.....=='
    The string after "Bearer" is the token.

  11. Send a Request – OML4Py
    Open API description
    Call to get the Open API description for the current OML4Py REST services
    To review the Open API specification for the OML4Py REST endpoints, you need to pass a valid token.
    For the following REST call, we will consider:
    OML_URL="${omlserver}/oml", and remember to provide the full token after "Bearer"
    $ curl --location --request GET "${OML_URL}/api/py-scripts/v1" \
    --header 'Authorization: Bearer eyJhbGciOiJSUzI1NiJ9.....=='
    The string after "Bearer" is the token.
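
Both Open API description calls follow the same pattern: a GET request with the token in an Authorization header. A minimal Python sketch, with hypothetical helper names and the two paths taken from the slides:

```python
import urllib.request


def bearer_get(url: str, token: str) -> urllib.request.Request:
    """Build a GET request that carries the token as a Bearer credential."""
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}, method="GET"
    )


def oml_services_spec_request(omlserver: str, token: str) -> urllib.request.Request:
    """Open API description for OML Services: omlserver/omlmod/v1/api."""
    return bearer_get(f"{omlserver.rstrip('/')}/omlmod/v1/api", token)


def oml4py_spec_request(omlserver: str, token: str) -> urllib.request.Request:
    """Open API description for the OML4Py REST services: omlserver/oml/api/py-scripts/v1."""
    return bearer_get(f"{omlserver.rstrip('/')}/oml/api/py-scripts/v1", token)


if __name__ == "__main__":
    # Placeholder server and token; pass the request to urllib.request.urlopen to send it.
    req = oml_services_spec_request("https://example.invalid", "your-token-here")
    print(req.full_url)
```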

  12. Oracle Machine Learning
    Link to access OML in Autonomous Database
    Original link used today to access the OML User Interface from a bookmark:
    https://adb.us-sanjose-1.oraclecloud.com/omlusers/login.html
    Plus the necessary options:
    ?tenant=OCID1.TENANCY.OC1..AAAAA....&database=OMLDB&redirect_uri=https://adb.us-sanjose-1.oraclecloud.com/omlusers/api/oauth2/v1/login
    New style URL to access the OML User Interface:
    https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/oml
    New style URL to access OML User Interface user administration:
    https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/omlusers

  13. OML Services

  14. OML Services
    ONNX Clustering is now supported
    Example of clustering two attributes from the Breast Cancer dataset
    Visualization of the original data with the target in different colors

  15. OML Services
    ONNX Clustering is now supported
    Create the clustering model using only two input attributes for this small test, and show the predictions
    Show the cluster centroids as an example
    Export the scikit-learn cluster model to ONNX format, into a Zip file for OML Services (which includes the .onnx file and the metadata.json)
    cURL example of the OML Services scoring
    $ curl -L -X POST 'https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/omlmod/v1/deployment/SKLearn_kmeans_BC/score' \
    -H 'Content-Type: application/json' -H 'Authorization: Bearer eyJ………iOiJ' \
    -d '{"inputRecords":[{ "X": [[10.38, 17.77]]}] }' | jq
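
The scoring call can also be prepared from Python. The sketch below mirrors the payload shape of the cURL example; the function name is hypothetical, and the input name "X" is copied from the slide and presumably depends on how the ONNX model was exported.

```python
import json
import urllib.request
from urllib.parse import quote


def build_score_request(omlserver: str, deployment: str, token: str, rows: list) -> urllib.request.Request:
    """Build (but do not send) a scoring request for a deployed ONNX model."""
    # Payload shape copied from the slide: {"inputRecords": [{"X": [[...], ...]}]}
    payload = {"inputRecords": [{"X": rows}]}
    url = f"{omlserver.rstrip('/')}/omlmod/v1/deployment/{quote(deployment, safe='')}/score"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder server and token; the deployment name and record come from the slide.
    req = build_score_request(
        "https://example.invalid", "SKLearn_kmeans_BC", "your-token-here", [[10.38, 17.77]]
    )
    print(req.full_url)
    print(req.data.decode("utf-8"))
```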

  16. OML Services
    Cognitive text capability for Italian Language
    Example: Topic Discovery
    $ curl -X POST "${omlserver}/omlmod/v1/cognitive-text/topics" \
    --header 'Content-Type: application/json' \
    --header "Authorization: Bearer ${token}" \
    --data '{
    "topN":5,
    "language": "ITALIAN",
    "textList":["Con Oracle Machine Learning, Oracle sposta gli algoritmi sui dati. Oracle esegue …… l'\''automazione richieste dai progetti di data science su scala aziendale, sia on-premise che nel cloud."]
    }'
    Returns most relevant topics and weights:
    "topicResults": [
    {"topic": "Oracle Corporation", "weight": 0.23331640964885378},
    {"topic": "Oracle Database", "weight": 0.20443284083978977},
    {"topic": "Big data", "weight": 0.16381463223223036},
    {"topic": "Base di conoscenza", "weight": 0.13233125000617454},
    {"topic": "Apprendimento automatico", "weight": 0.13091866812720565}
    ]
    Blog: OML Services Cognitive Text – Italian Language
    https://blogs.oracle.com/machinelearning/post/oml-services-cognitive-text---italian-language-now-available
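
Building the topics payload in code avoids shell quoting problems with apostrophes in the Italian text. The following Python sketch uses hypothetical helper names; the payload keys are those shown on the slide, and the parser assumes the response object contains the "topicResults" list exactly as in the excerpt, since the enclosing structure is not shown.

```python
import json


def build_topics_payload(texts, language: str = "ITALIAN", top_n: int = 5) -> str:
    """Serialize the request body for the cognitive-text topics endpoint."""
    return json.dumps(
        {"topN": top_n, "language": language, "textList": list(texts)},
        ensure_ascii=False,  # keep accented characters readable
    )


def ranked_topics(response: dict, min_weight: float = 0.0) -> list:
    """Return (topic, weight) pairs sorted by descending weight."""
    # Assumption: "topicResults" sits at the top level of the object passed in.
    results = response.get("topicResults", [])
    ranked = sorted(results, key=lambda item: item["weight"], reverse=True)
    return [(item["topic"], item["weight"]) for item in ranked if item["weight"] >= min_weight]


if __name__ == "__main__":
    print(build_topics_payload(["Oracle sposta gli algoritmi sui dati."]))
    sample = {
        "topicResults": [
            {"topic": "Big data", "weight": 0.16381463223223036},
            {"topic": "Oracle Corporation", "weight": 0.23331640964885378},
        ]
    }
    print(ranked_topics(sample, min_weight=0.2))
```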

  17. OML Services
    New filters for model type and shared model attributes
    Filter by ONNX models
    $ curl -X GET --header "Authorization: Bearer $token" \
    "${omlserver}/omlmod/v1/models?modelType=ONNX"
    Filter by shared models
    $ curl -X GET --header "Authorization: Bearer $token" \
    "${omlserver}/omlmod/v1/models?shared=true"
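
A small Python sketch of how the filtered model-listing URLs could be assembled. The function name is hypothetical; the slide shows each filter on its own, so combining both in one query and sending shared=false are assumptions.

```python
from typing import Optional
from urllib.parse import urlencode


def models_url(omlserver: str, model_type: Optional[str] = None, shared: Optional[bool] = None) -> str:
    """Build the models listing URL with optional modelType and shared filters."""
    params = {}
    if model_type is not None:
        params["modelType"] = model_type
    if shared is not None:
        # Assumption: the service accepts lower-case "true"/"false", as in the slide's "shared=true".
        params["shared"] = "true" if shared else "false"
    base = f"{omlserver.rstrip('/')}/omlmod/v1/models"
    query = urlencode(params)
    return f"{base}?{query}" if query else base


if __name__ == "__main__":
    print(models_url("https://example.invalid", model_type="ONNX"))
    print(models_url("https://example.invalid", shared=True))
```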

  18. ESA Wiki Model

  19. New ESA Wiki Model
    Built under Database 19c
    • ESA is a pre-built model for feature extraction of explicit features in a knowledge base
      – Maps words to relevant concepts
      – Wikipedia is a good source for ESA - comprehensive knowledge base
    • The new ESA model was built using millions of Wikipedia articles available as of July 1, 2021
      – Topics reduced to about 161,000
      – Users can also create their own custom, domain-specific ESA models
    Blog: New Wiki ESA model available for 19c
    https://blogs.oracle.com/machinelearning/post/wiki-esa-model-available-for-database-19c

  20. Steps used by the Oracle Team (internally) to Process the Wikipedia data
    (shown only for general interest – many people/companies use their own custom processing)
    Load Wikipedia dumps
    Wikipedia dumps are compressed XML files. Individual pages are tagged as <page>. The contents of the pages are tagged as <text>. Contents inside <text> contain plenty of Wikipedia-specific information that is not visible, and various brackets are present.
    Category & Article
    DocStore from Oracle Labs is used to remove non-usable information and to split the Wikipedia XML dumps into individual entities including article and category pages (ignoring other types of pages). The outcome of DocStore processing is text with HTML tags.
    Page Filtering
    To collect the pages that describe concepts and more general knowledge about various subjects, there is a lot of: parsing and stripping HTML tags from pages, partial tokenization, special-character removal, dropping of words with special characters or numbers, and more. The outcome of Wikipedia page processing is tab-separated files.
    ESA Model Build
    We calculate the number of incoming links for every page using cross-page links. The ESA model is reduced to retain the pages that are more general and describe concepts, filtering out References, References and links, Sources, Further reading, etc. The final ESA model is built with a limit of 200,000 features and 1,000 top features retained, resulting in some 27 million records and 800 MB in size (current version).
    Diagram labels: XML, Article pages, Category pages, TSV pages, TSV pages x-links, TSV pages by category, OML in-DB ESA Wiki Model
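
For readers who want to experiment with similar preprocessing, here is a rough Python sketch of the kind of page filtering and link counting described above. It is not the Oracle team's pipeline: the regular expressions, function names and the ASCII-only word rule are all illustrative assumptions.

```python
import html
import re
from collections import Counter

TAG_RE = re.compile(r"<[^>]+>")      # crude HTML/XML tag matcher
WORD_RE = re.compile(r"^[a-z]+$")    # keep plain alphabetic words only (drops accented words too)


def clean_page_text(raw: str) -> list:
    """Strip tags, tokenize on whitespace, drop words with digits or special characters."""
    text = html.unescape(TAG_RE.sub(" ", raw))
    tokens = (token.strip(".,;:!?()[]{}\"'") for token in text.lower().split())
    return [token for token in tokens if WORD_RE.match(token)]


def to_tsv_line(title: str, raw: str) -> str:
    """Emit one tab-separated record: page title, then the cleaned text."""
    return f"{title}\t{' '.join(clean_page_text(raw))}"


def incoming_link_counts(links) -> Counter:
    """Count incoming links per page from (source, target) cross-page link pairs."""
    return Counter(target for _source, target in links)


if __name__ == "__main__":
    print(to_tsv_line("Example", "<p>Oracle &amp; the <b>Database</b> 19c, since 1977!</p>"))
    print(incoming_link_counts([("A", "B"), ("C", "B"), ("B", "A")]))
```

A real pipeline would also need to handle wiki markup, non-English text and page-type filtering, which this sketch leaves out.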

  21. Where can I download the new ESA Wiki model?
    Download from https://oss.oracle.com/machine-learning/
    Blog: New Wiki ESA model available for 19c
    https://blogs.oracle.com/machinelearning/post/wiki-esa-model-available-for-database-19c
    For complete examples, search OML Notebooks Template Examples for "ESA".

  22. Quick demo
    • New URL to access OML
    • OML Services on Postman
    • OML Services new URL
    • OML Services new modelType filter
    • OML Services new Clustering ONNX model support
    • OML4Py REST APIs new URL

  23. Q & A

  24. Thank you!