Slide 1

Slide 1 text

OML Product Management Update: What’s New in Oracle Machine Learning
OML AskTOM Office Hours
Marcos Arancibia and Sherry LaMonica
Product Management, Oracle Machine Learning
Move the Algorithms; Not the Data!
This session will be recorded.
Copyright © 2021, Oracle and/or its affiliates.

Slide 2

Slide 2 text

What’s New in Oracle Machine Learning?
• OML Notebook updates
• Updated structure for OML and REST URLs
• OML Services
  – Clustering for ONNX models
  – Cognitive Text: New support for Italian language
  – New filters for modelType and shared attributes
• New ESA Wiki model for Database 19c

Slide 3

Slide 3 text

OML Notebooks

Slide 4

Slide 4 text

OML Notebooks: New additions and updates
• Upgraded to Zeppelin 0.9
• Import Jupyter notebooks (*.ipynb)
• 79 template example notebooks
  – 40 OML4Py notebooks
  – 39 OML4SQL notebooks (4 are 21c-only)
• 15 new notebooks
• 3 OML XGBoost + MSET templates for 21c
• Data loading from GitHub mechanism highlighted (OML Run-me-first)
• Included details of available algorithm settings options in applicable notebooks
• Added demos to the OML4Py notebooks that use OML4SQL to score data and display prediction details

Slide 5

Slide 5 text

OML and RESTful URLs

Slide 6

Slide 6 text

OML4Py and OML Services REST APIs: New URL structure
• Base URL now includes tenancy ID and database name, and works for OML Notebooks too
• Token request no longer needs Tenancy OCID nor Database name in the PATH
• Root domain is now oraclecloudapps.com

CURRENT: https://adb.us-sanjose-1.oraclecloud.com/tenant/ocid1.tenancy.oc1..aaaaa…/database/omldb
  – Datacenter region: us-sanjose-1
  – Root domain: oraclecloud.com
  – Tenancy OCID: ocid1.tenancy.oc1..aaaaa…
  – Database name: omldb
  – The section including the Tenancy OCID and Database name is required for token acquisition

NEW: https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com
  – Tenancy ID (not OCID nor name): qtraya2braestch
  – Database name: omldb
  – Datacenter region: us-sanjose-1
  – Root domain: oraclecloudapps.com
  – Same URL for token acquisition
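
As an illustration of the new structure, the sketch below composes and decomposes a new-style base URL. The function names are hypothetical helpers, not part of any Oracle library, and the lower-casing of the database name and the assumption that the database name itself contains no hyphen are inferred only from the example on this slide.

```python
from urllib.parse import urlsplit


def oml_base_url(tenancy_id: str, database: str, region: str) -> str:
    """Compose <tenancy id>-<database>.adb.<region>.oraclecloudapps.com."""
    # Assumption: the database name appears in lower case in the host name,
    # as in the slide's example (OMLDB -> omldb).
    host = f"{tenancy_id}-{database.lower()}.adb.{region}.oraclecloudapps.com"
    return f"https://{host}"


def split_oml_host(url: str) -> dict:
    """Recover the labelled parts of a new-style URL (inverse of oml_base_url)."""
    host = urlsplit(url).hostname
    prefix, _, rest = host.partition(".adb.")
    # Assumption: the database name has no hyphen, so split on the last one.
    tenancy_id, _, database = prefix.rpartition("-")
    region, _, root_domain = rest.partition(".")
    return {
        "tenancy_id": tenancy_id,
        "database": database,
        "region": region,
        "root_domain": root_domain,
    }


base = oml_base_url("qtraya2braestch", "OMLDB", "us-sanjose-1")
print(base)
print(split_oml_host(base))
```

The same base URL is then used as the prefix for the token endpoint and for the OML Notebooks, OML4Py and OML Services paths.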

Slide 7

Slide 7 text

REST API Authentication
Standard call for all OML REST API token endpoints

New style URL
    omlserver/omlusers/api/oauth2/v1/token
• omlserver = OML cloud service location URL for Autonomous Database, for example: https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com

Old style URL
    omlserver/omlusers/tenants/tenant/databases/database/api/oauth2/v1/token
• omlserver = OML cloud service location URL for Autonomous Database, for example: https://adb.us-sanjose-1.oraclecloud.com
• tenant = Oracle Autonomous Database Tenancy OCID, in the form of: OCID1.TENANCY.OC1..AAAAAAAAFCUE4……
• database = Oracle Autonomous Database database name, for example: OMLDB

Slide 8

Slide 8 text

Oracle Machine Learning RESTful URLs
Where can I find the URLs that correspond to my tenancy?

Location of REST URLs
From your Oracle Autonomous Database instance:
1. Click Service Console
2. Click Development
3. Scroll down to Oracle Machine Learning RESTful Services and copy the URL

Example URLs:
    https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/omlusers/
    https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/oml/
    https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/omlmod/
    https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/ords/

Slide 9

Slide 9 text

Token acquisition
Initial call to get a token and be able to access OML REST endpoints

To request a token for accessing OML REST API endpoints, you need a valid username and password for your Oracle Autonomous Database, with the proper OML Developer grants from the OML Administrator.

For the following REST call, we will consider:
    omlserver=https://<tenancy id>-<database>.adb.<region>.oraclecloudapps.com

    $ curl -X POST \
      --header 'Content-Type: application/json' \
      --header 'Accept: application/json' \
      -d '{"grant_type":"password", "username": "YourOMLuser", "password": "YourOMLpass"}' \
      "${omlserver}/omlusers/api/oauth2/v1/token"
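
The same token request can be sketched in Python with only the standard library. The helper names are hypothetical, and the `accessToken` field read from the response is an assumption about the response shape that the slide does not show; check the actual response from your service before relying on it.

```python
import json
import urllib.request


def build_token_request(omlserver: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the new-style token endpoint."""
    body = json.dumps(
        {"grant_type": "password", "username": username, "password": password}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{omlserver}/omlusers/api/oauth2/v1/token",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def fetch_token(omlserver: str, username: str, password: str) -> str:
    """Send the request and return the token string."""
    request = build_token_request(omlserver, username, password)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Assumption: the token is returned under the key "accessToken".
    return payload["accessToken"]


# Building the request needs no network access; fetch_token() does.
example = build_token_request("https://example.invalid", "YourOMLuser", "YourOMLpass")
print(example.get_method(), example.full_url)
```

Keeping the request construction separate from the network call makes it easy to inspect the URL and body before any credentials are sent.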

Slide 10

Slide 10 text

Send a Request – OML Services
Call to get the Open API description for the current OML Services

To review the Open API specification for the OML Services REST endpoints, you need to pass a valid token.

For the following REST call, we will consider OML_URL = omlserver/omlmod, and remember to provide the full token after "Bearer":

    $ curl --location --request GET 'OML_URL/v1/api' \
      --header 'Authorization: Bearer eyJhbGciOiJSUzI1NiJ9.....=='

The string after "Bearer" (eyJhbGciOiJSUzI1NiJ9.....==) is the token.

Slide 11

Slide 11 text

Send a Request – OML4Py
Call to get the Open API description for the current OML4Py REST services

To review the Open API specification for the OML4Py REST endpoints, you need to pass a valid token.

For the following REST call, we will consider OML_URL = omlserver/oml, and remember to provide the full token after "Bearer":

    $ curl --location --request GET 'OML_URL/api/py-scripts/v1' \
      --header 'Authorization: Bearer eyJhbGciOiJSUzI1NiJ9.....=='

The string after "Bearer" (eyJhbGciOiJSUzI1NiJ9.....==) is the token.
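
Both Open API calls differ only in the path appended to omlserver, so they can share one request builder. The following is a minimal sketch with hypothetical helper names; it constructs the two GET requests with the Bearer header and leaves sending them (for example with `urllib.request.urlopen`) to the caller.

```python
import urllib.request


def bearer_get(url: str, token: str) -> urllib.request.Request:
    """Build a GET request carrying the token in the Authorization header."""
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}, method="GET"
    )


def open_api_requests(omlserver: str, token: str) -> dict:
    """Return the Open API description requests for OML Services and OML4Py."""
    return {
        # OML_URL = omlserver/omlmod
        "oml_services": bearer_get(f"{omlserver}/omlmod/v1/api", token),
        # OML_URL = omlserver/oml
        "oml4py": bearer_get(f"{omlserver}/oml/api/py-scripts/v1", token),
    }


# Placeholder server and token, for illustration only.
for name, request in open_api_requests("https://example.invalid", "TOKEN").items():
    print(name, request.get_method(), request.full_url)
```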

Slide 12

Slide 12 text

Oracle Machine Learning
Link to access OML in Autonomous Database

Original link to access the OML User Interface today from a bookmarked link:
    https://adb.us-sanjose-1.oraclecloud.com/omlusers/login.html
Plus the necessary options:
    ?tenant=OCID1.TENANCY.OC1..AAAAA....&database=OMLDB&redirect_uri=https://adb.us-sanjose-1.oraclecloud.com/omlusers/api/oauth2/v1/login

New style URL to access the OML User Interface:
    https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/oml

New style URL to access the OML User Interface user administration:
    https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/omlusers

Slide 13

Slide 13 text

OML Services

Slide 14

Slide 14 text

OML Services: ONNX Clustering is now supported
Example of clustering two attributes from the Breast Cancer dataset
• Visualization of the original data with the target in different colors

Slide 15

Slide 15 text

OML Services: ONNX Clustering is now supported
• Create the clustering model using only two input attributes for this small test, and show the predictions
• Show the cluster centroids as an example
• Export the scikit-learn cluster model to ONNX format, into a Zip file for OML Services (which includes the .onnx file and the metadata.json)
• cURL example of the OML Services scoring:

    $ curl -L -X POST 'https://qtraya2braestch-omldb.adb.us-sanjose-1.oraclecloudapps.com/omlmod/v1/deployment/SKLearn_kmeans_BC/score' \
      -H 'Content-Type: application/json' \
      -H 'Authorization: Bearer eyJ………iOiJ' \
      -d '{"inputRecords":[{ "X": [[10.38, 17.77]]}] }' | jq
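
A Python equivalent of the scoring call might look like the sketch below. The helper name is hypothetical; the deployment name `SKLearn_kmeans_BC`, the input name `X`, and the two sample values come from the cURL example on this slide. The request is only constructed here, since sending it requires a live deployment and a valid token.

```python
import json
import urllib.request


def build_score_request(omlserver: str, deployment_uri: str, token: str, rows: list) -> urllib.request.Request:
    """Build the POST request that scores rows against a deployed ONNX model."""
    # "X" is the model input name used in the slide's payload; other models
    # may expose different input names.
    body = json.dumps({"inputRecords": [{"X": rows}]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{omlserver}/omlmod/v1/deployment/{deployment_uri}/score",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# One record with the two input attributes from the slide.
score_request = build_score_request(
    "https://example.invalid", "SKLearn_kmeans_BC", "TOKEN", [[10.38, 17.77]]
)
print(score_request.full_url)
print(score_request.data.decode("utf-8"))
```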

Slide 16

Slide 16 text

OML Services: Cognitive text capability for Italian Language
Example: Topic Discovery. Returns the most relevant topics and weights:

    $ curl -X POST "${omlserver}/omlmod/v1/cognitive-text/topics" \
      --header 'Content-Type: application/json' \
      --header "Authorization: Bearer ${token}" \
      --data '{ "topN":5, "language": "ITALIAN", "textList":["Con Oracle Machine Learning, Oracle sposta gli algoritmi sui dati. Oracle esegue …… l'\''automazione richieste dai progetti di data science su scala aziendale, sia on-premise che nel cloud."] }'

    "topicResults": [
      { "topic": "Oracle Corporation", "weight": 0.23331640964885378},
      { "topic": "Oracle Database", "weight": 0.20443284083978977},
      { "topic": "Big data", "weight": 0.16381463223223036},
      { "topic": "Base di conoscenza", "weight": 0.13233125000617454},
      { "topic": "Apprendimento automatico", "weight": 0.13091866812720565}
    ]

Blog: OML Services Cognitive Text – Italian Language
https://blogs.oracle.com/machinelearning/post/oml-services-cognitive-text---italian-language-now-available
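
Building the request body in Python avoids the shell quoting needed for apostrophes in Italian text. The sketch below uses hypothetical helper names; it serializes the payload and ranks a `topicResults` list like the one shown on the slide. Where `topicResults` sits inside the full response is not shown on the slide, so the parser takes the list itself as input.

```python
import json


def build_topics_payload(texts, language: str = "ITALIAN", top_n: int = 5) -> str:
    """Serialize the request body for the cognitive-text topics endpoint."""
    # ensure_ascii=False keeps accented characters readable in the payload.
    return json.dumps(
        {"topN": top_n, "language": language, "textList": list(texts)},
        ensure_ascii=False,
    )


def top_topics(topic_results, min_weight: float = 0.0):
    """Return (topic, rounded weight) pairs at or above min_weight, highest first."""
    ranked = sorted(topic_results, key=lambda item: item["weight"], reverse=True)
    return [
        (item["topic"], round(item["weight"], 4))
        for item in ranked
        if item["weight"] >= min_weight
    ]


# Topic results copied from the slide's example output.
sample_results = [
    {"topic": "Oracle Corporation", "weight": 0.23331640964885378},
    {"topic": "Oracle Database", "weight": 0.20443284083978977},
    {"topic": "Big data", "weight": 0.16381463223223036},
    {"topic": "Base di conoscenza", "weight": 0.13233125000617454},
    {"topic": "Apprendimento automatico", "weight": 0.13091866812720565},
]

print(build_topics_payload(["Oracle sposta gli algoritmi sui dati."]))
print(top_topics(sample_results, min_weight=0.15))
```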

Slide 17

Slide 17 text

OML Services: New filters for model type and shared model attributes

Filter by ONNX models:
    $ curl -X GET --header "Authorization: Bearer $token" "${omlserver}/omlmod/v1/models?modelType=ONNX"

Filter by shared models:
    $ curl -X GET --header "Authorization: Bearer $token" "${omlserver}/omlmod/v1/models?shared=true"
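
The two filters are plain query parameters, so the URLs can be generated rather than typed by hand. This is a small sketch with a hypothetical helper name; passing both filters in one request is an assumption, since the slide shows them only separately.

```python
from typing import Optional
from urllib.parse import urlencode


def models_url(omlserver: str, model_type: Optional[str] = None, shared: Optional[bool] = None) -> str:
    """Build the model-listing URL with optional modelType and shared filters."""
    params = {}
    if model_type is not None:
        params["modelType"] = model_type
    if shared is not None:
        # The slide's example passes the lower-case literal "true".
        params["shared"] = "true" if shared else "false"
    base = f"{omlserver}/omlmod/v1/models"
    query = urlencode(params)
    return f"{base}?{query}" if query else base


print(models_url("https://example.invalid", model_type="ONNX"))
print(models_url("https://example.invalid", shared=True))
```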

Slide 18

Slide 18 text

ESA Wiki Model

Slide 19

Slide 19 text

New ESA Wiki Model
Built under Database 19c
• ESA (Explicit Semantic Analysis) is a pre-built model for feature extraction of explicit features in a knowledge base
  – Maps words to relevant concepts
  – Wikipedia is a good source for ESA: a comprehensive knowledge base
• The new ESA model was built using millions of Wikipedia articles available as of July 1, 2021
  – Topics reduced to about 161,000
  – Users can also create their own custom, domain-specific ESA models

Blog: New Wiki ESA model available for 19c
https://blogs.oracle.com/machinelearning/post/wiki-esa-model-available-for-database-19c

Slide 20

Slide 20 text

Steps used by the Oracle Team (internally) to Process the Wikipedia data
(shown only for general interest; many people/companies use their own custom processing)

Flow: XML → Article pages and Category pages → TSV pages, TSV pages x-links, TSV pages by category → OML in-DB ESA Wiki Model

Load Wikipedia dumps
Wikipedia dumps are compressed XML files. Individual pages are tagged as <page>. The contents of the pages are tagged as <text>. Contents inside <text> contain plenty of Wikipedia-specific information that is not visible, and various brackets are present.

Category & Article
DocStore from Oracle Labs is used to remove non-usable information and to split the Wikipedia XML dumps into individual entities, including article and category pages (ignoring other types of pages). The outcome of DocStore processing is text with HTML tags.

Page Filtering
Collecting the pages that describe concepts and more general knowledge about various subjects requires a lot of processing: parsing and stripping HTML tags from pages, partial tokenization, removal of special characters, dropping of words with special characters or numbers, and more. The outcome of Wikipedia page processing is tab-separated files.

ESA Model Build
We calculate the number of incoming links for every page using cross-page links. The ESA model is reduced to retain the pages that are more general and describe concepts, filtering out References, References and links, Sources, Further reading, etc. The final ESA model is built with a limit of 200,000 Features and 1,000 Top Features retained, resulting in some 27 million records and 800 MB in size (current version).
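
To make the incoming-link step concrete, here is a minimal sketch of counting incoming links from a tab-separated cross-link file. It is not the Oracle team's code: the two-column layout (source page, target page), the function names, and the threshold are assumptions chosen only for illustration.

```python
import csv
import io
from collections import Counter


def incoming_link_counts(tsv_text: str) -> Counter:
    """Count, for every target page, how many other pages link to it."""
    counts = Counter()
    # Assumed layout: one link per line, "source<TAB>target".
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 2:
            continue  # skip blank or malformed lines
        source, target = row
        if source != target:  # ignore self-links
            counts[target] += 1
    return counts


def keep_general_pages(counts: Counter, min_incoming: int) -> list:
    """Keep pages with at least min_incoming links (hypothetical threshold)."""
    return sorted(page for page, n in counts.items() if n >= min_incoming)


# Tiny made-up sample, not real Wikipedia data.
sample_links = (
    "Page A\tMachine learning\n"
    "Page B\tMachine learning\n"
    "Page C\tOracle Database\n"
    "Machine learning\tMachine learning\n"
)
link_counts = incoming_link_counts(sample_links)
print(link_counts)
print(keep_general_pages(link_counts, min_incoming=2))
```

Pages with many incoming links tend to be the more general, concept-level pages, which is why a count like this can serve as a filter before the model build.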

Slide 21

Slide 21 text

Where can I download the new ESA Wiki model?
Download from https://oss.oracle.com/machine-learning/

Blog: New Wiki ESA model available for 19c
https://blogs.oracle.com/machinelearning/post/wiki-esa-model-available-for-database-19c

For complete examples, search OML Notebooks Template Examples for "ESA".

Slide 22

Slide 22 text

Quick demo
• New URL to access OML
• OML Services on Postman
• OML Services new URL
• OML Services new modelType filter
• OML Services new Clustering ONNX model support
• OML4Py REST APIs new URL

Slide 23

Slide 23 text

Q & A

Slide 24

Slide 24 text

Thank you!