Slide 1

Slide 1 text

Oracle Machine Learning Notebooks Deep Dive Mark Hornick Oracle Machine Learning Product Management Copyright © 2020 Oracle and/or its affiliates.

Slide 2

Slide 2 text

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, timing, and pricing of any features or functionality described for Oracle’s products may change and remains at the sole discretion of Oracle Corporation. Safe harbor statement Copyright © 2020, Oracle and/or its affiliates. All rights reserved

Slide 3

Slide 3 text

Copyright © 2020, Oracle and/or its affiliates 3 • Quick Orientation Demo • Objects and Relationships • Projects and Workspaces • Notebook Basics • Collaboration • Connection Groups and Interpreter Bindings • Scheduling Jobs for Notebook Execution • Output Formats in Notebooks • Input Forms in Notebooks • Lessor-known OML4SQL Features OML Notebooks Topics

Slide 4

Slide 4 text

Collaborative UI • Based on Apache Zeppelin • Supports data scientists, data analysts, application developers, DBAs • Easy sharing of notebooks and templates • Permissions, versioning, and execution scheduling Included with Autonomous Database • Automatically provisioned, managed, backed up • In-database SQL algorithms and analytics functions • Explore and prepare data, build and evaluate models, score data, deploy solutions • SQL today — soon to be augmented with Python and R Autonomous Database as a Data Science Platform Oracle Machine Learning Notebooks Copyright © 2020 Oracle and/or its affiliates.

Slide 5

Slide 5 text

Copyright © 2020, Oracle and/or its affiliates 5 Need an account? https://www.oracle.com/cloud/free/

Slide 6

Slide 6 text

Quick Orientation Demo Copyright © 2020, Oracle and/or its affiliates. All rights reserved

Slide 7

Slide 7 text

Copyright © 2020, Oracle and/or its affiliates 7 Login Hamburger menu: Home, Notebooks, Templates, Jobs, Connection Groups, Notebook Sessions Home page: How do I?, Quick Actions, Recent Activities, Recent Notebooks Project Selection and Workspace management User: preferences, password changes, Help, About, Log out Notebooks page Templates • Personal Page • Shared Page • Examples Jobs Connection groups Notebooks Sessions Script – quick overview of OML Notebooks environment

Slide 8

Slide 8 text

Objects and Relationships Copyright © 2020, Oracle and/or its affiliates. All rights reserved

Slide 9

Slide 9 text

Copyright © 2020, Oracle and/or its affiliates 9 Autonomous Database and OML Users Autonomous Database Pluggable Database (PDB) OML User - 1 OML User - 2 OML User - N …

Slide 10

Slide 10 text

Copyright © 2020, Oracle and/or its affiliates 10 OML Notebooks – workspaces, projects, notebooks OML User Workspace Workspace Project Project Project Notebook Notebook Notebook Notebook Scratchpad Notebook Notebook Notebook Scratchpad Notebook Scratchpad Notebook … … …

Slide 11

Slide 11 text

Copyright © 2020, Oracle and/or its affiliates 11 A Data Science Team Structure Data Scientist DBA Application Developer Executive Data Analyst Project Lead Provides reports to Oversees Data Management Support Provides analytics to

Slide 12

Slide 12 text

Copyright © 2020, Oracle and/or its affiliates 12 Autonomous Database and OML – Data Science Project Scenario DB1 Autonomous Database PDB MARK Project Lead PETE App Developer CINDY Data Analyst ADMIN ANNE Data Scientist SERGIO Executive TIM DBA GOLD_DATA

Slide 13

Slide 13 text

Copyright © 2020, Oracle and/or its affiliates 13 Users from ADMIN User Administration System Administrator privileges • Create, edit, and delete OML user accounts • Read-only access to notebooks • Cannot create or run notebooks • Reassign user workspace when deleting users Developer privileges • Create projects and workspaces • Create, edit, delete, version, and run notebooks • Use SQL and SQL Script Scratchpads • Schedule jobs for notebook execution To follow along with the slides in your account, use the OML Notebooks ADMIN interface to create users as shown in this screen capture, or minimally create two users.

Slide 14

Slide 14 text

Copyright © 2020, Oracle and/or its affiliates 14 OML Notebooks – workspaces, projects, notebooks OML User: MARK – project lead MARK Workspace – default DevTest Workspace MARK Project Customer Segmentation Demand Forecasting Notebook Notebook Notebook Scratchpad Notebook Notebook Scratchpad Notebook Scratchpad Notebook … Create Customer Segmentation, Demand Forecasting, and Project Examples projects in DevTest workspace shared with other team members as Developer or Viewer …

Slide 15

Slide 15 text

Copyright © 2020, Oracle and/or its affiliates 15 Data Science Team Workspace Permissions Data Scientist Developer DBA Developer Application Developer Viewer Executive Viewer Data Analyst Developer Project Lead DevTest Workspace owner • Create, edit, and delete OML user accounts • Read-only access to notebooks • Cannot create or run notebooks • Reassign user workspace when deleting users ADMIN • Provide notebook reports to Executive • Review team’s work and collaborate using concurrent notebook sessions • Explore, prepare, and analyze data • Build and test ML models • Deliver ML scripts to data analyst and application developer • View notebook reports on project from the project lead • View any DevTest workspace notebooks • Access machine learning scripts to include in applications and dashboards • Use SQL or REST for scoring data with ML models • Read-only view of others’ scripts • Analyze data using Notebooks or BI tools like Oracle Analytics Cloud • Use machine learning scripts from data scientist • Load and prepare data for team • Collaborate with other users to assist with queries and data analysis

Slide 16

Slide 16 text

Copyright © 2020, Oracle and/or its affiliates 16 User Roles from MARK ‘DevTest’ Workspace Permissions

Slide 17

Slide 17 text

Projects and Workspaces Copyright © 2020, Oracle and/or its affiliates. All rights reserved

Slide 18

Slide 18 text

Copyright © 2020, Oracle and/or its affiliates 18 Initial workspace and default project created by OML service automatically when you first log in to OML CINDY Data Analyst creates a new project within a new workspace using the Create Project dialog and shares her workspace with other users. Note: users must be in a project where they have manager rights. Manage Workspaces • On the top right corner in your home page, click Select a Project pulldown menu • Select Manage Workspaces • Manage Workspaces dialog box opens • Click Permissions to add and delete users • Click Edit to edit the selected workspace • Click Delete to delete the selected workspace • When finished, click Close. Projects and Workspaces

Slide 19

Slide 19 text

Copyright © 2020, Oracle and/or its affiliates 19 Collaborate with other users by granting permissions to access your workspace Grant permissions Manager, Developer, and Viewer to allow another user to view your workspace Granting permission type Manager or Developer allows user to drop tables, create tables, and run any scripts at any time on your account Granting permission type Manager allows the creation of new projects and workspaces Workspace Permissions • On the top right corner in your home page, click Select a Project pulldown menu • Select Workspace Permissions • Workspace Permissions dialog box opens • Select user name and permission type, then click + Add • Select row and click Delete to remove user permissions Note: re-adding a user overwrites previous settings – use this to change settings More on Workspace Permissions

Slide 20

Slide 20 text

Copyright © 2020, Oracle and/or its affiliates 20 Viewer • Project: view only • Workspace: view only • Notebooks: view only • Jobs: view jobs and job runs of shared notebooks only Note: Users are not authorized to run or make any changes to your notebooks – useful for “reports” Developer • Project: view only • Workspace: view only • Notebooks: create, update, run, and delete notebooks that a developer creates only • Jobs: view and run jobs of shared notebooks only. A developer cannot create jobs for notebooks that are shared Note: Users can drop tables and run any script at any time on the owner’s account Manager • Project: create, update, delete • Workspace: view only • Notebooks: create, update, run, delete • Jobs: schedule jobs Note: Users can drop tables and run any script at any time on the owner’s account Workspace Permissions

Slide 21

Slide 21 text

A Few Notebook Basics Copyright © 2020, Oracle and/or its affiliates. All rights reserved

Slide 22

Slide 22 text

Copyright © 2020, Oracle and/or its affiliates 22 On the Notebooks page, click Create Must be in a workspace/project you can write to ANNE Data Scientist creates a notebook in DevTest Workspace and Customer Segmentation Project This is now visible to other team members who have access to DevTest Workspace Create a Notebook

Slide 23

Slide 23 text

Copyright © 2020, Oracle and/or its affiliates 23 On the Notebooks page, click Import Select one or more json notebook files in folder to import Must be in a workspace/project you can write to ANNE Data Scientist imports a OML4SQL Clustering notebook in DevTest Workspace and Customer Segmentation Project Discussed later: check interpreter bindings are set to enable proper notebooks execution Import a Notebook

Slide 24

Slide 24 text

Copyright © 2020, Oracle and/or its affiliates 24 Scratchpads are “glorified” notebooks SQL and Script scratchpads can have paragraphs of either %sql or %script, and later %python with OML4Py TIM DBA opens a SQL Query Scratchpad and lists the tables in all_tables If you return to Notebooks page, your Scratchpad is saved with name “SQL Query Scratchpad” or “SQL Script Scratchpad” Users can rename a Scratchpad notebook Renaming or deleting a default-named Scratchpad notebook gets you a clean Scratchpad next access Save a Scratchpad as a Notebook

Slide 25

Slide 25 text

Copyright © 2020, Oracle and/or its affiliates 25 On the Notebooks page, MARK Project Lead selects notebook to version and clicks Version This allows seeing and manipulating existing versions, as well as create new ones On Versions page • Click + Version • Create Versions dialog opens • Enter comments for this version of your notebook, and click OK The versioned notebook is now listed in the Versions page. You can… • Click Revert Version to restore the older version of your notebook • Click Delete to delete the selected version of your notebook • Click New Notebook to create a new notebook from the selected notebook version • Results in a notebook with same name + “_i”, where i = 1..n each time invoked Version a Notebook

Slide 26

Slide 26 text

Copyright © 2020, Oracle and/or its affiliates 26 OML Templates UI provides another way to share work • Publish notebooks as read-only reports • Publish snapshots of notebooks • Allow users to create notebooks from templates ANNE Data Scientist, on the Notebooks page, selects a notebook and click Save as Template Save as Template dialog opens • Name field – enter a name for the notebook template • Comments field – enter comments, if any • Tags field – enter tags for the template, if any • In Save To, select: • Personal: If you want to save this notebook template to Personal • Shared: If you want to save and share this notebook template Save a Notebook as a Template

Slide 27

Slide 27 text

Collaboration Copyright © 2020, Oracle and/or its affiliates. All rights reserved

Slide 28

Slide 28 text

Copyright © 2020, Oracle and/or its affiliates 28 Share your work, publish your work as reports, or create notebooks from templates Granting Workspace Access To collaborate, grant access to your workspace to another user When other users log in, your workspace, projects in that workspace, and any workspace notebooks are visible Open the same notebook in different users’ accounts to dynamically see others’ changes All can open notebooks and work on or see changes in real time relative to granted permissions OML Templates UI Three types: Personal, Shared, and Examples View selected templates in read-only mode Create new editable/runnable notebooks from selected templates Collaborate with other users

Slide 29

Slide 29 text

Copyright © 2020, Oracle and/or its affiliates 29 Autonomous Database and OML – limited architectural view Autonomous Database Pluggable Database (PDB) User User User … OML Repository Personal Templates Notebook Notebook Shared Templates Notebook Notebook Example Templates Notebook Notebook

Slide 30

Slide 30 text

Connection Groups and Interpreter Bindings Copyright © 2020, Oracle and/or its affiliates. All rights reserved

Slide 31

Slide 31 text

Copyright © 2020, Oracle and/or its affiliates 31 A connection group, also known as a Zeppelin interpreter set, is a collection of database connections Global Connection Group is created automatically when a new autonomous database is provisioned Developer • View database connections Administrator • View and Edit database connections: Name, Row Render Limit, Comment • Reset and Stop database connections • In the Database section, specify PL/SQL DBMS output option • Select Enabled to allow the PL/SQL interpreter to display messages sent to DBMS_OUTPUT in paragraph results • Requires users to log back in for change to take effect Connection Groups

Slide 32

Slide 32 text

Copyright © 2020, Oracle and/or its affiliates 32 Interpreters • A plug-in for using a specific data processing language at the backend, e.g., SQL, Python, R • Currently supported interpreters • sql and script (PL/SQL) interpreters within an Oracle Database interpreter group • md (MarkDown) interpreter for plain text formatting syntax for conversion to HTML Interpreter Bindings • Association of interpreters to notebooks, and individual paragraphs • Notebooks contain an internal ordered list of bindings, default order  Low, Medium, High • *Low: least level of resources to each SQL statement, maximizes concurrent SQL statements • Medium: lower level of resources to each SQL statement, allows more concurrent SQL statements • High: highest level of resources to each SQL statement resulting in the highest performance, but significantly limits number of concurrent SQL statements • Toggle bindings on/off and drag buttons to order – top bindings take precedence over lower bindings Interpreters and Interpreter Bindings * Only non-parallel Low access is supported with the Always Free account

Slide 33

Slide 33 text

Copyright © 2020, Oracle and/or its affiliates 33 May want to override execution priority / resources for an individual SQL interpreter paragraph To change the interpreter bindings order for a particular SQL paragraph in the a notebook: • Click the paragraph where you want to change the interpreter • Invoke the interpreter with the specific binding • Run the paragraph For example, TIM DBA invokes Scratchpad with medium resource allocation, where db1 is the database name %db1_medium Validate the interpreter binding by executing SELECT SYS_CONTEXT ('USERENV', 'SERVICE_NAME') FROM DUAL; Override Interpreter Bindings within a Paragraph

Slide 34

Slide 34 text

Copyright © 2020, Oracle and/or its affiliates 34 If you do not bind any specific interpreter to your notebook, you receive the error: databasename_servicename not found TIM DBA explicitly sets the interpreter binding order in the following scenarios: • Notebook creation: notebook inherits the initial interpreter binding order, which is low (default), medium, high • Notebook import: notebook inherits the defined interpreter bindings – after importing a notebook, verify interpreter binding order and activation • Notebook export: notebook inherits the defined interpreter bindings • Interpreter bindings are not portable across Pluggable Databases (PDBs) • If export and import notebooks within the same PDB, interpreter bindings are preserved • If notebooks are created in different PDBs, interpreter bindings are not preserved after import, and must be manually set • Notebook creation from templates: notebook inherits the default order of interpreter bindings Interpreter Bindings

Slide 35

Slide 35 text

Scheduling Jobs for Notebook Execution Copyright © 2020, Oracle and/or its affiliates. All rights reserved

Slide 36

Slide 36 text

Copyright © 2020, Oracle and/or its affiliates 36 Under what circumstances would we want to schedule notebook execution? • Execute a long-running notebook at off-peak hours, perhaps 2:00 AM • Rebuild one or more models at periodic intervals, e.g., nightly or weekly, on the latest data • Perform batch scoring of latest data at periodic intervals, e.g., nightly • Generate notebook “report” with latest analytics results, including tables and graphs Jobs use cases

Slide 37

Slide 37 text

Copyright © 2020, Oracle and/or its affiliates 37 Jobs allow you to schedule the running of notebooks Operations on Jobs • Edit – Edit job metadata of any job listed in the Jobs page • Create – Create a new job to schedule your Notebook • Duplicate – Create a copy of an existing job listed in the Jobs page • Stop – Terminate a job that is currently running • Start – Enabled only for jobs that are in Scheduled status • Delete – Delete any job listed on the Jobs page Example MARK Project Lead schedules Clustering notebook to run daily, but clicks Start to also run it immediately Jobs

Slide 38

Slide 38 text

Copyright © 2020, Oracle and/or its affiliates 38 Maximum Number of Runs • Maximum number of times the job will be run given repeat frequency • If notebook times out or exceeds max runs allowed, status is set to COMPLETED Maximum Failures Allowed • Maximum number of times a job can fail on consecutive scheduled runs • If notebook execution fails more than max failures allowed, status is set to BROKEN Timeout in Minutes • Maximum amount of time a job should be allowed to run, otherwise it is stopped • If notebook execution exceeds specified timeout, status is set to STOPPED Jobs Advanced Settings

Slide 39

Slide 39 text

Copyright © 2020, Oracle and/or its affiliates 40 Historical logs of jobs in current user’s Job Log interface On Jobs page, click on name of job to view On the Job Log page for the selected job • View Date of execution, Status (success, running, stopped, failed), and Duration • To view notebook execution results from job history – select the row and click view icon • To delete a job log entry – select it and click delete icon Job Logs

Slide 40

Slide 40 text

Output Formats in Notebooks Copyright © 2020, Oracle and/or its affiliates. All rights reserved

Slide 41

Slide 41 text

Copyright © 2020, Oracle and/or its affiliates 42 Use SQLFORMAT to format query output %script SET SQLFORMAT format_option MARK Project Lead created some examples in the Project Examples project in the DevTest workspace For example, SET SQLFORMAT ansiconsole; SELECT * FROM SH.SALES; Set Output Format in Notebooks Import Notebooks 1), 2), and 3) into the Project Examples project

Slide 42

Slide 42 text

Copyright © 2020, Oracle and/or its affiliates 43 The available output formats are: • ANSICONSOLE — resizes the columns to the width of the data to save space. It also underlines the columns, instead of separate line of output • CSV —standard comma-separated variable output, with string values enclosed in double quotes • DELIMITED — manually define the delimiter string, and the characters that are enclosed in the string values • FIXED —fixed width columns with all data enclosed in double-quotes • HTML —HTML for a responsive table. The content of the table changes dynamically to match the search string entered in the text field • INSERT— the INSERT statements that could be used to recreate the rows in a table. • JSON — a JSON document containing the definitions of the columns along with the data that it contains. • LOADER — pipe delimited output with string values enclosed in double quotes—column names not included in the output • XML —a tag-based XML document. All data is presented as CDATA tags • DEFAULT — clears all previous SQLFORMAT settings, and returns to the default output, also, just specify SET SQLFORMAT Set Output Format in Notebooks

Slide 43

Slide 43 text

Input Forms in Notebooks Copyright © 2020, Oracle and/or its affiliates. All rights reserved

Slide 44

Slide 44 text

Copyright © 2020, Oracle and/or its affiliates 45 Request user input for a value '${formName}’ or with default value ’${formName=defaultValue}’ SELECT * FROM ALL_OBJECTS WHERE OBJECT_TYPE = '${OBJ}'; SELECT * FROM ALL_OBJECTS WHERE OBJECT_TYPE = '${OBJ=TABLE}' Create Text Input Forms in Notebooks

Slide 45

Slide 45 text

Copyright © 2020, Oracle and/or its affiliates 46 Define the Select form by using the syntax: '${formName=defaultValue,option1|option2...}' SELECT * FROM ALL_OBJECTS WHERE OBJECT_TYPE = '${OBJ=INDEX,INDEX|TABLE|VIEW|SYNONYM}' Create Select Forms in Notebooks

Slide 46

Slide 46 text

Copyright © 2020, Oracle and/or its affiliates 47 '${checkbox:formName=defaultValue1|defaultValue2...,option1|option2...}' SELECT ${checkbox:Which Columns=OWNER|OBJECT_TYPE, OWNER|OBJECT_NAME|OBJECT_TYPE|STATUS} FROM ALL_OBJECTS Create Check Box Forms in Notebooks

Slide 47

Slide 47 text

Lessor-known OML4SQL Features Copyright © 2020, Oracle and/or its affiliates. All rights reserved

Slide 48

Slide 48 text

Unstructured text data: web pages, document libraries, Power Point presentations, product specifications, emails, comment fields in reports, and call center notes Extracting meaningful information from unstructured text can enhance ML results OML… • uses Oracle Text utilities and term weighting strategies to prepare text for TEXT columns • passes user-provided text configuration to Oracle Text • uses extracted tokens or themes for model building • supports text for column data types VARCHAR2, CHAR, CLOB, BLOB, and BFILE Oracle Text: an Oracle Database technology for term extraction, word and theme searching, and other utilities for querying text Enable via ADB PDB Admin by invoking “GRANT RESOURCE, CONNECT, CTXAPP TO user;” OML processing text data Text Processing/Mining Copyright © 2020, Oracle and/or its affiliates. All rights reserved

Slide 49

Slide 49 text

Builds ensemble model with multiple sub-models, one for each data partition • Potentially achieve better accuracy through multiple targeted models • Sub-models automatically managed and used as one model Simplified scoring using top-level model only • Proper sub-model chosen by system based on row of data to be scored Automates a typical machine learning task for data scientists Partitioned Models Oracle Database Table Specify Partition Column(s) Partition-1 Partition-2 Partition-3 Partition-n … Sub-Model-1 Sub-Model-2 Sub-Model-3 Sub-Model-n Top Level Model New Data Score data using top level model In-DB Algorithm … Copyright © 2020, Oracle and/or its affiliates. All rights reserved

Slide 50

Slide 50 text

For more information… oracle.com/machine-learning Copyright © 2020 Oracle and/or its affiliates.

Slide 51

Slide 51 text

Copyright © 2020, Oracle and/or its affiliates 52 Q & A

Slide 52

Slide 52 text

Thank You Mark Hornick | mark.hornick@oracle.com Oracle Machine Learning Product Management