Slide 1

Slide 1 text

Predicting good wine in the database with Oracle Machine Learning
Francesco Tisiot (@Ftisiot), Rittman Mead
Brendan Tierney (@BrendanTierney), Oralytics

Slide 2

Slide 2 text

Francesco Tisiot, Rittman Mead, @Ftisiot
Brendan Tierney, Oralytics, @BrendanTierney

Slide 3

Slide 3 text

Data Engineering, Analytics, Data Science
www.rittmanmead.com [email protected] @rittmanmead

Slide 4

Slide 4 text

Can I have a glass of wine?
Yes sir, what kind of wine?
Tell me what you have there!

Slide 5

Slide 5 text

Can I have a glass of wine?
Yes sir, what kind of wine?
Red wine!

Slide 6

Slide 6 text

Can I have a glass of wine?
Yes, do you like this nice English Pinot Noir?
Hell no!

Slide 7

Slide 7 text

Can I have a glass of wine?
Yes, do you like this nice Croatian Cabernet?
Let me check… 95% yes!

Slide 8

Slide 8 text

Dataset: https://www.kaggle.com/zynicide/wine-reviews

Slide 9

Slide 9 text

The Data

Slide 10

Slide 10 text

Wine score: bad vs. good
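
The bad/good split can be expressed as a binned target column. The following is a minimal sketch, not the authors' actual preparation step. It assumes the Kaggle reviews were loaded into a hypothetical table WINE_REVIEWS with ID, POINTS, PRICE, VARIETY and COUNTRY columns. The 90-point cutoff is inferred from the GT_90_POINTS class label used in the scoring slide; the LE_90_POINTS label and the view name are made up for illustration.

```sql
-- Hypothetical view: derive a two-class target from the review score.
-- WINE_REVIEWS, WINE_BINNED_V and 'LE_90_POINTS' are assumed names.
CREATE OR REPLACE VIEW wine_binned_v AS
SELECT id,
       price,
       variety,
       country,
       CASE
         WHEN points > 90 THEN 'GT_90_POINTS'  -- "good" wine
         ELSE 'LE_90_POINTS'                   -- "bad" wine
       END AS points_bin
FROM   wine_reviews
WHERE  points IS NOT NULL;
```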

Slide 11

Slide 11 text

Wine description

Slide 12

Slide 12 text

Wine samples by country

Slide 13

Slide 13 text

Wine price by country

Slide 14

Slide 14 text

Good/bad wine by price
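
A chart like this one could be backed by a plain aggregate query. The sketch below assumes the same hypothetical WINE_REVIEWS table with POINTS and PRICE columns; the price bands and the 90-point cutoff are illustrative assumptions, not values taken from the slides.

```sql
-- Share of wines scoring above 90 points per (assumed) price band.
SELECT price_band,
       COUNT(*) AS wines,
       ROUND(100 * AVG(CASE WHEN points > 90 THEN 1 ELSE 0 END), 1) AS pct_good
FROM  (
        SELECT points,
               CASE
                 WHEN price < 20  THEN '1: under 20'
                 WHEN price < 50  THEN '2: 20 to 49'
                 WHEN price < 100 THEN '3: 50 to 99'
                 ELSE                  '4: 100 and over'
               END AS price_band
        FROM   wine_reviews
        WHERE  price IS NOT NULL
      )
GROUP  BY price_band
ORDER  BY price_band;
```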

Slide 15

Slide 15 text

From insights to predictions

Slide 16

Slide 16 text

Oracle Machine Learning Algorithms

CLASSIFICATION: Naïve Bayes; Logistic Regression (GLM); Decision Tree; Random Forest; Neural Network; Support Vector Machine; Explicit Semantic Analysis
CLUSTERING: Hierarchical K-Means; Hierarchical O-Cluster; Expectation Maximization (EM)
ANOMALY DETECTION: One-Class SVM
TIME SERIES: Forecasting - Exponential Smoothing; includes popular models, e.g. Holt-Winters with trends, seasonality, irregularity, missing data
REGRESSION: Linear Model; Generalized Linear Model; Support Vector Machine (SVM); Stepwise Linear Regression; Neural Network; LASSO
ATTRIBUTE IMPORTANCE: Minimum Description Length; Principal Comp Analysis (PCA); Unsupervised Pair-wise KL Div; CUR decomposition for row & AI
ASSOCIATION RULES: A priori / market basket
PREDICTIVE QUERIES: Predict, cluster, detect, features
SQL ANALYTICS: SQL Windows; SQL Patterns; SQL Aggregates
FEATURE EXTRACTION: Principal Comp Analysis (PCA); Non-negative Matrix Factorization; Singular Value Decomposition (SVD); Explicit Semantic Analysis (ESA)
TEXT MINING SUPPORT: Algorithms support text; Tokenization and theme extraction; Explicit Semantic Analysis (ESA) for document similarity
STATISTICAL FUNCTIONS: Basic statistics: min, max, median, stdev, t-test, F-test, Pearson’s, Chi-Sq, ANOVA, etc.
R PACKAGES: Third-party R Packages through Embedded Execution; Spark MLlib algorithm integration
MODEL DEPLOYMENT: SQL (1st Class Objects); Oracle RESTful API (ORDS); OML Microservices (for Apps)

• Includes support for Partitioned Models, Transactional, Unstructured, Geo-spatial, Graph data, etc.

Copyright © 2019 Oracle and/or its affiliates.

Slide 17

Slide 17 text

Statistical Functions and Analytical SQL

STATISTICAL FUNCTIONS
• Descriptive statistics (e.g. median, stdev, mode, sum, etc.)
• Hypothesis testing (t-test, F-test, Kolmogorov-Smirnov test, Mann Whitney test, Wilcoxon Signed Ranks test)
• Correlations analysis (parametric and nonparametric, e.g. Pearson’s test for correlation, Spearman's rho coefficient, Kendall's tau-b correlation coefficient)
• Ranking functions
• Cross Tabulations with Chi-square statistics
• Linear regression
• ANOVA (Analysis of variance)
• Test Distribution fit (e.g., Normal distribution test, Binomial test, Weibull test, Uniform test, Exponential test, Poisson test)
• Statistical Aggregates (min, max, mean, median, stdev, mode, quantiles, plus x sigma, minus x sigma, top n outliers, bottom n outliers)

ANALYTICAL SQL
• SQL Windows
• SQL Aggregate functions
• LAG/LEAD functions
• SQL for Pattern Matching
• Additional approximate query processing: APPROX_COUNT, APPROX_SUM, APPROX_RANK
• Regular Expressions

Copyright © 2019 Oracle and/or its affiliates.

Slide 18

Slide 18 text

Oracle Machine Learning
Multiple Languages UIs Supported for End Users & Apps Development
Application Developers; DBAs; R & Python Data Scientists; “Citizen” Data Scientists; Notebook Users & DS Teams
Coming soon

Slide 19

Slide 19 text

Create ML model

BEGIN
  -- Build a classification model from the training table.
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'Wine_CLASS_MODEL',
    mining_function     => dbms_data_mining.classification,
    data_table_name     => 'Wine_TRAIN_DATA',      -- training rows
    case_id_column_name => 'ID',                   -- unique row identifier
    target_column_name  => 'POINTS_BIN',           -- binned score to predict
    settings_table_name => 'Wine_build_settings'); -- algorithm settings
END;
/
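
The CREATE_MODEL call reads its algorithm choice from the Wine_build_settings table, which the slide does not show. As a minimal sketch, such a settings table could look like the following. The Decision Tree algorithm and automatic data preparation are assumptions chosen for illustration; the slides do not say which settings the authors used.

```sql
-- Settings table in the two-column layout expected by DBMS_DATA_MINING.
CREATE TABLE wine_build_settings (
  setting_name  VARCHAR2(30),
  setting_value VARCHAR2(4000)
);

BEGIN
  -- Assumed algorithm: Decision Tree (any classification algorithm would do).
  INSERT INTO wine_build_settings (setting_name, setting_value)
  VALUES (dbms_data_mining.algo_name, dbms_data_mining.algo_decision_tree);

  -- Assumed: let the database handle binning and normalisation automatically.
  INSERT INTO wine_build_settings (setting_name, setting_value)
  VALUES (dbms_data_mining.prep_auto, dbms_data_mining.prep_auto_on);

  COMMIT;
END;
/
```

The settings table has to exist and be populated before CREATE_MODEL runs.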

Slide 20

Slide 20 text

Apply ML model

SELECT PREDICTION_PROBABILITY(
         Wine_CLASS_MODEL, 'GT_90_POINTS'
         USING 25       AS PRICE,
               'MALBEC' AS VARIETY,
               'SPAIN'  AS COUNTRY
       )
FROM dual;
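
The same scoring function can be applied to a whole table rather than to one hand-typed wine. The sketch below assumes a hypothetical hold-out table WINE_TEST_DATA with the same columns as the training data; `USING *` passes every column of the row to the model.

```sql
-- Score every row of an assumed test table and list the ten wines
-- most likely to score above 90 points.
SELECT id,
       PREDICTION(Wine_CLASS_MODEL USING *) AS predicted_bin,
       ROUND(PREDICTION_PROBABILITY(Wine_CLASS_MODEL, 'GT_90_POINTS' USING *), 3)
         AS prob_gt_90
FROM   wine_test_data
ORDER  BY prob_gt_90 DESC
FETCH  FIRST 10 ROWS ONLY;
```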

Slide 21

Slide 21 text

No content

Slide 22

Slide 22 text

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | Confidential – Oracle Internal/Restricted/Highly Restricted 22

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

Optional: Include Wine Reviews

Slide 29

Slide 29 text

Optional: Include Wine Reviews

Slide 30

Slide 30 text

Oracle APEX

Slide 31

Slide 31 text

Oracle APEX

Slide 32

Slide 32 text

Oracle APEX

Slide 33

Slide 33 text

No content

Slide 34

Slide 34 text

No content

Slide 35

Slide 35 text

No content

Slide 36

Slide 36 text

No content

Slide 37

Slide 37 text

No content

Slide 38

Slide 38 text

No content

Slide 39

Slide 39 text

ML Model Deployment via ORDS REST API
Launch Development → APEX

Slide 40

Slide 40 text

No content

Slide 41

Slide 41 text

ML Model Deployment via ORDS REST API

Slide 42

Slide 42 text

ML Model Deployment via ORDS REST API
Launch RESTful Services

Slide 43

Slide 43 text

ML Model Deployment via ORDS REST API

Slide 44

Slide 44 text

ML Model Deployment via ORDS REST API
Schema enabled for ORDS RESTful Services
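
The slide shows the schema being enabled through the UI. The equivalent PL/SQL call can be sketched as below, run as the schema owner. The schema name WINE_ML and the URL mapping pattern are hypothetical placeholders, not values from the slides.

```sql
BEGIN
  -- Enable the (assumed) WINE_ML schema for ORDS RESTful Services.
  ORDS.ENABLE_SCHEMA(
    p_enabled             => TRUE,
    p_schema              => 'WINE_ML',   -- hypothetical schema name
    p_url_mapping_type    => 'BASE_PATH',
    p_url_mapping_pattern => 'wine',      -- hypothetical URL prefix
    p_auto_rest_auth      => FALSE);
  COMMIT;
END;
/
```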

Slide 45

Slide 45 text

ML Model Deployment via ORDS REST API
Helpful example templates provided

Slide 46

Slide 46 text

ML Model Deployment via ORDS REST API
Helpful example templates provided
Build your own custom API for what-if ML predictions / scoring (microservices)

Slide 47

Slide 47 text

ML Model Deployment via ORDS REST API
RESTful API for calling OML model to make predictions
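
A REST endpoint wrapping the model can be sketched with ORDS.DEFINE_SERVICE. This is an illustration under assumptions, not the service shown on the slide: the module name, base path and URL pattern are hypothetical, and the UPPER() calls assume the training data holds upper-case values, as in the MALBEC / SPAIN example.

```sql
BEGIN
  -- Hypothetical GET endpoint: .../wine/predict/<country>/<variety>/<price>
  ORDS.DEFINE_SERVICE(
    p_module_name => 'wine.predict',                     -- assumed name
    p_base_path   => '/wine/',                           -- assumed path
    p_pattern     => 'predict/:country/:variety/:price',
    p_method      => 'GET',
    p_source_type => ORDS.source_type_query_one_row,     -- return a single JSON object
    p_source      => q'[SELECT PREDICTION_PROBABILITY(
                                 Wine_CLASS_MODEL, 'GT_90_POINTS'
                                 USING TO_NUMBER(:price) AS PRICE,
                                       UPPER(:variety)   AS VARIETY,
                                       UPPER(:country)   AS COUNTRY
                               ) AS prob_gt_90
                        FROM dual]');
  COMMIT;
END;
/
```

URI parameters arrive as strings, which is why the price is converted with TO_NUMBER before it is passed to the model.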

Slide 48

Slide 48 text

Real-time Wine Recommendation App + OpenDiningTable App
Copyright © 2019 Oracle and/or its affiliates.

Slide 49

Slide 49 text

Summary: ML in database
analytics, security, governance, prod deployment, knowledge sharing, data exploration, visualisations, storytelling, easy reuse

Slide 50

Slide 50 text

What about the money?
FREE!

Slide 51

Slide 51 text

Predicting good wine in the database with Oracle Machine Learning
Francesco Tisiot (@Ftisiot), Rittman Mead
Brendan Tierney (@BrendanTierney), Oralytics