When Machine Learning Meets Graph Databases

Gianni Ceresa
November 21, 2019

Machine learning is everywhere these days (second only to AI). It started as a Python and R thing, later joined the Oracle Database, and is now available for the Oracle graph database as well. Let's go through some examples of how graphs require slightly adapted data preparation before machine learning algorithms can run on them.

Transcript

  1. Copyright © 2017, Oracle and/or its affiliates. All rights reserved.

    Oracle ACE Program (bit.ly/OracleACEProgram): 450+ Technical Experts Helping Peers Globally. Nominate yourself or someone you know: acenomination.oracle.com
  2. Figure labels: vertex (node), vertex ID, vertex properties, a vertex can have a label; edge, edge ID, edge label, edge properties, directed edge
  3. Figure labels: PGX; Scalable and Persistent Storage; Graph Data Access Layer API; Graph Analytics; In-memory Analytic Engine; Blueprints & SolrCloud / Lucene; Property Graph Support on Files, Apache HBase, Oracle NoSQL or Oracle DB 12.2+; REST Web Service; Python, Perl, PHP, Ruby, Javascript, …; Java APIs; Java APIs/JDBC/SQL/PLSQL; Cytoscape Plug-in; R Integration (OAAgraph); Spark integration; SQL*Plus, …
  4. Money laundering and VAT frauds. Vertices: Spain, Italy, John Doe, Company A, Company B, Company C, Company D. Edge labels: Located in (four edges), Buys from (four edges), Owns
  6. Vertices: Customer 1, Customer 2, Customer 3, Product 1, Product 2, Product 3, Product 4, Product 5. Customer 1 is more similar to Customer 3 than to Customer 2
  7. Graph figure: vertices 1 to 13; steps numbered 1 to 7 from the vertex marked "start"
  8. Graph figure: vertices 1 to 13; steps numbered 1 to 7 from the vertex marked "start". Walk (nodes): 1 – 6 – 7 – 6 – 3 – 4 – 6 – 2. Walk length: 8
  9. (The details of the Word2vec implementation are out of the scope of this presentation and would take too long to cover.) Figure labels: context word, target word; n = layer size (by default 200 for DeepWalk in PGX)
  10. pgx> var similars = model.computeSimilars("Albert_Einstein", 10)
      pgx> similars.print()
      +-----------------------------------------+
      | dstVertex          | similarity         |
      +-----------------------------------------+
      | Albert_Einstein    | 1.0000001192092896 |
      | Physics            | 0.8664291501045227 |
      | Werner_Heisenberg  | 0.8625140190124512 |
      | Richard_Feynman    | 0.8496938943862915 |
      | List_of_physicists | 0.8415523767471313 |
      | Physicist          | 0.8384397625923157 |
      | Max_Planck         | 0.8370327353477478 |
      | Niels_Bohr         | 0.8340970873832703 |
      | Quantum_mechanics  | 0.8331197500228882 |
      | Special_relativity | 0.8280861973762512 |
      +-----------------------------------------+
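
The pipeline that slides 8 to 10 hint at (random walks over the graph, vectors learned from those walks, then a similarity ranking) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the toy graph, the function names and the parameters are all hypothetical, and plain co-occurrence counts stand in for the Word2vec training step, which the deck itself leaves out of scope. It is not the PGX implementation or its API.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical toy graph (undirected edge list); NOT the graph from the slides.
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]


def build_adjacency(edges):
    """Turn an undirected edge list into adjacency lists."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return dict(adj)


def random_walk(adj, start, length, rng):
    """Return a walk of `length` vertices (start included), as on slide 8."""
    walk = [start]
    while len(walk) < length:
        walk.append(rng.choice(adj[walk[-1]]))
    return walk


def cooccurrence_vectors(walks, window):
    """Assumption: stand-in for Word2vec. For each vertex, count the
    vertices seen within `window` positions of it in any walk."""
    vectors = defaultdict(Counter)
    for walk in walks:
        for i, target in enumerate(walk):
            lo, hi = max(0, i - window), min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[target][walk[j]] += 1
    return dict(vectors)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def compute_similars(vectors, vertex, k):
    """Rank vertices by cosine similarity to `vertex` and keep the top k
    (hypothetical helper, named after the call shown on slide 10)."""
    query = vectors[vertex]
    scores = [(other, cosine(query, vec)) for other, vec in vectors.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:k]


rng = random.Random(42)  # fixed seed so the walks are reproducible
adj = build_adjacency(EDGES)
# 20 walks of 8 vertices from every vertex of the toy graph.
walks = [random_walk(adj, v, 8, rng) for v in sorted(adj) for _ in range(20)]
vectors = cooccurrence_vectors(walks, window=2)

for other, score in compute_similars(vectors, 1, 3):
    print(other, round(score, 3))
```

As in the slide 10 output, the queried vertex is itself part of the ranking, with a similarity of about 1 up to floating-point rounding. Swapping the co-occurrence counts for vectors trained by an actual Word2vec implementation would leave the walk generation and the ranking step unchanged.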