When Machine Learning Meets Graph Databases

Gianni Ceresa
November 21, 2019

Machine learning is everywhere these days (just after AI). It started as a Python and R thing, later joined the Oracle Database, and is now available for Oracle Graph Database as well. Let's go through some examples of how graphs require slightly adapting data preparation in order to run machine learning algorithms.

Transcript

  1. 1.
  2. 2.
  3. 3.

    Copyright © 2017, Oracle and/or its affiliates. All rights reserved.

    Oracle ACE Program (bit.ly/OracleACEProgram): 450+ technical experts helping peers globally. Nominate yourself or someone you know: acenomination.oracle.com
  4. 4.
  5. 6.

    Property graph terminology: a vertex (node) has a vertex ID and vertex properties, and can have a label. A directed edge has an edge ID, an edge label and edge properties.
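The terms on this slide map onto a small data structure. The following is a minimal Python sketch of a property graph, not a description of how PGX stores graphs; the class names `Vertex`, `Edge` and `PropertyGraph`, and the sample vertices and edge, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Vertex:
    vertex_id: int
    label: Optional[str] = None  # a vertex can have a label
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    edge_id: int
    source: int        # edges are directed: source -> destination
    destination: int
    label: str
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical in-memory container, for illustration only."""

    def __init__(self) -> None:
        self.vertices: Dict[int, Vertex] = {}
        self.edges: Dict[int, Edge] = {}

    def add_vertex(self, vertex_id: int, label: Optional[str] = None,
                   **properties: Any) -> Vertex:
        vertex = Vertex(vertex_id, label, dict(properties))
        self.vertices[vertex_id] = vertex
        return vertex

    def add_edge(self, edge_id: int, source: int, destination: int,
                 label: str, **properties: Any) -> Edge:
        if source not in self.vertices or destination not in self.vertices:
            raise KeyError("both endpoints must exist before adding an edge")
        edge = Edge(edge_id, source, destination, label, dict(properties))
        self.edges[edge_id] = edge
        return edge

    def out_edges(self, vertex_id: int) -> List[Edge]:
        # Only edges leaving the vertex, because edges are directed.
        return [e for e in self.edges.values() if e.source == vertex_id]


# Made-up sample data.
graph = PropertyGraph()
graph.add_vertex(1, label="person", name="Alice")
graph.add_vertex(2, label="company", name="Acme")
graph.add_edge(10, 1, 2, "works_for", since=2015)
```

Because the edge is directed, `graph.out_edges(1)` returns the `works_for` edge while `graph.out_edges(2)` returns an empty list.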
  6. 7.

    [Architecture diagram] PGX: Graph Analytics, In-memory Analytic Engine. Graph Data Access Layer API: Blueprints & SolrCloud / Lucene. Scalable and Persistent Storage: Property Graph Support on Files, Apache HBase, Oracle NoSQL or Oracle DB 12.2+. Access: REST Web Service (Python, Perl, PHP, Ruby, Javascript, …), Java APIs, Java APIs/JDBC/SQL/PLSQL, Cytoscape Plug-in, R Integration (OAAgraph), Spark integration, SQL*Plus, …
  7. 8.
  8. 9.
  9. 10.

    Money laundering and VAT frauds. [Diagram: John Doe, Company A, Company B, Company C and Company D, with the countries Spain and Italy, connected by "Owns", "Located in" and "Buys from" edges.]
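One pattern such a graph makes easy to search for is a closed chain of "Buys from" edges. The sketch below is a hedged illustration in Python: the transcript does not preserve which company buys from which, so the edge list is an assumption, and `find_cycle` is a hypothetical helper rather than a PGX function.

```python
from typing import Dict, List, Optional

# ASSUMPTION: the slide names these companies and the "Buys from"
# relationship, but the exact wiring below is invented for illustration.
buys_from: Dict[str, List[str]] = {
    "Company A": ["Company B"],
    "Company B": ["Company C"],
    "Company C": ["Company D"],
    "Company D": ["Company A"],
}


def find_cycle(graph: Dict[str, List[str]], start: str) -> Optional[List[str]]:
    """Return a path that leaves `start` and returns to it, or None."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        for neighbor in graph.get(node, []):
            if neighbor == start:
                return path + [start]
            if neighbor not in path:  # do not revisit nodes on this path
                stack.append((neighbor, path + [neighbor]))
    return None


cycle = find_cycle(buys_from, "Company A")
print(cycle)
```

With the assumed edges, the search walks A, B, C, D and closes the loop back at Company A. On a graph without such a loop it returns `None`.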
  10. 11.
  11. 12.
  12. 13.
  13. 14.
  14. 15.

  15. 18.
  16. 19.
  17. 22.
  18. 23.
  19. 25.

    Customer 1 is more similar to Customer 3 than to Customer 2. [Diagram: Customer 1, Customer 2 and Customer 3 linked to Product 1 through Product 5.]
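To make "more similar" concrete, here is a small Python sketch that compares customers by the products they are linked to. Two assumptions are baked in: the purchase sets are invented (the transcript does not preserve the edges of the diagram), and Jaccard similarity is used as a simple stand-in, which is not necessarily the measure the talk relies on.

```python
from typing import Dict, Set

# ASSUMPTION: made-up purchases; the real edges are not in the transcript.
purchases: Dict[str, Set[str]] = {
    "Customer 1": {"Product 1", "Product 2", "Product 3"},
    "Customer 2": {"Product 1", "Product 5"},
    "Customer 3": {"Product 2", "Product 3", "Product 4"},
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Shared items divided by all distinct items across both sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


sim_1_3 = jaccard(purchases["Customer 1"], purchases["Customer 3"])
sim_1_2 = jaccard(purchases["Customer 1"], purchases["Customer 2"])
print(sim_1_3, sim_1_2)
```

With these invented sets, Customers 1 and 3 share two of four distinct products (0.5), while Customers 1 and 2 share one of four (0.25).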
  20. 27.
  21. 29.
  22. 30.
  23. 31.
  24. 33.
  25. 38.
  26. 39.

    [Diagram: graph with nodes numbered 1 to 13 and a walk marked from the start node.]
  27. 40.

    [Diagram: graph with nodes numbered 1 to 13 and a walk marked from the start node.]

    Walk (nodes): 1 – 6 – 7 – 6 – 3 – 4 – 6 – 2

    Walk length: 8
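A random walk like the one on this slide can be sketched in a few lines of Python. This is an illustration under assumptions, not PGX's DeepWalk code: the adjacency list is hypothetical and contains only the edges needed for the slide's walk to be possible, and walk length counts nodes, matching the slide (8 nodes).

```python
import random
from typing import Dict, List

# ASSUMPTION: hypothetical undirected graph; the full 13-node graph
# from the slide is not recoverable from the transcript.
adjacency: Dict[int, List[int]] = {
    1: [6],
    2: [6],
    3: [4, 6],
    4: [3, 6],
    6: [1, 2, 3, 4, 7],
    7: [6],
}


def random_walk(adj: Dict[int, List[int]], start: int,
                walk_length: int, rng: random.Random) -> List[int]:
    """Return a walk of up to `walk_length` nodes, beginning at `start`."""
    walk = [start]
    while len(walk) < walk_length:
        neighbors = adj[walk[-1]]
        if not neighbors:  # dead end: stop early
            break
        walk.append(rng.choice(neighbors))  # uniform choice of next node
    return walk


def is_valid_walk(adj: Dict[int, List[int]], walk: List[int]) -> bool:
    """True if every consecutive pair of nodes is joined by an edge."""
    return all(b in adj[a] for a, b in zip(walk, walk[1:]))


rng = random.Random(42)  # seeded only to make runs repeatable
walk = random_walk(adjacency, start=1, walk_length=8, rng=rng)
slide_walk = [1, 6, 7, 6, 3, 4, 6, 2]
print(walk, is_valid_walk(adjacency, slide_walk))
```

Nodes may repeat within a walk, as node 6 does on the slide; that is expected, since each step only looks at the neighbors of the current node.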
  28. 44.

  29. 45.
  30. 46.

    (The details of the Word2vec implementation are out of the scope of this presentation and would take too long to cover.)

    n = layer size (by default 200 for DeepWalk in PGX)

    [Diagram: Word2vec network relating context words to a target word.]
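Without going into Word2vec itself, the data preparation step can be sketched: each walk is treated like a sentence and each node like a word, and every node is paired with the nodes around it. The Python below is a hedged sketch of that pairing only; the function name `training_pairs` and the window size are assumptions, and the sample walk is the one shown earlier in the deck.

```python
from typing import List, Tuple


def training_pairs(walk: List[int], window: int) -> List[Tuple[int, int]]:
    """Pair each node (target) with its neighbors in the walk (context)."""
    pairs = []
    for i, target in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a node is not its own context
                pairs.append((target, walk[j]))
    return pairs


walk = [1, 6, 7, 6, 3, 4, 6, 2]
pairs = training_pairs(walk, window=1)  # window=1 is an arbitrary choice
print(pairs[:3])
```

A larger window produces more pairs per node and lets nodes that are further apart in a walk influence each other's vectors.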
  31. 47.
  32. 49.

    pgx> var similars = model.computeSimilars("Albert_Einstein", 10)
    pgx> similars.print()
    +-----------------------------------------+
    | dstVertex          | similarity         |
    +-----------------------------------------+
    | Albert_Einstein    | 1.0000001192092896 |
    | Physics            | 0.8664291501045227 |
    | Werner_Heisenberg  | 0.8625140190124512 |
    | Richard_Feynman    | 0.8496938943862915 |
    | List_of_physicists | 0.8415523767471313 |
    | Physicist          | 0.8384397625923157 |
    | Max_Planck         | 0.8370327353477478 |
    | Niels_Bohr         | 0.8340970873832703 |
    | Quantum_mechanics  | 0.8331197500228882 |
    | Special_relativity | 0.8280861973762512 |
    +-----------------------------------------+
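A lookup like this can be thought of as ranking all vertices by how close their vectors are to the query vertex's vector. The Python sketch below assumes cosine similarity, a common choice for embeddings; whether `computeSimilars` uses exactly this measure is not stated in the slides. The function names and the two-dimensional vectors are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine of the angle between two vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(embeddings: Dict[str, List[float]], query: str,
                 k: int) -> List[Tuple[str, float]]:
    """Return the k vertices closest to `query`, the query itself included."""
    q = embeddings[query]
    scored = [(name, cosine(q, vec)) for name, vec in embeddings.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# ASSUMPTION: made-up 2-dimensional vectors; real models use many more
# dimensions (the slides mention a default layer size of 200).
embeddings = {
    "vertex_a": [1.0, 0.0],
    "vertex_b": [0.8, 0.6],
    "vertex_c": [0.0, 1.0],
}
top = most_similar(embeddings, "vertex_a", 2)
print(top)
```

As in the output above, the query vertex ranks first with a similarity of about 1. The value 1.0000001192092896 in the PGX output is slightly above 1, which is consistent with single-precision floating-point rounding.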