Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Relational DB Versus Graph DB: And The Winner Is?

Bf71450537acca19e045ae6f7febdf9a?s=47 Gianni Ceresa
November 19, 2019

Relational DB Versus Graph DB: And The Winner Is?

No need to introduce Relational Databases anymore, and even less the power of SQL. But better to not ignore Graph Databases: they are powerful and have some obvious benefits compared to a relational database. Let's compare the two and the power of their query languages with some practical examples and see which wins!

Bf71450537acca19e045ae6f7febdf9a?s=128

Gianni Ceresa

November 19, 2019
Tweet

Transcript

  1. None
  2. None
  3. Copyright © 2017, Oracle and/or its affiliates. All rights reserved.

    | bit.ly/OracleACEProgram 450+ Technical Experts Helping Peers Globally Nominate yourself or someone you know: acenomination.oracle.com
  4. None
  5. • • There is a “bit” more than that, but

    let’s keep it simple…
  6. None
  7. Vertex edge A Property Graph (also called node)

  8. edge edge label edge properties edge ID directed edge vertex

    (node) vertex properties vertex ID a vertex can have a label
  9. PGX Scalable and Persistent Storage Graph Data Access Layer API

    Graph Analytics In-memory Analytic Engine Blueprints & SolrCloud / Lucene Property Graph Support on Files, Apache HBase, Oracle NoSQL or Oracle DB 12.2+ REST Web Service Python, Perl, PHP, Ruby, Javascript, … Java APIs Java APIs/JDBC/SQL/PLSQL Cytoscape Plug-in R Integration (OAAgraph) Spark integration SQL*Plus, …
  10. None
  11. From 45,700 nodes with 105,406 edges, to 85 nodes with

    218 edges in seconds Catalog RPD
  12. Spain Italy John Doe Company A Company B Company C

    Company D Located in Located in Located in Located in Buys from Buys from Buys from Buys from Money laundering and VAT frauds Owns
  13. None
  14. None
  15. Every row of a table has a fixed, identical structure

    Connections at a table level (not row) Nodes and edges can have any number of properties Connections at a node level (can be seen as row level)
  16. None
  17. • • • •

  18. SEGMENT CUSTOMER ACCOUNT TRANSACTION

  19. • • • •

  20. SELECT othercustomer.name, SUM(transaction.amount) as total_transaction FROM customer INNER JOIN account

    ON customer.id = account.owner INNER JOIN transaction ON account.id = transaction.src_account OR account.id = transaction.dst_account INNER JOIN account otheraccount ON (transaction.src_account = otheraccount.id OR transaction.dst_account = otheraccount.id) AND account.id <> otheraccount.id INNER JOIN customer othercustomer ON otheraccount.owner = othercustomer.id INNER JOIN segment othersegment ON othercustomer.segment = othersegment.id WHERE customer.name = 'Nikita’ AND othersegment.name IN ('Person', 'Company') GROUP BY othercustomer.name
  21. None
  22. SELECT owner.name, SUM(t.amount) AS total_transaction FROM financial_transactions MATCH (p:Person) -[:ownerOf]->

    (:Account) -[t:transaction]- (:Account) <- [:ownerOf]- (owner:Person|Company) WHERE p.name = 'Nikita' GROUP BY owner
  23. • • • • • • •

  24. • • • • • • • The power of

    marketing…
  25. • • • • • • •

  26. SELECT owner.name, SUM(t.amount) AS total_transaction FROM financial_transactions MATCH (p:Person) -[:ownerOf]->

    (:Account) -[t:transaction]- (:Account) <-[:ownerOf]- (owner:Person|Company) WHERE p.name = 'Nikita' GROUP BY owner
  27. SELECT othercustomer.name, SUM(transaction.amount) as total_transaction FROM customer INNER JOIN account

    ON customer.id = account.owner INNER JOIN transaction ON account.id = transaction.src_account OR account.id = transaction.dst_account INNER JOIN account otheraccount ON (transaction.src_account = otheraccount.id OR transaction.dst_account = otheraccount.id) AND account.id <> otheraccount.id INNER JOIN customer othercustomer ON otheraccount.owner = othercustomer.id INNER JOIN segment othersegment ON othercustomer.segment = othersegment.id WHERE customer.name = 'Nikita’ AND othersegment.name IN ('Person', 'Company') GROUP BY othercustomer.name SELECT owner.name, SUM(t.amount) AS total_transaction FROM financial_transactions MATCH (p:Person) -[:ownerOf]-> (:Account) -[t:transaction]- (:Account) <-[:ownerOf]- (owner:Person|Company) WHERE p.name = 'Nikita' GROUP BY owner
  28. None
  29. └ └ ID MANAGER 100 100 120 120 185 Steven

    Diana Tom SELECT id, name, LEVEL FROM employees CONNECT BY PRIOR id = manager Name
  30. (a vertex) –[connected by an edge to]-> (another vertex) Steven

    Diana Tom managedBy managedBy MATCH (v1)->(v2) MATCH (v1)–[e]->(v2) MATCH (v1)-(v2) MATCH (v1)–[e]-(v2) MATCH (v1)<-(v2) MATCH (v1)<-[e]-(v2) MATCH (v1)-/:label?/->(v2) MATCH (v1)-/:label*/->(v2) MATCH (v1)-/:label+/->(v2) MATCH (v1)-/:label{1,2}/->(v2)
  31. SELECT id, name, LEVEL FROM employees CONNECT BY PRIOR id

    = manager SELECT emp.name, mgr.name AS manager MATCH (emp:Employee) -/:managedBy*/-> (mgr:Employee)
  32. • •

  33. Shortest way from A to G ? A B C

    D E F G 2 min 2 min 10 min 5 min 4 min 3 min 4 min 8 min 8 min 3 min 2 min 6 min
  34. SELECT start.name, SUM(e.time) AS distance, dst.name MATCH SHORTEST( (start) -[e]->*

    (dst) ) WHERE start.name = 'A' AND dst.name = ‘G' A B C D E F G 2 min 2 min 10 min 5 min 4 min 3 min 4 min 8 min 8 min 3 min 2 min 6 min
  35. CREATE OR REPLACE FUNCTION evaluate_string ( p_string IN VARCHAR2 )

    RETURN NUMBER IS l_result NUMBER; BEGIN EXECUTE IMMEDIATE 'select '||p_string||' from dual' INTO l_result; RETURN l_result; END evaluate_string; / WITH t AS ( SELECT LEVEL number_of_hops, SYS_CONNECT_BY_PATH(distance,'+') route_cost, CONNECT_BY_ROOT src || SYS_CONNECT_BY_PATH(dst,'-') route, dst FROM connections pc CONNECT BY NOCYCLE PRIOR dst = src START WITH src = 'A' ) SELECT number_of_hops, route_cost, evaluate_string(route_cost) total_cost, route FROM t WHERE dst = 'G' ORDER BY 3 ASC;
  36. None
  37. • • SELECT ID(page), COUNT(e) as num_hops, ARRAY_AGG( ID(cat) )

    as path MATCH SHORTEST ( (page) (-[e]-> (cat))* (topcat) ) WHERE ID(page) = 'Max_Arthur' ORDER BY num_hops DESC The graph has 11’265’294 nodes, 68’163’762 edges. The query takes less than a second on a laptop.
  38. category:British_male_writers, category:British_writers, category:Writers_by_nationality, category:Writers, category:People_associated_with_books, category:Books, category:Printing, category:Chinese_inventions, category:Chinese_culture, category:China,

    category:Chinese- speaking_countries_and_territories, category:Chinese_language, category:Languages_of_Taiwan, category:Taiwanese_culture, category:Taiwan, category:Major_non-NATO_ally, category:NATO, category:Military_alliances_involving_Estonia, category:Military_history_of_Estonia, category:History_of_Estonia_by_topic, category:History_of_Estonia, category:Estonia, category:Post–Russian_Empire_states, category:Russian_Civil_War, category:Russian_Revolution, category:Leninism, category:Ideology_of_the_Communist_Party_of_the_Soviet_Union, category:Communist_Party_of_the_Soviet_Union, category:Stalinist_parties, category:Stalinism, category:Marxism–Leninism, category:Ideology_of_the_Communist_Party_of_China, category:Communist_Party_of_China, category:Maoist_parties, category:Maoist_organizations, category:Maoism, category:Mao_Zedong, category:20th-century_Chinese_poets, category:Chinese_poets_by_century, category:Chinese_poets, category:Chinese_poetry, category:Chinese_literature_by_medium, category:Chinese_media_by_medium, category:Media_in_China, category:Communications_in_China, category:Communications_in_Asia_by_country, category:Communications_in_Asia, category:Information_and_communication_technologies_in_Asia, category:Development_in_Asia, category:Development_by_continent It’s a MariaDB database, didn’t even try to write a SQL query…
  39. None
  40. pgx> var betweenness = analyst.approximateVertexBetweennessCentrality(graph, 2000) betweenness ==> VertexProperty[name=approx_betweenness,type=double,graph=wikipedia] pgx>

    var resultSet = graph.queryPgql("SELECT ID(v), v.approx_betweenness MATCH (v) ORDER BY v.approx_betweenness DESC LIMIT 10") resultSet ==> PgqlResultSetImpl[graph=wikipedia,numResults=10] pgx> resultSet.print(10) +-------------------------------------------------------------------+ | ID(v) | v.approx_betweenness | +-------------------------------------------------------------------+ | category:NATO | 3335099.4280233113 | | category:French_language | 2528594.2608384034 | | category:Major_non-NATO_ally | 2439938.8292453424 | | category:Francophonie | 2302727.8457136396 | | category:Languages_written_in_Latin_script | 2195355.50705401 | | category:Latin_script | 2189237.6630984987 | | category:Postâ??Russian_Empire_states | 2046648.4079036822 | | category:Russian_Civil_War | 2037548.3718300906 | | category:Years | 2031954.4741504523 | | category:United_States | 1985616.4889351444 | +-------------------------------------------------------------------+ $4 ==> PgqlResultSetImpl[graph=wikipedia,numResults=10]
  41. None
  42. • • • • • • •

  43. Graph DB Relational DB

  44. None