Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Relational DB Versus Graph DB: And The Winner Is?

Gianni Ceresa
November 19, 2019

Relational DB Versus Graph DB: And The Winner Is?

No need to introduce Relational Databases anymore, and even less the power of SQL. But better to not ignore Graph Databases: they are powerful and have some obvious benefits compared to a relational database. Let's compare the two and the power of their query languages with some practical examples and see which wins!

Gianni Ceresa

November 19, 2019
Tweet

More Decks by Gianni Ceresa

Other Decks in Technology

Transcript

  1. View Slide

  2. View Slide

  3. Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |
    bit.ly/OracleACEProgram
    450+ Technical Experts
    Helping Peers Globally
    Nominate yourself or someone you know: acenomination.oracle.com

    View Slide

  4. View Slide



  5. There is a “bit” more than that, but let’s keep it simple…

    View Slide

  6. View Slide

  7. Vertex
    edge
    A Property Graph
    (also called node)

    View Slide

  8. edge
    edge label
    edge properties
    edge ID
    directed edge
    vertex (node)
    vertex
    properties
    vertex ID
    a vertex can
    have a label

    View Slide

  9. PGX
    Scalable and Persistent Storage
    Graph Data Access Layer API
    Graph Analytics
    In-memory Analytic Engine
    Blueprints & SolrCloud / Lucene
    Property Graph Support on
    Files, Apache HBase, Oracle NoSQL or Oracle DB 12.2+
    REST Web Service
    Python, Perl, PHP, Ruby,
    Javascript, …
    Java APIs
    Java APIs/JDBC/SQL/PLSQL
    Cytoscape Plug-in
    R Integration (OAAgraph)
    Spark integration
    SQL*Plus, …

    View Slide

  10. View Slide

  11. From 45,700 nodes
    with 105,406 edges,
    to 85 nodes with
    218 edges in
    seconds
    Catalog
    RPD

    View Slide

  12. Spain
    Italy
    John Doe
    Company A
    Company B
    Company C
    Company D
    Located in
    Located in
    Located in
    Located in
    Buys from
    Buys from
    Buys from
    Buys from
    Money laundering
    and VAT frauds
    Owns

    View Slide

  13. View Slide

  14. View Slide

  15. Every row of a table has a fixed, identical structure
    Connections at a table level (not row)
    Nodes and edges can have any number of properties
    Connections at a node level (can be seen as row level)

    View Slide

  16. View Slide





  17. View Slide

  18. SEGMENT
    CUSTOMER
    ACCOUNT TRANSACTION

    View Slide





  19. View Slide

  20. SELECT othercustomer.name, SUM(transaction.amount) as total_transaction
    FROM customer
    INNER JOIN account
    ON customer.id = account.owner
    INNER JOIN transaction
    ON account.id = transaction.src_account OR account.id = transaction.dst_account
    INNER JOIN account otheraccount
    ON (transaction.src_account = otheraccount.id
    OR transaction.dst_account = otheraccount.id)
    AND account.id <> otheraccount.id
    INNER JOIN customer othercustomer
    ON otheraccount.owner = othercustomer.id
    INNER JOIN segment othersegment
    ON othercustomer.segment = othersegment.id
    WHERE customer.name = 'Nikita’
    AND othersegment.name IN ('Person', 'Company')
    GROUP BY othercustomer.name

    View Slide

  21. View Slide

  22. SELECT owner.name, SUM(t.amount) AS total_transaction
    FROM financial_transactions
    MATCH (p:Person) -[:ownerOf]-> (:Account) -[t:transaction]- (:Account) <-
    [:ownerOf]- (owner:Person|Company)
    WHERE p.name = 'Nikita'
    GROUP BY owner

    View Slide








  23. View Slide








  24. The power of marketing…

    View Slide








  25. View Slide

  26. SELECT owner.name, SUM(t.amount) AS total_transaction
    FROM financial_transactions
    MATCH (p:Person) -[:ownerOf]-> (:Account) -[t:transaction]- (:Account)
    <-[:ownerOf]- (owner:Person|Company)
    WHERE p.name = 'Nikita'
    GROUP BY owner

    View Slide

  27. SELECT othercustomer.name, SUM(transaction.amount) as total_transaction
    FROM customer
    INNER JOIN account
    ON customer.id = account.owner
    INNER JOIN transaction
    ON account.id = transaction.src_account OR account.id = transaction.dst_account
    INNER JOIN account otheraccount
    ON (transaction.src_account = otheraccount.id
    OR transaction.dst_account = otheraccount.id)
    AND account.id <> otheraccount.id
    INNER JOIN customer othercustomer
    ON otheraccount.owner = othercustomer.id
    INNER JOIN segment othersegment
    ON othercustomer.segment = othersegment.id
    WHERE customer.name = 'Nikita’
    AND othersegment.name IN ('Person', 'Company')
    GROUP BY othercustomer.name
    SELECT owner.name, SUM(t.amount) AS total_transaction
    FROM financial_transactions
    MATCH (p:Person) -[:ownerOf]-> (:Account) -[t:transaction]- (:Account) <-[:ownerOf]- (owner:Person|Company)
    WHERE p.name = 'Nikita'
    GROUP BY owner

    View Slide

  28. View Slide



  29. ID MANAGER
    100
    100
    120
    120
    185
    Steven
    Diana
    Tom
    SELECT id, name, LEVEL
    FROM employees
    CONNECT BY PRIOR id = manager
    Name

    View Slide

  30. (a vertex) –[connected by an edge to]-> (another vertex)
    Steven
    Diana
    Tom
    managedBy
    managedBy
    MATCH (v1)->(v2)
    MATCH (v1)–[e]->(v2)
    MATCH (v1)-(v2)
    MATCH (v1)–[e]-(v2)
    MATCH (v1)<-(v2)
    MATCH (v1)<-[e]-(v2)
    MATCH (v1)-/:label?/->(v2)
    MATCH (v1)-/:label*/->(v2)
    MATCH (v1)-/:label+/->(v2)
    MATCH (v1)-/:label{1,2}/->(v2)

    View Slide

  31. SELECT id, name, LEVEL
    FROM employees
    CONNECT BY PRIOR id = manager
    SELECT emp.name, mgr.name AS manager
    MATCH (emp:Employee) -/:managedBy*/-> (mgr:Employee)

    View Slide



  32. View Slide

  33. Shortest way
    from A to G ?
    A
    B
    C
    D
    E
    F
    G
    2 min
    2 min
    10 min
    5 min
    4 min
    3 min
    4 min
    8 min
    8 min
    3 min
    2 min
    6 min

    View Slide

  34. SELECT start.name, SUM(e.time) AS distance, dst.name
    MATCH SHORTEST( (start) -[e]->* (dst) )
    WHERE start.name = 'A' AND dst.name = ‘G'
    A
    B
    C
    D
    E
    F
    G
    2 min
    2 min
    10 min
    5 min
    4 min
    3 min
    4 min
    8 min
    8 min
    3 min
    2 min
    6 min

    View Slide

  35. CREATE OR REPLACE
    FUNCTION evaluate_string ( p_string IN VARCHAR2 ) RETURN NUMBER
    IS
    l_result NUMBER;
    BEGIN
    EXECUTE IMMEDIATE
    'select '||p_string||' from dual'
    INTO l_result;
    RETURN l_result;
    END evaluate_string;
    /
    WITH t AS (
    SELECT LEVEL number_of_hops, SYS_CONNECT_BY_PATH(distance,'+') route_cost, CONNECT_BY_ROOT src
    || SYS_CONNECT_BY_PATH(dst,'-') route, dst
    FROM connections pc
    CONNECT BY NOCYCLE PRIOR dst = src
    START WITH src = 'A'
    )
    SELECT number_of_hops, route_cost, evaluate_string(route_cost) total_cost, route
    FROM t
    WHERE dst = 'G'
    ORDER BY 3 ASC;

    View Slide

  36. View Slide



  37. SELECT ID(page), COUNT(e) as num_hops, ARRAY_AGG( ID(cat) ) as path
    MATCH SHORTEST ( (page) (-[e]-> (cat))* (topcat) )
    WHERE ID(page) = 'Max_Arthur'
    ORDER BY num_hops DESC
    The graph has 11’265’294 nodes, 68’163’762 edges.
    The query takes less than a second on a laptop.

    View Slide

  38. category:British_male_writers, category:British_writers, category:Writers_by_nationality,
    category:Writers, category:People_associated_with_books, category:Books, category:Printing,
    category:Chinese_inventions, category:Chinese_culture, category:China, category:Chinese-
    speaking_countries_and_territories, category:Chinese_language, category:Languages_of_Taiwan,
    category:Taiwanese_culture, category:Taiwan, category:Major_non-NATO_ally, category:NATO,
    category:Military_alliances_involving_Estonia, category:Military_history_of_Estonia,
    category:History_of_Estonia_by_topic, category:History_of_Estonia, category:Estonia,
    category:Post–Russian_Empire_states, category:Russian_Civil_War, category:Russian_Revolution,
    category:Leninism, category:Ideology_of_the_Communist_Party_of_the_Soviet_Union,
    category:Communist_Party_of_the_Soviet_Union, category:Stalinist_parties, category:Stalinism,
    category:Marxism–Leninism, category:Ideology_of_the_Communist_Party_of_China,
    category:Communist_Party_of_China, category:Maoist_parties, category:Maoist_organizations,
    category:Maoism, category:Mao_Zedong, category:20th-century_Chinese_poets,
    category:Chinese_poets_by_century, category:Chinese_poets, category:Chinese_poetry,
    category:Chinese_literature_by_medium, category:Chinese_media_by_medium,
    category:Media_in_China, category:Communications_in_China,
    category:Communications_in_Asia_by_country, category:Communications_in_Asia,
    category:Information_and_communication_technologies_in_Asia, category:Development_in_Asia,
    category:Development_by_continent
    It’s a MariaDB database, didn’t even try to write a SQL query…

    View Slide

  39. View Slide

  40. pgx> var betweenness = analyst.approximateVertexBetweennessCentrality(graph, 2000)
    betweenness ==> VertexProperty[name=approx_betweenness,type=double,graph=wikipedia]
    pgx> var resultSet = graph.queryPgql("SELECT ID(v), v.approx_betweenness MATCH (v) ORDER BY
    v.approx_betweenness DESC LIMIT 10")
    resultSet ==> PgqlResultSetImpl[graph=wikipedia,numResults=10]
    pgx> resultSet.print(10)
    +-------------------------------------------------------------------+
    | ID(v) | v.approx_betweenness |
    +-------------------------------------------------------------------+
    | category:NATO | 3335099.4280233113 |
    | category:French_language | 2528594.2608384034 |
    | category:Major_non-NATO_ally | 2439938.8292453424 |
    | category:Francophonie | 2302727.8457136396 |
    | category:Languages_written_in_Latin_script | 2195355.50705401 |
    | category:Latin_script | 2189237.6630984987 |
    | category:Postâ??Russian_Empire_states | 2046648.4079036822 |
    | category:Russian_Civil_War | 2037548.3718300906 |
    | category:Years | 2031954.4741504523 |
    | category:United_States | 1985616.4889351444 |
    +-------------------------------------------------------------------+
    $4 ==> PgqlResultSetImpl[graph=wikipedia,numResults=10]

    View Slide

  41. View Slide








  42. View Slide

  43. Graph DB
    Relational DB

    View Slide

  44. View Slide