Traversing the land of graph computing and databases

Graphs have long held a special place in computer science's history (and codebases). With a new wave of the information age that places greater emphasis on linked data, graph computing and graph databases have risen to prominence. From enterprise knowledge graphs to graph-based analytics, the potential applications are numerous.

To reap the benefits of graph databases and graph computing, one needs to understand the basics as well as the current technical landscape and offerings. It is also important to understand whether a graph-based approach suits your problem.
This talk touched upon the above points.

Akash Tandon

May 03, 2019

Transcript

  1. Traversing the land of graph computing and databases
    Akash Tandon
    Data engineer and aficionado
    PyCon Italy (@pyconit)
  2. Why are you here?
    - Understand graphs as an elegant representation of data
    - Graph theory: standing on the shoulders of giants
    - Recent rise of graph tech
  3. Cypher:

    MATCH
      (person)-[:BORN_IN]->()-[:WITHIN*0..]->(us:Location {name:'United States'}),
      (person)-[:LIVES_IN]->()-[:WITHIN*0..]->(eu:Location {name:'Europe'})
    RETURN person.name

    SQL:

    WITH RECURSIVE
      -- in_usa is the set of vertex IDs of all locations within the United States
      in_usa(vertex_id) AS (
          SELECT vertex_id FROM vertices WHERE properties->>'name' = 'United States'
        UNION
          SELECT edges.tail_vertex FROM edges
            JOIN in_usa ON edges.head_vertex = in_usa.vertex_id
            WHERE edges.label = 'within'
      ),
      -- in_europe is the set of vertex IDs of all locations within Europe
      in_europe(vertex_id) AS (
          SELECT vertex_id FROM vertices WHERE properties->>'name' = 'Europe'
        UNION
          SELECT edges.tail_vertex FROM edges
            JOIN in_europe ON edges.head_vertex = in_europe.vertex_id
            WHERE edges.label = 'within'
      ),
      -- born_in_usa is the set of vertex IDs of all people born in the US
      born_in_usa(vertex_id) AS (
        SELECT edges.tail_vertex FROM edges
          JOIN in_usa ON edges.head_vertex = in_usa.vertex_id
          WHERE edges.label = 'born_in'
      ),
      -- lives_in_europe is the set of vertex IDs of all people living in Europe
      lives_in_europe(vertex_id) AS (
        SELECT edges.tail_vertex FROM edges
          JOIN in_europe ON edges.head_vertex = in_europe.vertex_id
          WHERE edges.label = 'lives_in'
      )
    SELECT vertices.properties->>'name'
    FROM vertices
    -- join to find those people who were both born in the US *and* live in Europe
    JOIN born_in_usa ON vertices.vertex_id = born_in_usa.vertex_id
    JOIN lives_in_europe ON vertices.vertex_id = lives_in_europe.vertex_id;

    Cypher versus SQL (Martin Kleppmann, Designing Data-Intensive Applications, 2017)
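The query on this slide (people born in the United States who now live in Europe) can also be sketched in plain Python, which makes the traversal that both query languages describe explicit. This is a minimal illustration, not code from the talk: the edge list, the sample names and every function name below are hypothetical, and only the standard library is used.

```python
from collections import defaultdict

# Hypothetical sample data: (tail, label, head) triples, i.e. tail -[label]-> head.
EDGES = [
    ("Idaho", "within", "United States"),
    ("United States", "within", "North America"),
    ("London", "within", "England"),
    ("England", "within", "United Kingdom"),
    ("United Kingdom", "within", "Europe"),
    ("Beaune", "within", "France"),
    ("France", "within", "Europe"),
    ("Lucy", "born_in", "Idaho"),
    ("Lucy", "lives_in", "London"),
    ("Alain", "born_in", "Beaune"),
    ("Alain", "lives_in", "London"),
]


def build_incoming_index(edges):
    """Map (head, label) -> set of tails, so edges can be followed backwards."""
    incoming = defaultdict(set)
    for tail, label, head in edges:
        incoming[(head, label)].add(tail)
    return incoming


def locations_within(root, incoming):
    """Return root plus every location that reaches it via zero or more 'within' edges."""
    seen = {root}
    stack = [root]
    while stack:
        current = stack.pop()
        for child in incoming.get((current, "within"), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def people_with_edge_into(label, locations, incoming):
    """Return every vertex that has an edge with this label into one of the locations."""
    people = set()
    for location in locations:
        people |= incoming.get((location, label), set())
    return people


def born_in_us_living_in_europe(edges):
    incoming = build_incoming_index(edges)
    in_usa = locations_within("United States", incoming)
    in_europe = locations_within("Europe", incoming)
    born_in_usa = people_with_edge_into("born_in", in_usa, incoming)
    lives_in_europe = people_with_edge_into("lives_in", in_europe, incoming)
    return sorted(born_in_usa & lives_in_europe)


print(born_in_us_living_in_europe(EDGES))  # ['Lucy']
```

The helper `locations_within` plays the role of the recursive `in_usa` and `in_europe` expressions in the SQL version and of `[:WITHIN*0..]` in the Cypher version; the final set intersection corresponds to the two joins.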
  4. Use-cases across domains
    - Knowledge graphs
    - Recommendation engines
    - Social networks
    - Privacy and compliance
    - Data integration and master data management
    Source: Graph database use-cases
  5. When not to use graphs?
    Depends on the use-case but some situations can include:
    - Disconnected data; relationships don’t matter
    - Data model is consistent and fixed
    - Bulk scans instead of starting from a point
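One possible reading of the last bullet is the difference between a query that starts at a single vertex and follows edges, and one that has to touch every record. The sketch below contrasts the two access patterns on a small, hypothetical follower graph; the data and function names are assumptions made for illustration and do not come from the talk.

```python
from collections import deque

# Hypothetical follower graph: user -> users they follow.
FOLLOWS = {
    "ana": ["bo", "cy"],
    "bo": ["cy"],
    "cy": [],
    "dee": ["eli"],
    "eli": [],
}


def reachable_within(graph, start, max_hops):
    """Point query: breadth-first traversal from one vertex, limited to max_hops."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return seen - {start}


def average_out_degree(graph):
    """Bulk scan: every vertex is read exactly once and no edge is followed."""
    return sum(len(targets) for targets in graph.values()) / len(graph)


print(reachable_within(FOLLOWS, "ana", 2))
print(average_out_degree(FOLLOWS))
```

The first function only visits the neighbourhood of its starting vertex, which is the access pattern graph stores are built around. The second is an aggregate over all rows, which a table scan serves equally well.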
  6. Resources
    - Neo4j
    - Py2neo tutorial
    - Apache TinkerPop
    - NetworkX
    - Awesome-graph list (GitHub)
    - WTF is a knowledge graph?