Graph Databases + Neo4j

A primer on graph databases, why you would use one, what they do, and one application for a graph database.

Regina Imhoff

August 07, 2017

Transcript

  1. Graph Databases • CRUD • Create, Read, Update, Delete • Non-Relational (NoSQL) • Connected data • Social networks!
  2. Property Graph • Nodes • Contain properties (key-value pairs) • One or more labels • Relationships • Named • Directional • Can also contain properties
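
The property graph model on slide 2 can be sketched in a few lines of Python. This is only an illustration of the data model, not how Neo4j stores anything; the class names, the example nodes, and the `since` property are assumptions made up for the sketch.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    """A node: one or more labels plus key-value properties."""
    labels: set[str]
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A named, directed relationship that can also carry properties."""
    type: str    # the relationship's name, e.g. "OWNS"
    start: Node  # direction runs from start to end
    end: Node
    properties: dict[str, Any] = field(default_factory=dict)


# Hypothetical example data, echoing the Person/Item labels used later in the deck.
bob = Node({"Person"}, {"name": "Bob"})
socks = Node({"Item"}, {"name": "socks"})
owns = Relationship("OWNS", bob, socks, {"since": 2017})  # "since" is invented
```

Note that the relationship is a value in its own right: it has a name, a direction, and its own properties, rather than being a foreign key on one of the nodes.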
  3. Graph Storage • Storage • Native graph storage: optimized for storing and managing graphs • Non-native: serialize into relational, object-oriented, or other data store
  4. Graph Processing • Native graph processing: leverage index-free adjacency • Connected nodes physically point to each other • Relationships are turned into first class entities in data records at store level • Non-native graph processing: place a layer of graph on an existing database storage engine • Can’t do graph traversal as it isn’t stored as a graph
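
To make "connected nodes physically point to each other" concrete, here is a rough Python analogy. It assumes an in-memory graph and a plain list standing in for an edge table; the function names and data are hypothetical, and real storage engines are far more involved.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.neighbours = []  # direct references to connected Node objects

    def connect(self, other):
        self.neighbours.append(other)
        other.neighbours.append(self)


def native_neighbours(node):
    # Index-free adjacency: follow the references held by the node itself.
    # The work depends on this node's degree, not on the size of the graph.
    return [n.name for n in node.neighbours]


def non_native_neighbours(name, edge_table):
    # Graph layered over another store: neighbours are found by searching
    # the edge rows (here, a full scan of a list of pairs).
    found = []
    for a, b in edge_table:
        if a == name:
            found.append(b)
        elif b == name:
            found.append(a)
    return found


# Hypothetical data: the same three-node chain in both representations,
# plus one unrelated edge that only the table scan has to look at.
bob, socks, sam = Node("Bob"), Node("socks"), Node("Sam")
bob.connect(socks)
socks.connect(sam)
edge_table = [("Bob", "socks"), ("socks", "Sam"), ("Zed", "hat")]
```

Both functions return the same neighbours for `socks`; the difference is that the second one has to look through rows that have nothing to do with `socks`.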
  5. Why Use a Graph Database? • Performance • Increased performance when working with connected data • Join-intensive query performance • Relational databases deteriorate • Graph databases remain mostly constant • Queries are localized to a portion of the graph • Even as data set gets bigger!
  6. Why Use a Graph Database? • Flexibility • Graph databases are additive • Can add new kinds of relationships, nodes, labels, subgraphs, etc. to existing structure • Won't disturb existing queries! • Less developer time spent on modeling domains
  7. Why Use a Graph Database? • Agility • Schema-free • Change the data model as you develop
  8. IRL Scenario • The manufacturing and sales of yarn • Applicable to most social and manufacturing! • www.ravelry.com
  9. [Graph diagram of the yarn production chain. Labels shown: Wool, Carding, Dyeing, Spinning, Item, Pattern, User; Material, Cards, Dyes, Spins, Knits, Patterns, Authors, Owns]
  10. ☠ RUT-ROH! ☠ • Scenario: somebody who purchased a skein of yarn has tested positive for anthrax! • We need to find all the people who were involved in the production of this dye lot to test them too
  11. SQL Query
      -- Spinner
      SELECT users.* FROM users
      INNER JOIN spinnings ON spinnings.spinner_id = users.id
      INNER JOIN items ON items.material_id = spinnings.id
      INNER JOIN users AS knitters ON knitters.id = items.knitter_id
      WHERE knitters.name = 'Bob'
      UNION
  12. SQL Query
      -- Dyer
      SELECT users.* FROM users
      INNER JOIN dyeings ON dyeings.dyer_id = users.id
      INNER JOIN spinnings ON spinnings.material_id = dyeings.id
      INNER JOIN items ON items.material_id = spinnings.id
      INNER JOIN users AS knitters ON knitters.id = items.knitter_id
      WHERE knitters.name = 'Bob'
      UNION
  13. SQL Query
      -- Carder
      SELECT users.* FROM users
      INNER JOIN cardings ON cardings.carder_id = users.id
      INNER JOIN dyeings ON dyeings.material_id = cardings.id
      INNER JOIN spinnings ON spinnings.material_id = dyeings.id
      INNER JOIN items ON items.material_id = spinnings.id
      INNER JOIN users AS knitters ON knitters.id = items.knitter_id
      WHERE knitters.name = 'Bob'
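
The three SELECTs on slides 11 to 13 form one UNION query. The sketch below runs it against an in-memory SQLite database to show what it returns. The table definitions are an assumption inferred from the column names in the query, and every row of sample data (Carla, Dee, Sam, Zed) is invented; the deck does not show its actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed schema: each production step points at the previous step's row
# through material_id, matching the join conditions on the slides.
conn.executescript("""
CREATE TABLE users     (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE cardings  (id INTEGER PRIMARY KEY, carder_id INTEGER);
CREATE TABLE dyeings   (id INTEGER PRIMARY KEY, dyer_id INTEGER, material_id INTEGER);
CREATE TABLE spinnings (id INTEGER PRIMARY KEY, spinner_id INTEGER, material_id INTEGER);
CREATE TABLE items     (id INTEGER PRIMARY KEY, knitter_id INTEGER, material_id INTEGER);

-- Hypothetical data: Carla cards, Dee dyes, Sam spins, Bob knits. Zed is uninvolved.
INSERT INTO users VALUES (1, 'Bob'), (2, 'Carla'), (3, 'Dee'), (4, 'Sam'), (5, 'Zed');
INSERT INTO cardings  VALUES (10, 2);
INSERT INTO dyeings   VALUES (20, 3, 10);
INSERT INTO spinnings VALUES (30, 4, 20);
INSERT INTO items     VALUES (40, 1, 30);
""")

QUERY = """
SELECT users.* FROM users
INNER JOIN spinnings ON spinnings.spinner_id = users.id
INNER JOIN items ON items.material_id = spinnings.id
INNER JOIN users AS knitters ON knitters.id = items.knitter_id
WHERE knitters.name = 'Bob'
UNION
SELECT users.* FROM users
INNER JOIN dyeings ON dyeings.dyer_id = users.id
INNER JOIN spinnings ON spinnings.material_id = dyeings.id
INNER JOIN items ON items.material_id = spinnings.id
INNER JOIN users AS knitters ON knitters.id = items.knitter_id
WHERE knitters.name = 'Bob'
UNION
SELECT users.* FROM users
INNER JOIN cardings ON cardings.carder_id = users.id
INNER JOIN dyeings ON dyeings.material_id = cardings.id
INNER JOIN spinnings ON spinnings.material_id = dyeings.id
INNER JOIN items ON items.material_id = spinnings.id
INNER JOIN users AS knitters ON knitters.id = items.knitter_id
WHERE knitters.name = 'Bob'
"""

# Each row is (id, name); keep the names, sorted for a stable order.
producers = sorted(name for _id, name in conn.execute(QUERY))
```

Each extra production step adds one more SELECT and one more join to every SELECT after it, which is the growth the deck is pointing at.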
  14. Cypher Query
      MATCH (p:Person)-[:OWNS]->(i:Item)-[*]-(q:Person)
      WHERE p.name = 'Bob' AND i.name = 'socks'
      RETURN q
      Callouts: Variable • Node • Relationship • Variable length path
  15. Cypher Query
      MATCH (p:Person)-[:OWNS]->(i:Item)-[*]-(q:Person)
      WHERE p.name = 'Bob' AND i.name = 'socks'
      RETURN q
      Callouts: Variable • Node • Relationship • Variable length path • Type
  16. Cypher Query
      MATCH (p:Person)-[:OWNS]->(i:Item)-[*]-(q:Person)
      WHERE p.name = 'Bob' AND i.name = 'socks'
      RETURN q
      Callouts: Variable • Node • Relationship • Variable length path • Type • Directed Relationship • Relationship
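
As a rough analogue of what the variable-length pattern `-[*]-` asks for, the sketch below walks outward from the item in either direction and collects every person it reaches. It is a plain breadth-first search over an invented in-memory graph, not a description of how Neo4j executes Cypher, and it simplifies Cypher's matching rules (for instance, it never returns the starting owner). All node names, relationship types, and edge directions are assumptions.

```python
from collections import deque

# (start, relationship type, end) triples; all of this data is hypothetical.
EDGES = [
    ("Bob", "OWNS", "socks"),
    ("Sam", "SPINS", "spinning"),
    ("spinning", "MATERIAL", "socks"),
    ("Dee", "DYES", "dyeing"),
    ("dyeing", "MATERIAL", "spinning"),
    ("Carla", "CARDS", "carding"),
    ("carding", "MATERIAL", "dyeing"),
    ("Zed", "OWNS", "hat"),
]
PEOPLE = {"Bob", "Sam", "Dee", "Carla", "Zed"}  # nodes labelled Person


def people_connected_to(owner, item):
    """People reachable from `item` by any undirected path, given owner-OWNS->item."""
    if (owner, "OWNS", item) not in EDGES:
        return set()

    # Ignore direction when walking, as the undirected -[*]- pattern does.
    adjacency = {}
    for start, _type, end in EDGES:
        adjacency.setdefault(start, set()).add(end)
        adjacency.setdefault(end, set()).add(start)

    seen = {owner, item}  # simplification: never walk back to the owner
    queue = deque([item])
    found = set()
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour in seen:
                continue
            seen.add(neighbour)
            if neighbour in PEOPLE:
                found.add(neighbour)
            queue.append(neighbour)
    return found


involved = people_connected_to("Bob", "socks")
```

The search only touches nodes connected to Bob's socks; Zed and the hat are never visited, which is the "queries are localized to a portion of the graph" point from slide 5.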