Graph Databases From A Developer's Perspective

Mitch Gordon

August 25, 2013

Transcript

  1. What is a Graph Database?

    “A graph database is a storage engine which supports a graph data model backed by native graph persistence, with access and query methods for the graph primitives to make querying the stored data pleasant and performant.”
  2. NoSQL – Document Store

    » Fast reads
    » Scalable
    » Support for relations between documents is typically handled externally by map / reduce (Hadoop)
  3. NoSQL – Key Value

    » Very scalable and disaster-resistant (data duplicated across nodes)
    » Reads are by key only
    » Relationships between values are typically supported externally by map / reduce (Hadoop)
  4. NoSQL – Big Table

    » Based on subdivision of data, where rows contain columns which can contain further columns
    » Fast writes
    » Some relation support, primarily based on the location of data (making the layout of the data the key)
  5. Relational

    » Read speed depends upon the data schema and indexes (the schema must be designed in accordance with the query strategy)
    » Relationships are supported, but can come at a high cost (join pain)
  6. Relational – Example

    Bob’s friends:

      SELECT PersonFriend.friend
      FROM Person
      JOIN PersonFriend ON Person.person = PersonFriend.person
      WHERE Person.person = 'Bob'

    Alice’s friends-of-friends:

      SELECT pf1.person AS PERSON, pf3.person AS FRIEND_OF_FRIEND
      FROM PersonFriend pf1
      INNER JOIN Person ON pf1.person = Person.person
      INNER JOIN PersonFriend pf2 ON pf1.friend = pf2.person
      INNER JOIN PersonFriend pf3 ON pf2.friend = pf3.person
      WHERE pf1.person = 'Alice' AND pf3.person <> 'Alice'
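To make the join cost concrete, here is a minimal runnable sketch of the two lookups using Python's built-in sqlite3 module. The schema (a `Person` table and a `PersonFriend` table with `person` and `friend` columns) is inferred from the slide's queries, and the sample rows are invented for illustration. The friends-of-friends query is a simplified two-hop variant rather than the slide's exact statement: it self-joins `PersonFriend` once per extra hop and uses DISTINCT to drop duplicate paths.

```python
import sqlite3

# Hypothetical schema inferred from the slide's queries; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (person TEXT PRIMARY KEY);
    CREATE TABLE PersonFriend (person TEXT, friend TEXT);
""")
people = ["Alice", "Bob", "Carol", "Dave"]
friendships = [("Alice", "Bob"), ("Bob", "Carol"), ("Bob", "Dave"), ("Carol", "Alice")]
conn.executemany("INSERT INTO Person VALUES (?)", [(p,) for p in people])
conn.executemany("INSERT INTO PersonFriend VALUES (?, ?)", friendships)


def friends(name):
    """Direct friends: one join between Person and PersonFriend."""
    rows = conn.execute(
        """SELECT PersonFriend.friend
           FROM Person
           JOIN PersonFriend ON Person.person = PersonFriend.person
           WHERE Person.person = ?
           ORDER BY PersonFriend.friend""",
        (name,),
    )
    return [row[0] for row in rows]


def friends_of_friends(name):
    """Two hops: PersonFriend joined to itself, excluding the starting person."""
    rows = conn.execute(
        """SELECT DISTINCT pf2.friend
           FROM PersonFriend pf1
           JOIN PersonFriend pf2 ON pf1.friend = pf2.person
           WHERE pf1.person = ? AND pf2.friend <> ?
           ORDER BY pf2.friend""",
        (name, name),
    )
    return [row[0] for row in rows]


print(friends("Bob"))
print(friends_of_friends("Alice"))
```

Each additional level of depth needs another self-join on `PersonFriend`, which is the pattern the "join pain" bullet refers to.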
  7. Graph Database Performance vs. Relational

    Depth   RDBMS Execution Time (s)   Neo4j Execution Time (s)   Records Returned
    2       0.016                      0.01                       ~2,500
    3       30.267                     0.168                      ~110,000
    4       1543.505                   1.359                      ~600,000
    5       Unfinished                 2.132                      ~800,000

    * For a social network containing 1,000,000 people, each with approximately 50 friends.
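One common explanation for numbers like these is that a graph store follows relationships directly from each node instead of re-joining a whole table at every level. The sketch below illustrates that idea with a breadth-first traversal over an in-memory adjacency dictionary. It is not how Neo4j is implemented; the function name `reachable_within` and the sample graph are invented for illustration.

```python
from collections import deque


def reachable_within(graph, start, max_depth):
    """Return everyone reachable from `start` in at most `max_depth` hops."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        person, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand past the requested depth
        for friend in graph.get(person, ()):
            if friend not in seen:
                seen.add(friend)
                frontier.append((friend, depth + 1))
    seen.discard(start)
    return seen


# Hypothetical toy network: each key lists that person's friends.
graph = {
    "Alice": ["Bob"],
    "Bob": ["Carol", "Dave"],
    "Carol": ["Erin"],
    "Dave": [],
    "Erin": ["Alice"],
}

print(sorted(reachable_within(graph, "Alice", 2)))
```

The work done grows with the number of nodes and relationships actually visited, not with the total size of the data set, which is why going one level deeper only costs the extra neighbors touched.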
  8. Graph Databases fit when…

    » Data is connected by many relationships
    » Data is unbalanced (the full data set is not always available)
  9. Consider Graph Instead of Relational When…

    » The data schema is deeply hierarchical
    » Tables have more than a couple of foreign keys
    » You find yourself using more than one alias for the same table in a single query
    » You use more than 4 or 5 joins in a single query
  10. Trinity – Microsoft Research

    » Microsoft’s graph database implementation
    » Used as part of Bing’s infrastructure
    » http://research.microsoft.com/en-us/projects/trinity/
  11. Neo4j

    » Created by Neo Technology
    » Community Edition available for free
    » Available at www.neo4j.org and GitHub
    » Uses REST API or Java Native API
  12. Cypher

    Proprietary query language that can be used with Neo4j.

      START = {starting node set}
      MATCH {pattern}
      WHERE {conditions}
      RETURN {projection}
  13. Cypher Samples

    Mary bought what?

    Cypher:

      START mary=node:node_auto_index(FirstName = "Mary")
      MATCH (mary)-[:PURCHASED]->(orders)-[:CONTAINS]->(detail)-[:SPECIFIES]->(inv)
      RETURN inv.SKU, inv.Description, inv.Price, detail.Quantity

    SQL:

      SELECT inv.SKU, inv.Description, inv.Price, od.Quantity
      FROM Inventory inv
      JOIN orderdetail od ON od.InventoryId = inv.inventoryid
      JOIN [order] o ON o.orderid = od.orderid
      JOIN customer c ON c.customerId = o.customerid
      WHERE c.FirstName = 'Mary'
  14. Cypher Samples

    How many hammers has Mary bought?

    Cypher:

      START mary=node:node_auto_index(FirstName = "Mary")
      MATCH (mary)-[:PURCHASED]->(orders)-[:CONTAINS]->(detail)-[:SPECIFIES]->(inv)
      WHERE inv.SKU = "BR549"
      RETURN SUM(detail.Quantity)

    SQL:

      SELECT SUM(od.Quantity)
      FROM Inventory inv
      JOIN orderdetail od ON od.InventoryId = inv.inventoryid
      JOIN [order] o ON o.orderid = od.orderid
      JOIN customer c ON c.customerId = o.customerid
      WHERE c.FirstName = 'Mary' AND inv.SKU = 'BR549'
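To show what the MATCH pattern in the two samples is doing, here is a rough Python sketch that walks the same PURCHASED, CONTAINS, SPECIFIES path over a tiny in-memory graph. It only mimics the shape of the traversal and says nothing about how Neo4j executes Cypher. The node ids, the sample data, and the helper names `out`, `purchases`, and `quantity_bought` are all invented for illustration; only the relationship types, the property names, and the SKU "BR549" come from the slides.

```python
# Hypothetical toy data: node ids and property values are invented.
nodes = {
    "mary": {"FirstName": "Mary"},
    "order1": {},
    "order2": {},
    "d1": {"Quantity": 2},
    "d2": {"Quantity": 1},
    "d3": {"Quantity": 3},
    "hammer": {"SKU": "BR549", "Description": "Hammer"},
    "saw": {"SKU": "XY123", "Description": "Saw"},
}
# Directed, typed relationships as (source, type, destination) triples.
edges = [
    ("mary", "PURCHASED", "order1"),
    ("mary", "PURCHASED", "order2"),
    ("order1", "CONTAINS", "d1"),
    ("order1", "CONTAINS", "d2"),
    ("order2", "CONTAINS", "d3"),
    ("d1", "SPECIFIES", "hammer"),
    ("d2", "SPECIFIES", "saw"),
    ("d3", "SPECIFIES", "hammer"),
]


def out(node_id, rel_type):
    """Follow outgoing relationships of one type from a node."""
    return [dst for src, rel, dst in edges if src == node_id and rel == rel_type]


def purchases(first_name):
    """Yield (detail, inv) property dicts for every path matching the pattern."""
    starts = [n for n, props in nodes.items() if props.get("FirstName") == first_name]
    for customer in starts:
        for order in out(customer, "PURCHASED"):
            for detail in out(order, "CONTAINS"):
                for inv in out(detail, "SPECIFIES"):
                    yield nodes[detail], nodes[inv]


def quantity_bought(first_name, sku):
    """Rough equivalent of WHERE inv.SKU = sku RETURN SUM(detail.Quantity)."""
    return sum(d["Quantity"] for d, inv in purchases(first_name) if inv["SKU"] == sku)


print(quantity_bought("Mary", "BR549"))
```

The nested loops correspond one-to-one to the hops in the pattern, whereas the SQL version expresses the same path as three joins across four tables.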