Slide 1

Slide 1 text

(review of graph databases) ( )-->( )<--( )
Graph Recap -> Types of Graph Solutions -> Query Languages -> Graphs in AWS

Slide 2

Slide 2 text

Review of Graph Databases, Arturas Smorgun, 2018 G = (V, E)

Slide 3

Slide 3 text

Vertices G = (V, E) [figure: a single vertex labeled V]

Slide 4

Slide 4 text

Vertices G = (V, E) [figure: six vertices, each labeled V]

Slide 5

Slide 5 text

Edges G = (V, E) [figure: six vertices labeled V, one edge labeled E]

Slide 6

Slide 6 text

Edges G = (V, E) [figure: six vertices labeled V, five edges labeled E]
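The definition G = (V, E) can be sketched directly as two sets. This is only an illustration: the vertex names and the `neighbours` helper are made up for the example and do not come from the slides.

```python
# A graph G = (V, E): a set of vertices plus a set of edges between them.
vertices = {"a", "b", "c", "d", "e", "f"}  # V (hypothetical names)
edges = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "e"), ("e", "f")}  # E, directed pairs


def neighbours(vertex, edges):
    """Return the vertices reachable from `vertex` over one outgoing edge."""
    return {dst for src, dst in edges if src == vertex}


# Sanity check: every edge connects two vertices that are actually in V.
assert all(src in vertices and dst in vertices for src, dst in edges)

print(sorted(neighbours("a", edges)))
```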

Slide 7

Slide 7 text


Slide 8

Slide 8 text

Domain: encyclopedia
• Vertices:
  • Author
  • Page
• Edges:
  • Page relates to other page
  • Author contributes to a page (authors, edits, reviews)

Slide 9

Slide 9 text

Encyclopedia: model [figure: Author and Page vertices with edges labeled Contributes, Likes, and Links to]

Slide 10

Slide 10 text

Encyclopedia: graph

Slide 11

Slide 11 text

Graph Algorithms FTW!!

Slide 12

Slide 12 text

Graph Algorithms FTW!!
• Breadth First Search
• Depth First Search
• Shortest Path
• Minimum Spanning Tree
• Maximum Flow
• Connectivity
• …
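As an illustration of the first and third items on the list, here is a minimal breadth-first search sketch that also yields a shortest path in an unweighted graph. The adjacency list uses made-up data in the encyclopedia domain, and the function names `bfs_order` and `shortest_path` are assumptions for this example.

```python
from collections import deque

# Directed adjacency list (hypothetical data in the encyclopedia domain).
graph = {
    "David Smith": ["City", "Owl", "Some"],
    "Jack Jones": ["Owl", "Branch", "City"],
    "James James": ["Branch"],
    "Branch": ["Some"],
    "City": [],
    "Owl": [],
    "Some": [],
}


def bfs_order(graph, start):
    """Return vertices in the order breadth-first search discovers them."""
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in graph[current]:
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


def shortest_path(graph, start, goal):
    """Return a fewest-edges path from start to goal, or None if unreachable."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:  # walk parent links back to the start
                path.append(current)
                current = parents[current]
            return path[::-1]
        for nxt in graph[current]:
            if nxt not in parents:
                parents[nxt] = current
                queue.append(nxt)
    return None


print(bfs_order(graph, "Jack Jones"))
print(shortest_path(graph, "James James", "Some"))
```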

Slide 13

Slide 13 text

Example: PageRank
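The slide only names PageRank, so the following is a generic power-iteration sketch rather than whatever the talk demonstrated. The damping factor of 0.85, the iteration count, and the link data are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    `links` maps each page to the list of pages it links to; every link
    target must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link structure between encyclopedia pages.
links = {
    "City": ["Owl"],
    "Owl": ["City", "Branch"],
    "Branch": ["City"],
    "Some": ["City"],
}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{page}: {ranks[page]:.3f}")
```

In this made-up data "City" receives the most inbound links, so it ends up with the highest rank.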

Slide 14

Slide 14 text

Types of Graph Solutions
Graph Recap -> Types of Graph Solutions -> Query Languages -> Graphs in AWS

Slide 15

Slide 15 text

Resource Description Framework (RDF)
• W3C Recommendation
• Open Source
• since 1999
• Information Exchange
• Semantic Web

Slide 16

Slide 16 text

Labeled Property Graph (LPG)
• Not sure who was first
• Some open source
• Some proprietary
• Emphasis on querying and traversal

Slide 17

Slide 17 text

RDF
• Vertex = Resource
• Edge = Relationship
• Qualify relationships: NO
• Identify relationship: NO
• Triple: subject, predicate, object
• Data mobility: YES
• Verifiability: YES

LPG
• Vertex = Node
• Edge = Relationship
• Qualify relationships: YES
• Identify relationship: YES
• No strictly defined structure
• Data mobility: Maybe?
• Verifiability: No?
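The "qualify" and "identify" rows can be made concrete with a rough in-memory sketch of both models. This is not how any real store is implemented; the `ex:` identifiers, the `year` property, and the `match` helper are hypothetical, and plain triples are assumed (no reification).

```python
# RDF style: the whole graph is a set of (subject, predicate, object) triples.
triples = {
    ("ex:author1", "ex:name", "David Smith"),
    ("ex:page1", "ex:name", "City"),
    ("ex:author1", "ex:authored", "ex:page1"),
}


def match(triples, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if subject in (None, t[0])
        and predicate in (None, t[1])
        and obj in (None, t[2])
    )


# LPG style: nodes and relationships each have an identity and properties.
nodes = {
    "a1": {"label": "AUTHOR", "properties": {"name": "David Smith"}},
    "p1": {"label": "PAGE", "properties": {"name": "City"}},
}
relationships = {
    # Two distinct AUTHORED relationships between the same pair of nodes,
    # each qualified with its own (hypothetical) `year` property.
    "r1": {"type": "AUTHORED", "start": "a1", "end": "p1", "properties": {"year": 2018}},
    "r2": {"type": "AUTHORED", "start": "a1", "end": "p1", "properties": {"year": 2019}},
}

authored = match(triples, predicate="ex:authored")
years = sorted(r["properties"]["year"] for r in relationships.values())
print(authored)
print(years)
```

Adding the same triple to the set a second time changes nothing, which is one way to read "Identify relationship: NO"; the LPG relationships stay distinct because each has its own key.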

Slide 18

Slide 18 text

RDF LPG

Slide 19

Slide 19 text

Graph DB vs Relational DB

Graph DB
• Good for recursive loose schema
• Vertices for entities
• Edges for relationships
• Easy to add new entity types
• Easy to add new relationships
• Easy to traverse relationships

Relational DB
• Good for flat strict schema
• Tables for entities
• Tables or FK for relationships
• Easy to add new entity types
• Easy to add new relationships
• Hard to traverse relationships
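To give a feel for the "hard to traverse relationships" point, here is a sketch of a three-hop traversal in a relational store, using Python's built-in `sqlite3` module and a recursive common table expression. The table names, column names, and rows are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE page (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE related_to (
        src INTEGER REFERENCES page(id),
        dst INTEGER REFERENCES page(id)
    );
    INSERT INTO page VALUES (1, 'Branch'), (2, 'Some'), (3, 'Some More'), (4, 'City');
    INSERT INTO related_to VALUES (1, 2), (2, 3);
    """
)

# Pages reachable from a starting page in 1 to 3 hops along related_to.
query = """
WITH RECURSIVE reachable(id, depth) AS (
    SELECT dst, 1
    FROM related_to
    WHERE src = (SELECT id FROM page WHERE name = ?)
    UNION
    SELECT r.dst, reachable.depth + 1
    FROM related_to AS r
    JOIN reachable ON r.src = reachable.id
    WHERE reachable.depth < 3
)
SELECT page.name, MIN(reachable.depth) AS hops
FROM reachable
JOIN page ON page.id = reachable.id
GROUP BY page.name
ORDER BY hops, page.name
"""
rows = conn.execute(query, ("Branch",)).fetchall()
print(rows)
```

The traversal is possible, but the recursion, the depth guard, and the join back to `page` all have to be spelled out by hand.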

Slide 20

Slide 20 text

Graph DB vs Document DB

Graph DB
• Loose schema with nested relations
• Vertices for entities
• Edges for relationships
• Easy to add new entity types
• Easy to add new relationships
• Easy to traverse relationships

Document DB
• Loose schema with duplications
• Document with all relations
• Document per use case
• Easy to add new entity types
• Hard to add new relationships
• Hard to traverse relationships

Slide 21

Slide 21 text

Query Languages
Graph Recap -> Types of Graph Solutions -> Query Languages -> Graphs in AWS

Slide 22

Slide 22 text

SPARQL
• For RDF Graphs
• W3C Recommendation
• widely used
• not XML, yay (though RDF documents can be XML)

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name
    WHERE { ?person foaf:name ?name . }

Slide 23

Slide 23 text

Cypher
• For Labeled Property Graphs
• created by Neo4j
• open sourced and widely available now
• “SQL of Graph Databases” (striving to be)

    MATCH (node1:Label1)-[rel]->(node2:Label2)
    WHERE node1.propertyA = {value}
    RETURN node2.propertyA, node2.propertyB

Slide 24

Slide 24 text

Gremlin
• For Labeled Property Graphs
• Apache Project
• open sourced and used for many different databases and processors
• very similar to Cypher, but I found its tooling and documentation poorer

    g.V().has("name","gremlin").
      out("knows").
      out("knows").
      values("name")
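For readers new to Gremlin, the traversal above reads as "names of the people known by the people that gremlin knows". The following plain-Python sketch mimics those steps on made-up data; it is not the Gremlin API, and the `out` helper and all vertex data are hypothetical.

```python
# Vertices keyed by id, each with a property map (hypothetical data).
vertices = {
    1: {"name": "gremlin"},
    2: {"name": "alice"},
    3: {"name": "bob"},
    4: {"name": "carol"},
}
# Outgoing "knows" edges per vertex id.
knows = {1: [2], 2: [3, 4], 3: [], 4: []}


def out(vertex_ids, adjacency):
    """Follow outgoing edges from every vertex in the current step."""
    return [dst for vid in vertex_ids for dst in adjacency.get(vid, [])]


# has("name", "gremlin"): pick the starting vertices.
start = [vid for vid, props in vertices.items() if props.get("name") == "gremlin"]
# out("knows").out("knows"): two hops along "knows" edges.
friends_of_friends = out(out(start, knows), knows)
# values("name"): project the name property.
names = [vertices[vid]["name"] for vid in friends_of_friends]
print(names)
```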

Slide 25

Slide 25 text

GraphQL
• NOT a graph DB query language
• started by Facebook
• aimed at API communication
• very useful to query existing data

    { hero { name } }

Slide 26

Slide 26 text

SPARQL vs Cypher

Slide 27

Slide 27 text

Encyclopedia: graph

Slide 28

Slide 28 text

Insert in SPARQL (RDF XML) vs Cypher

SPARQL (RDF XML): [listing lost in extraction; surviving values: David Smith, Jack Jones, James James, City, Owl, Branch, Some]

Cypher:
    CREATE (a1:AUTHOR {name: "David Smith"})
    CREATE (a2:AUTHOR {name: "Jack Jones"})
    CREATE (a3:AUTHOR {name: "James James"})
    CREATE (p1:PAGE {name: "City"})
    CREATE (p2:PAGE {name: "Owl"})
    CREATE (p3:PAGE {name: "Branch"})
    CREATE (p4:PAGE {name: "Some"})
    CREATE (a1)-[:AUTHORED]->(p1)
    CREATE (a1)-[:AUTHORED]->(p2)
    CREATE (a1)-[:AUTHORED]->(p4)
    CREATE (a2)-[:EDITED]->(p2)
    CREATE (a2)-[:AUTHORED]->(p3)
    CREATE (a2)-[:REVIEWED]->(p1)
    CREATE (a3)-[:EDITED]->(p3)
    CREATE (p3)-[:RELATED_TO]->(p4)

Slide 29

Slide 29 text

Insert in SPARQL (Turtle) vs Cypher

SPARQL (Turtle) [prefix and resource IRIs lost in extraction]:
    PREFIX x:
    INSERT DATA {
      x:name: "David Smith" x:authored: x:authored: x:authored:
      x:name: "Jack Jones" x:authored: x:edited: x:reviewed:
      x:name: "James James" x:authored:
      x:name: "City"
      x:name: "Owl"
      x:name: "Branch" x:related_to:
      x:name: "Some"
    }

Cypher:
    CREATE (a1:AUTHOR {name: "David Smith"})
    CREATE (a2:AUTHOR {name: "Jack Jones"})
    CREATE (a3:AUTHOR {name: "James James"})
    CREATE (p1:PAGE {name: "City"})
    CREATE (p2:PAGE {name: "Owl"})
    CREATE (p3:PAGE {name: "Branch"})
    CREATE (p4:PAGE {name: "Some"})
    CREATE (a1)-[:AUTHORED]->(p1)
    CREATE (a1)-[:AUTHORED]->(p2)
    CREATE (a1)-[:AUTHORED]->(p4)
    CREATE (a2)-[:EDITED]->(p2)
    CREATE (a2)-[:AUTHORED]->(p3)
    CREATE (a2)-[:REVIEWED]->(p1)
    CREATE (a3)-[:EDITED]->(p3)
    CREATE (p3)-[:RELATED_TO]->(p4)

Slide 30

Slide 30 text

Select in SPARQL vs Cypher

SPARQL [prefix IRI lost in extraction]:
    # select all
    PREFIX x:
    SELECT DISTINCT ?g WHERE { GRAPH ?g { ?s ?p ?o } }

    # select ordered authors
    PREFIX x:
    SELECT DISTINCT ?name WHERE { GRAPH ?g { ?s x:authored ?o . ?s x:name ?name . } }
    ORDER BY ?name

Cypher:
    // select all
    MATCH (n1)-[r]->(n2)
    RETURN r, n1, n2

    // select ordered authors
    MATCH (author)-[r:AUTHORED]->(page)
    RETURN author, r, page
    ORDER BY author.name

Slide 31

Slide 31 text

Update in SPARQL vs Cypher

SPARQL [prefix and subject IRIs lost in extraction]:
    PREFIX x:
    DELETE { x:name ?o }
    INSERT { x:name "John Smith" }
    WHERE { x:name ?o }

Cypher:
    MATCH (a:AUTHOR {name: "John Smith"})
    SET a.name = "David Smith"
    RETURN a

Slide 32

Slide 32 text

Select all related pages in SPARQL vs Cypher

SPARQL:
    // not sure …

Cypher:
    // update graph
    MATCH (p:PAGE {name: "Some"})
    CREATE (p5:PAGE {name: "Some More"})
    CREATE (p5)-[:RELATED_TO]->(p)

    // select
    MATCH (a:AUTHOR {name: "Jack Jones"})-[:AUTHORED]->(ap:PAGE),
          (ap)-[:RELATED_TO*1..3]-(other:PAGE)
    RETURN a, ap, other

Slide 33

Slide 33 text

Ready solutions on AWS
Graph Recap -> Types of Graph Solutions -> Query Languages -> Graphs in AWS

Slide 34

Slide 34 text

Amazon Neptune and Neo4j Enterprise

Slide 35

Slide 35 text

Amazon Neptune
• AWS Native Service
• Managed
• LPG & RDF
• Query languages: SPARQL, Gremlin (can do Cypher)
• ACID: YES
• Limitation: 64TB of data
• Proprietary and vendor locked (but compatible)
• No dev env (use compatible services)

Neo4j Enterprise
• AWS Marketplace Solution
• Hosted (support can be bought)
• LPG (RDF via community plugin)
• Query language: Cypher
• ACID: YES?
• Limitation: 34.4B nodes
• Open source and graph native
• Has dev env (native or Docker)

Slide 36

Slide 36 text

HA: Amazon Neptune vs Neo4j Enterprise

Amazon Neptune
• Master-Slave
• Automatic backups to S3
• Replicated across AZs
• Failover in 30 seconds

Neo4j Enterprise
• Causal Graph

Slide 37

Slide 37 text

Internals: Amazon Neptune vs Neo4j Enterprise

Amazon Neptune
• ¯\_(ツ)_/¯

Neo4j Enterprise
• Graph Native

Slide 38

Slide 38 text

Cost: Amazon Neptune vs Neo4j Enterprise

Amazon Neptune
• Pay as you go, no upfront cost
• $0.30 to $5.50 per hour (compute)
• $0.10 per GB per month (storage)
• $0.20 per 1 million requests
• backups, replication, disaster recovery included

Neo4j Enterprise
• No upfront cost (check license)
• pay for compute resources used
• pay for storage used
• pay for network traffic
• commercial support is very expensive (reportedly ~$200k), and you need DevOps to maintain it
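A rough back-of-the-envelope estimate for the Neptune column might look like the sketch below. It uses only the 2018 list prices quoted on the slide; the 730-hour month, the function name, and the example workload are assumptions, and real bills depend on region, instance class, and current pricing.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month

STORAGE_PER_GB_MONTH = 0.10      # USD, from the slide
REQUESTS_PER_MILLION = 0.20      # USD, from the slide


def neptune_monthly_cost(hourly_rate, storage_gb, requests_millions, instances=1):
    """Estimate a monthly bill in USD from the slide's 2018 figures."""
    compute = hourly_rate * HOURS_PER_MONTH * instances
    storage = STORAGE_PER_GB_MONTH * storage_gb
    requests = REQUESTS_PER_MILLION * requests_millions
    return compute + storage + requests


# Hypothetical workload: one smallest instance ($0.30/h), 100 GB, 50M requests.
estimate = neptune_monthly_cost(0.30, storage_gb=100, requests_millions=50)
print(f"${estimate:.2f} per month")
```

With these assumed inputs, compute dominates: about $219 of the total comes from the instance, and $10 each from storage and requests.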

Slide 39

Slide 39 text

(CYPHER)-[:IS]->(GREAT)
Neo4j has awesome tools to start quickly
https://neo4j.com/download/
https://neo4j.com/sandbox-v2/