Slide 1

Slide 1 text

Get to Know the Real World… Discovering Connected Data with a Graph Database
Jennifer Reif, Neo4j
@JMHReif

Slide 2

Slide 2 text

Who Am I?
• Developer Relations Engineer for Neo4j
• Continuous learner
• Conference speaker
• Blogger
• Hobbies: cats, coffee, traveling
Email: [email protected]
Twitter: @JMHReif

Slide 3

Slide 3 text

What is a Graph Database?

Slide 4

Slide 4 text

Graph Chart

Slide 5

Slide 5 text

Database - specifically graph
• Database: a structured set of data held in a computer, especially one that is accessible in various ways.
• Relational? NoSQL? Graph?
• Graph database: uses graph structures for semantic queries with nodes, edges and properties to represent and store data.

Slide 6

Slide 6 text

Why would I choose graph?

Slide 7

Slide 7 text

The world is a graph – everything is connected
• people, places, events
• companies, markets
• countries, history, politics
• sciences, art, teaching
• technology, networks, machines, applications, users
• software, code, dependencies, architecture, deployments
• criminals, fraudsters, and their behavior

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

Relational Graph

Slide 10

Slide 10 text

Other NoSQL Graph

Slide 11

Slide 11 text

What is it used to accomplish?
Internal Applications
• Master Data Management
• Network and IT Operations
• Fraud Detection
Customer-Facing Applications
• Real-Time Recommendations
• Graph-Based Search
• Identity and Access Management

Slide 12

Slide 12 text

What is it used to accomplish?
Use Cases
• Social networks
• Impact analysis
• Logistics and routing
• Recommendations
• Access control
• Fraud analysis
• …and many, many more!

Slide 13

Slide 13 text

How do you do it?

Slide 14

Slide 14 text

Whiteboard Friendliness
Easy to design and model: the stored graph is a direct representation of the model drawn on the whiteboard.

Slide 15

Slide 15 text

Whiteboard friendliness

Slide 16

Slide 16 text

Whiteboard friendliness
Diagram:
• Tom Hanks ACTED_IN Cloud Atlas
• Hugo Weaving ACTED_IN The Matrix
• Hugo Weaving ACTED_IN Cloud Atlas
• Lana Wachowski DIRECTED The Matrix
• Lana Wachowski DIRECTED Cloud Atlas

Slide 17

Slide 17 text

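Whiteboard friendliness
Nodes:
• Movie: title: The Matrix, released: 1999
• Movie: title: Cloud Atlas, released: 2012
• Person, Actor: name: Tom Hanks, born: 1956
• Person, Director: name: Lana Wachowski, born: 1965
• Person, Actor: name: Hugo Weaving, born: 1960
Relationships:
• Tom Hanks ACTED_IN Cloud Atlas (roles: Zachry)
• Hugo Weaving ACTED_IN Cloud Atlas (roles: Bill Smoke)
• Hugo Weaving ACTED_IN The Matrix (roles: Agent Smith)
• Lana Wachowski DIRECTED The Matrix
• Lana Wachowski DIRECTED Cloud Atlas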
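As a rough sketch of how directly this whiteboard model can be written down, the nodes and relationships on the slide could be expressed in Cypher (the query language introduced on slide 24) roughly as follows. The variable names (matrix, atlas, tom, lana, hugo) are only illustrative, and roles is stored as a plain string here simply to mirror the slide.

```cypher
// Movie nodes with their properties
CREATE (matrix:Movie {title: "The Matrix", released: 1999}),
       (atlas:Movie {title: "Cloud Atlas", released: 2012}),
       // People carry two labels each, as drawn on the slide
       (tom:Person:Actor {name: "Tom Hanks", born: 1956}),
       (lana:Person:Director {name: "Lana Wachowski", born: 1965}),
       (hugo:Person:Actor {name: "Hugo Weaving", born: 1960}),
       // Relationships have a type, a direction, and optional properties
       (tom)-[:ACTED_IN {roles: "Zachry"}]->(atlas),
       (hugo)-[:ACTED_IN {roles: "Bill Smoke"}]->(atlas),
       (hugo)-[:ACTED_IN {roles: "Agent Smith"}]->(matrix),
       (lana)-[:DIRECTED]->(matrix),
       (lana)-[:DIRECTED]->(atlas)
```

Each line corresponds to one circle or arrow in the drawing, which is the sense in which the model is "whiteboard friendly".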

Slide 18

Slide 18 text

Whiteboard friendliness

Slide 19

Slide 19 text

Graph Data Model

Slide 20

Slide 20 text

Property Graph Data Model
• 2 Main Components:
  • Nodes
  • Relationships
• Additional Components:
  • Labels
  • Properties

Slide 21

Slide 21 text

Property Graph Data Model
• Nodes:
  • Represent the objects in the graph
  • Can be categorized using Labels
Diagram: three nodes labeled Car, Person, and Person

Slide 22

Slide 22 text

Property Graph Data Model
• Nodes:
  • Represent the objects in the graph
  • Can be categorized using Labels
• Relationships:
  • Relate nodes by type and direction
Diagram: two Person nodes and a Car node, connected by DRIVES, LOVES, LOVES, LIVES WITH, and OWNS relationships

Slide 23

Slide 23 text

Property Graph Data Model
• Nodes:
  • Represent the objects in the graph
  • Can be categorized using Labels
• Relationships:
  • Relate nodes by type and direction
• Properties:
  • Name-value pairs that can be applied to nodes or relationships
Diagram: two Person nodes and a Car node, connected by DRIVES, LOVES, LOVES, LIVES WITH, and OWNS relationships, now with properties:
• Person: name: "Dan", born: May 29, 1970, twitter: "@dan"
• Person: name: "Ann", born: Dec 5, 1975
• Car: brand: "Volvo", model: "V70"
• Relationship property: since: Jan 10, 2011

Slide 24

Slide 24 text

Cypher Query Language… SQL for graphs

Slide 25

Slide 25 text

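Cypher: Powerful and Expressive
CREATE (:Person {name: "Dan"})-[:LOVES]->(:Person {name: "Ann"})
Diagram: Dan LOVES Ann. Each parenthesized pattern is a NODE with a LABEL (Person) and a PROPERTY (name); the bracketed [:LOVES] is the relationship between them.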
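The same syntax scales to the whole Dan, Ann, and Car graph from slide 23. The following is a minimal sketch under some assumptions that the flattened slide text does not confirm: the direction of each relationship, the placement of the since property on DRIVES, the spelling LIVES_WITH (the slide shows "LIVES WITH"), and storing the dates with date(). The variable names dan, ann, and car are illustrative.

```cypher
CREATE (dan:Person {name: "Dan", born: date("1970-05-29"), twitter: "@dan"}),
       (ann:Person {name: "Ann", born: date("1975-12-05")}),
       (car:Car {brand: "Volvo", model: "V70"}),
       // Two LOVES relationships, one in each direction
       (dan)-[:LOVES]->(ann),
       (ann)-[:LOVES]->(dan),
       // Assumed direction: Dan lives with Ann
       (dan)-[:LIVES_WITH]->(ann),
       // Assumption: the since property belongs to DRIVES
       (dan)-[:DRIVES {since: date("2011-01-10")}]->(car),
       (ann)-[:OWNS]->(car)
```

Binding each node to a variable lets later parts of the same statement reuse it, so every node is created once and then connected.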

Slide 26

Slide 26 text

Cypher: Powerful and Expressive
MATCH (:Person {name: "Dan"})-[:LOVES]->(whom)
RETURN whom
Diagram: Dan LOVES Ann
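A few variations on this pattern may help show how the query shape follows the shape of the graph. This is only a sketch: the variable names admirer, a, and b are made up for illustration, and running several statements separated by semicolons assumes a client that accepts multiple statements, such as cypher-shell.

```cypher
// Return only the name property rather than the whole node
MATCH (:Person {name: "Dan"})-[:LOVES]->(whom)
RETURN whom.name AS name;

// Reverse the question: who loves Ann?
MATCH (admirer:Person)-[:LOVES]->(:Person {name: "Ann"})
RETURN admirer.name AS name;

// Pairs who love each other; the WHERE clause keeps each pair from appearing twice
MATCH (a:Person)-[:LOVES]->(b:Person)-[:LOVES]->(a)
WHERE a.name < b.name
RETURN a.name AS first, b.name AS second;
```

In each case the MATCH clause is a drawing of the subgraph being searched for, and RETURN picks out the parts of it to hand back.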

Slide 27

Slide 27 text

Importing Data to Neo4j

Slide 28

Slide 28 text

Options for Importing Data
• Cypher statements / script: write individual statements to load data manually.
• LOAD CSV: suited to small and medium data sets; imports local or online CSV files into the graph.
• ETL Tool: imports from a relational database and maps the relational data model to a graph.
• APOC: standard library that includes several import procedures for different data formats.
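To make the LOAD CSV option more concrete, here is a minimal sketch. The file name movies.csv and its column headers (movieId, title, released) are hypothetical, and the file is assumed to sit in the database's import directory.

```cypher
// Read the file row by row; WITH HEADERS exposes each row as a map keyed by column name
LOAD CSV WITH HEADERS FROM 'file:///movies.csv' AS row
// MERGE avoids creating a duplicate node if the movie is already in the graph
MERGE (m:Movie {movieId: toInteger(row.movieId)})
  // CSV values arrive as strings, so numeric columns are converted explicitly
  ON CREATE SET m.title = row.title,
                m.released = toInteger(row.released);
```

Swapping the file:/// URL for an http(s) URL covers the "online" case mentioned in the list.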

Slide 29

Slide 29 text

//Load Movie objects that are wanted
WITH 'https://api.themoviedb.org/3/search/movie?api_key=' + $apiKey + '&query=Lord%20of%20the%20Rings' AS url
CALL apoc.load.json(url) YIELD value
UNWIND value.results AS results
WITH results
MERGE (m:Movie {movieId: results.id})
  ON CREATE SET m.title = results.title,
                m.desc = results.overview,
                m.poster = results.poster_path,
                m.reviewStars = results.vote_average,
                m.reviews = results.vote_count
WITH results, m
CALL apoc.do.when(results.release_date = "",
  'SET m.releaseDate = null',
  'SET m.releaseDate = date(results.release_date)',
  {m: m, results: results}) YIELD value
RETURN m

Slide 30

Slide 30 text

//For Movie objects just loaded, pick out trilogy and retrieve cast of those movies
WITH 'https://api.themoviedb.org/3/movie/' AS prefix,
     '/credits?api_key=' + $apiKey AS suffix,
     ["The Lord of the Rings: The Fellowship of the Ring",
      "The Lord of the Rings: The Two Towers",
      "The Lord of the Rings: The Return of the King"] AS movies
CALL apoc.periodic.iterate(
  'MATCH (m:Movie) WHERE m.title IN $movies RETURN m',
  'WITH m
   CALL apoc.load.json($prefix+m.movieId+$suffix) YIELD value
   UNWIND value.cast AS cast
   MERGE (c:Cast {id: cast.id})
     ON CREATE SET c.name = cast.name
   MERGE (ch:Character {name: cast.character})
   MERGE (ch)-[r:APPEARS_IN]->(m)
   MERGE (c)-[r1:PLAYED]->(ch)',
  {batchSize: 1, iterateList: false, params: {movies: movies, prefix: prefix, suffix: suffix}});
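Once both import statements have run, a query along the following lines could be used to check what was loaded. It is a sketch that relies only on the labels, relationship types, and properties created above; the column aliases and the LIMIT of 25 are arbitrary choices.

```cypher
// Walk from cast member to character to movie, following the imported relationships
MATCH (c:Cast)-[:PLAYED]->(ch:Character)-[:APPEARS_IN]->(m:Movie)
WHERE m.title STARTS WITH "The Lord of the Rings"
RETURN m.title AS movie, ch.name AS character, c.name AS actor
ORDER BY movie, character
LIMIT 25;
```

Because Character nodes are merged on name alone, a character who appears in all three films is a single node with three APPEARS_IN relationships, so the same character and actor can show up once per movie in the results.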

Slide 31

Slide 31 text

Demo Time!!

Slide 32

Slide 32 text

Resources
• Neo4j download: https://neo4j.com/download/
• Neo4j sandbox: https://neo4j.com/sandbox-v2/
• Neo4j guides: https://neo4j.com/developer/get-started
• Cypher: https://neo4j.com/developer/cypher/
• LOAD CSV: https://neo4j.com/developer/guide-import-csv/
• APOC: https://neo4j-contrib.github.io/neo4j-apoc-procedures/
• Neo4j Certification: https://neo4j.com/graphacademy/neo4j-certification/
@JMHReif
[email protected]