Slide 1

Slide 1 text

Photo by Shannon Potter on Unsplash Jennifer Reif Developer Advocate, Neo4j [email protected] @JMHReif github.com/JMHReif jmhreif.com linkedin.com/in/jmhreif Connect the Tables: How to Graph Relational Databases

Slide 2

Slide 2 text

Who is Jennifer Reif? Developer Advocate, Neo4j • Continuous learner • Conference speaker • Creator: Tech blogger + podcaster • Other: geek Jennifer Reif [email protected] @JMHReif github.com/JMHReif jmhreif.com linkedin.com/in/jmhreif

Slide 3

Slide 3 text

Data set

Slide 4

Slide 4 text

Kaggle - public data Co ff ee Shop https://www.kaggle.com/datasets/ylchang/co ff ee-shop-sample-data-1113 https://github.com/JMHReif/graphing-relational-dbs-data-import/tree/main Cleaned —>>

Slide 5

Slide 5 text

Demo 1 - Look at CSVs Live! • customer.csv • order.csv • product.csv • sales_outlet.csv • sta ff .csv • product_type.csv

Slide 6

Slide 6 text

Demo 1 - Model it Live! https://arrows.app

Slide 7

Slide 7 text

Relational

Slide 8

Slide 8 text

Data set sample (core entities) Order Product Customer Building Sta f

Slide 9

Slide 9 text

Data Normalization 3rd Normal Form (3NF) • Unique row, low data duplication • Primary key or composite key (where necessary) • Foreign keys to link tables • Many-to-many relationships -> JOIN tables • JOINs on query-time, from expert knowledge https://dzone.com/articles/do-we-need-data-normalization-anymore

Slide 10

Slide 10 text

Relational ERD Co ff ee Shop

Slide 11

Slide 11 text

Graph

Slide 12

Slide 12 text

What is a graph? Connected data • Store relationships -> separate entities • JOINs on write, not read • Flexible schema • Query language: Cypher

Slide 13

Slide 13 text

Still get core features No FOMO here :) • Physical storage • Transactional + analytical workloads • ACID transactions • Indexes / constraints • Security features • Cloud hosting • Free -> enterprise solutions

Slide 14

Slide 14 text

Graph components Graph theory foundations • Node (vertex) • Relationship (edge)

Slide 15

Slide 15 text

Nodes Graph components • Represent objects or entities • Can be labeled • May have properties Order Product Employee orderId: 162468 orderDate: 2019-04-01 productId: 08746589 name: “Ipoh Coffee” id: 247924 startDate: 2016-05-04 position: “Barista”

Slide 16

Slide 16 text

Relationships Graph components • Must have a type • Must have a direction • May have properties • Nodes can have multiple Order Product Employee orderId: 162468 orderDate: 2019-04-01 productId: 08746589 name: “Ipoh Coffee” id: 247924 startDate: 2016-05-04 position: “Barista” CONTAINS date: 2022-08-16 tip: 1.00 SOLD CONTACT CREATED

Slide 17

Slide 17 text

Label Graph components • A group of nodes • Like a category Person Employee Customer

Slide 18

Slide 18 text

Property Graph components • Key-value pair • Associated with node or relationship • Like a variable name and value Order Employee orderId: 162468 id: 247924 SOLD orderDate: 2019-04-01 date: 2022-08-16 tip: 1.00 startDate: 2016-05-04 position: “Barista”

Slide 19

Slide 19 text

Naming conventions Relational vs Graph • Recommendations vary! • Lowercase • Separator: _ • Avoid keywords • Capitalize SQL keywords • Might change with GQL standard • camelCase • UPPERCASE • Separator: _ • Avoid keywords • Capitalize Cypher keywords https://brainstation.io/learn/sql/naming-conventions https://neo4j.com/docs/cypher-manual/current/syntax/naming/ SQL Cypher

Slide 20

Slide 20 text

Demo 2 - Translate the model

Slide 21

Slide 21 text

Relational -> Graph Data model • Tables -> Labels • Row = Node • FK = Relationship • JOIN table = Relationship

Slide 22

Slide 22 text

• Tables = Labels • Row = Node • FK = Relationship • JOIN table = Relationship https://arrows.app

Slide 23

Slide 23 text

• Tables = Labels • Row = Node • FK = Relationship • JOIN table = Relationship

Slide 24

Slide 24 text

• Relationship direction • Type/Name • No right or wrong! • Sensible for your model

Slide 25

Slide 25 text

Modeling decisions

Slide 26

Slide 26 text

What to do… When things are not cut and dry? • Property on node or relationship? • Promo • Property or node? • Line item • More than 1 label? • Seasonal drink

Slide 27

Slide 27 text

- Every data modeler ever “It depends…”

Slide 28

Slide 28 text

Tips and Tricks General architecture • De fi ne use case -> • Know what problem you’re solving • Outline key queries -> • Optimize model to answer those • Limitations are di ff erent • More entities <> bad • Model can change as-needed

Slide 29

Slide 29 text

Tips and Tricks General practices • Frequent fi lter -> • Opt for separate node/rel • Props with few/unchanging values -> • Consider label • Dense nodes (hypernodes) -> • Consider more relationship types to fi lter

Slide 30

Slide 30 text

Data export / import

Slide 31

Slide 31 text

MariaDB Export / Import • SQL dump fi le (entire db) • SQL statement to CSV

Slide 32

Slide 32 text

Neo4j Export / Import • Cypher + APOC • ETL Tool • Dump fi le (beware)

Slide 33

Slide 33 text

Other options for both Export + Import + Sync • Driver • Messaging platform • Scheduled task

Slide 34

Slide 34 text

Demo 3 - Import to Neo4j

Slide 35

Slide 35 text

Setup Load components • Neo4j Docker container • Shell script -> cypher-shell • Cypher script • LOAD CSV

Slide 36

Slide 36 text

Draft data model = real-world model Recap!

Slide 37

Slide 37 text

Draft = Graph model Co ff ee Shop

Slide 38

Slide 38 text

All or nothing?

Slide 39

Slide 39 text

Polyglot Graphy data -> graph db • Store data with complicated JOINs in a graph • E.g. Customer -> Order -> Product • Eval: • JOINs • Frequency • Performance criticality

Slide 40

Slide 40 text

Cypher

Slide 41

Slide 41 text

Cypher • Functional and visual • Based on ASCII-art • Declarative query language • Focuses on what to retrieve • Not how A B LIKES MATCH (A)-[:LIKES]->(B)

Slide 42

Slide 42 text

Cypher: powerful and expressive Jennifer Neo4j WORKS_FOR CREATE (:Person { name: ‘Jennifer’}) -[:WORKS_FOR]-> (:Company { name: ‘Neo4j’}) NODE PROPERTY NODE PROPERTY LABEL LABEL

Slide 43

Slide 43 text

Cypher: read Jennifer Neo4j WORKS_FOR MATCH (:Person { name: ‘Jennifer’} ) -[:WORKS_FOR]-> ( whom ) RETURN whom

Slide 44

Slide 44 text

Translating queries • SELECT • Property list • FROM • JOIN ON • WHERE • GROUP BY • HAVING • MATCH • RETURN entities/props • Pattern (node)-[rel]-(node2) • -[var:REL_TYPE]-> • WHERE • Implicit in RETURN • WHERE on aggregations SQL Cypher

Slide 45

Slide 45 text

Demo 4 - SQL vs Cypher

Slide 46

Slide 46 text

Refactoring

Slide 47

Slide 47 text

Relational vs Graph Designed di ff erently • DDL / DML • Strict schema + data types • Structure-enforced • Data integrity • Storage/query optimization • Changes = dependency impacts • Schema- fl exible • Restrictions set as-needed • Governance-enforced • Data integrity • Storage/query optimization • Changes = less cascading impacts Relational Graph

Slide 48

Slide 48 text

This means… Refactoring for graphs • Streamlined • Faster • More responsive • Less invasive

Slide 49

Slide 49 text

Tools For refactoring • Cypher + APOC • Neo4j Migrations (https://neo4j.com/labs/neo4j-migrations/) • 3rd party tools (Kettle, etc)

Slide 50

Slide 50 text

But wait…there’s more!

Slide 51

Slide 51 text

No content

Slide 52

Slide 52 text

https://dev.neo4j.com/nodes24

Slide 53

Slide 53 text

Resources Questions? • Github repo: github.com/JMHReif/connect-the-tables • Modeling (Arrows): arrows.app • SQL / Cypher: neo4j.com/docs/getting-started/cypher-intro/cypher-sql/ • GraphAcademy: graphacademy.neo4j.com/ Jennifer Reif Developer Advocate, Neo4j [email protected] @JMHReif github.com/JMHReif jmhreif.com linkedin.com/in/jmhreif