Slide 1

Slide 1 text

Discovering the power of graph databases with Python and Neo4j
Fabio Lamanna
@fblamanna
April 20th, 2018

Slide 2

Slide 2 text

Outline
who am I?
databases + graphs
Neo4j
Neo4j + Python Interactions

Slide 3

Slide 3 text

who am I?

Slide 4

Slide 4 text

from py2neo import Graph

def sponsoringPyCon():
    graph = Graph()
    return graph.data("MATCH (:LARUS)-[:SUPPORTS]->(:PyCon9) RETURN *")

Slide 5

Slide 5 text

databases

Slide 6

Slide 6 text

relational (?) databases

Slide 7

Slide 7 text

ways to store data
relational
NoSQL: {key:value}, column, document
NoSQL (graph)

Slide 8

Slide 8 text

graph databases
relational
NoSQL: {key:value}, column, document
NoSQL (graph)

Slide 9

Slide 9 text

graph databases
this is where connections start to matter
NoSQL (graph)

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

Neo4j Neo4j reveals connections in your data

Slide 12

Slide 12 text

platform for connected data

Slide 13

Slide 13 text

glue for data

Slide 14

Slide 14 text

complex networks laws

Slide 15

Slide 15 text

matching patterns of information

Slide 16

Slide 16 text

is pretty fast!

Slide 17

Slide 17 text

Cypher

Slide 18

Slide 18 text

easy to build data models
Nodes: (:Person {name:"Fabio", twitter:"@fblamanna"}), (:Event {name:"PyCon9", type:"Conference"}), (:City {name:"Firenze"})
Relationships: :LOVES, :TALKS_AT, :LOCATED_IN
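The data model on this slide can be mirrored with a few plain Python objects. This is only a minimal sketch of a property graph (labelled nodes with properties, typed relationships); the `Node`, `Relationship` and `make_node` names are hypothetical, and the direction of `LOCATED_IN` (event to city) is an assumption, since the slide text does not show which nodes each relationship connects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    label: str
    properties: tuple  # sorted (key, value) pairs, so the node stays hashable


@dataclass(frozen=True)
class Relationship:
    start: Node
    rel_type: str
    end: Node


def make_node(label, **properties):
    """Build a node from a label and keyword properties."""
    return Node(label, tuple(sorted(properties.items())))


fabio = make_node("Person", name="Fabio", twitter="@fblamanna")
pycon = make_node("Event", name="PyCon9", type="Conference")
firenze = make_node("City", name="Firenze")

graph = [
    Relationship(fabio, "LOVES", firenze),
    Relationship(fabio, "TALKS_AT", pycon),
    Relationship(pycon, "LOCATED_IN", firenze),  # assumed direction
]
```

Each `Relationship` reads like the Cypher pattern `(start)-[:TYPE]->(end)`.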

Slide 19

Slide 19 text

Cypher

CREATE (:Person {name:"Fabio", twitter:…})-[:LOVES]->(:City {name:"Firenze"})

Slide annotations: each parenthesised element is a NODE with a LABEL (:Person, :City) and PROPERTIES ({name:…}); the arrow is a RELATIONSHIP with a TYPE (:LOVES).

Slide 20

Slide 20 text

some applications

Slide 21

Slide 21 text

recommendations

MATCH (you)-[:BOUGHT]->(something)<-[:BOUGHT]-(other)-[:BOUGHT]->(reco)
WHERE you.name = "Fabio"
RETURN reco

Slide 22

Slide 22 text

paradise papers

MATCH p=(o:Officer {name: "The Duchy of Lancaster"})-[*..2]-()
RETURN p

Queen Elizabeth II's private estate and portfolio
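The `-[*..2]-` part of the query asks for everything at most two relationships away, in either direction. A rough Python equivalent is a breadth-first search with a depth limit. This sketch returns the reachable nodes and their distance rather than the full paths that Cypher returns, and the edge names are made up for illustration.

```python
from collections import deque


def within_hops(edges, start, max_hops=2):
    """Return {node: distance} for nodes reachable in at most max_hops."""
    # The pattern has no arrow head, so direction is ignored.
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    return {n: d for n, d in seen.items() if n != start}


# Hypothetical edges, not taken from the Paradise Papers data.
edges = [("officer", "entity_a"), ("entity_a", "entity_b"), ("entity_b", "entity_c")]
nearby = within_hops(edges, "officer")
```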

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

Neo4j + Python interactions Official Neo4j Python Bolt Driver

Slide 25

Slide 25 text

Neo4j + Python interactions
Official Neo4j Python Bolt Driver

from neo4j.v1 import GraphDatabase, basic_auth

driver = GraphDatabase.driver(
    "bolt://localhost:7687",
    auth=basic_auth("user", "pass")
)

with driver.session() as session:
    session.run(
        'CREATE (:Person {name:"Fabio"})-[:TALKS_AT]->(:Event {name:"PyCon"})'
    )

Slide 26

Slide 26 text

Neo4j + Python interactions
Official Neo4j Python Bolt Driver

import pandas as pd
from neo4j.v1 import GraphDatabase, basic_auth

driver = GraphDatabase.driver(
    "bolt://localhost:7687",
    auth=basic_auth("user", "pass")
)

query = """
MATCH (:Person {name:"Fabio"})-[:TALKS_AT]->(event)
RETURN event
"""

with driver.session() as session:
    # Each record becomes one row of the DataFrame
    df = pd.DataFrame([dict(record) for record in session.run(query)])
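The record-to-dict step can be tried without a running database. In this sketch `fetch_rows` is a hypothetical helper and `FakeSession` is a stand-in that only imitates the `run()` call of a driver session; the query also passes the name as a `$name` parameter instead of a literal, which is an addition to what the slide shows.

```python
def fetch_rows(session, query, **params):
    """Run a query and return every record as a plain dict."""
    return [dict(record) for record in session.run(query, **params)]


class FakeSession:
    """Hypothetical stand-in for driver.session(), for offline experiments."""

    def __init__(self, rows):
        self._rows = rows

    def run(self, query, **params):
        # A real session would send the query and parameters to Neo4j.
        return iter(self._rows)


QUERY = (
    "MATCH (:Person {name: $name})-[:TALKS_AT]->(event) "
    "RETURN event.name AS event"
)

rows = fetch_rows(FakeSession([{"event": "PyCon9"}]), QUERY, name="Fabio")
```

With a real session, the resulting list of dicts can be handed to `pd.DataFrame(rows)` as on the slide.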

Slide 27

Slide 27 text

Data Import in Neo4j

Slide 28

Slide 28 text

Neo4j + Python interactions Data Import in Neo4j

Slide 29

Slide 29 text

Neo4j + Python interactions
Data Import in Neo4j (up to 10M records)

import pandas as pd

df = pd.read_csv("bad_file_to_import_in_neo4j.csv")

# Handle NaNs (replacement is a placeholder for a value of your choice)
df = df.fillna(replacement)

# Clean/replace bad chars in column names
df.columns = df.columns.str.replace(r"[^\w\s]", "", regex=True)  # no punctuation
df.columns = df.columns.str.replace(" ", "_")                    # no spaces

# Set Types - recommended
df["column"] = df["column"].astype("int64")

# Save
df.to_csv("good_file_to_import_in_neo4j.csv", index=False)
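The column-name cleanup can be written as a small standalone function. This is one possible rule set (strip punctuation, turn whitespace into underscores), not the one used in the talk; `clean_column_name` and `clean_columns` are hypothetical names.

```python
import re


def clean_column_name(name):
    """Return a header without punctuation or spaces."""
    name = name.strip()
    name = re.sub(r"[^\w\s]", "", name)  # drop punctuation
    name = re.sub(r"\s+", "_", name)     # whitespace runs become underscores
    return name


def clean_columns(names):
    return [clean_column_name(n) for n in names]


cleaned = clean_columns(["City Name", "twitter (handle)", " e-mail "])
```

With pandas the same helper could be applied as `df.columns = clean_columns(df.columns)`.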

Slide 30

Slide 30 text

Neo4j + Python interactions
Data Import in Neo4j (up to 10M records)
Cypher's LOAD CSV

name,twitter,cityname
fabio,@fblamanna,firenze
…

LOAD CSV WITH HEADERS FROM "file:///good_file_to_import_in_neo4j.csv" AS row
CREATE (a:Person {name:row.name, twitter:row.twitter})
CREATE (b:City {name:row.cityname})
CREATE (a)-[:LOVES]->(b)
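One thing to keep in mind with this statement: CREATE always adds a new node, so two rows with the same city produce two City nodes, whereas MERGE would reuse the existing one. The simulation below shows the difference on plain Python containers; the second CSV row and both function names are hypothetical.

```python
import csv
import io

# The second row is invented to show two people sharing a city.
SAMPLE = """name,twitter,cityname
fabio,@fblamanna,firenze
anna,@anna,firenze
"""


def load_with_create(text):
    """Mimic CREATE: every row adds a fresh City node."""
    persons, cities, loves = [], [], []
    for row in csv.DictReader(io.StringIO(text)):
        persons.append({"name": row["name"], "twitter": row["twitter"]})
        cities.append({"name": row["cityname"]})
        loves.append((len(persons) - 1, len(cities) - 1))
    return persons, cities, loves


def load_with_merge(text):
    """Mimic MERGE on City: reuse the node when the name already exists."""
    persons, cities, loves = [], {}, []
    for row in csv.DictReader(io.StringIO(text)):
        persons.append({"name": row["name"], "twitter": row["twitter"]})
        city_id = cities.setdefault(row["cityname"], len(cities))
        loves.append((len(persons) - 1, city_id))
    return persons, cities, loves


created = load_with_create(SAMPLE)
merged = load_with_merge(SAMPLE)
```

After loading, `created` holds two separate "firenze" cities while `merged` holds one that both people point to.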

Slide 31

Slide 31 text

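Neo4j + Python interactions
Data Import in Neo4j (range 100B records)

import pandas as pd

df = pd.read_csv("nodes_bulk_loader.csv")

# Handle NaNs (replacement is a placeholder; fillna needs a value)
df = df.fillna(replacement)

# Format headers for Nodes (typed property columns take the form name:type)
df.columns = [":ID", ":int", …, ":float", ":LABEL"]

# Save
df.to_csv("nodes_bulk_loader.csv", index=False)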

Slide 32

Slide 32 text

Neo4j + Python interactions
Data Import in Neo4j (range 100B records)

import pandas as pd

df = pd.read_csv("rels_bulk_loader.csv")

# Handle NaNs (replacement is a placeholder; fillna needs a value)
df = df.fillna(replacement)

# Format headers for Relationships
df.columns = [":START_ID", ":END_ID", …, ":TYPE"]

# Save
df.to_csv("rels_bulk_loader.csv", index=False)

Slide 33

Slide 33 text

Neo4j + Python interactions
Data Import in Neo4j (range 100B records)

person_nodes_bulk_loader.csv
:ID(Person-ID),name,twitter,:LABEL
01,Fabio,@fblamanna,Person
…

city_nodes_bulk_loader.csv
:ID(City-ID),cityname,:LABEL
01,firenze,City
…

rels_bulk_loader.csv
:START_ID(Person-ID),:END_ID(City-ID),:TYPE
01,01,LOVES
…
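Files in this layout can be produced with the standard `csv` module as well as with pandas. The sketch below builds the three files as in-memory strings from the rows shown on the slide; `to_csv_text` is a hypothetical helper, and writing the text to disk is left out.

```python
import csv
import io


def to_csv_text(header, rows):
    """Serialise a header and rows to CSV text with Unix line endings."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


persons_csv = to_csv_text(
    [":ID(Person-ID)", "name", "twitter", ":LABEL"],
    [["01", "Fabio", "@fblamanna", "Person"]],
)
cities_csv = to_csv_text(
    [":ID(City-ID)", "cityname", ":LABEL"],
    [["01", "firenze", "City"]],
)
rels_csv = to_csv_text(
    [":START_ID(Person-ID)", ":END_ID(City-ID)", ":TYPE"],
    [["01", "01", "LOVES"]],
)
```

The ID spaces in parentheses (Person-ID, City-ID) let both files use "01" without clashing.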

Slide 34

Slide 34 text

Neo4j + Python interactions
Data Import in Neo4j (range 100B records)

neo4j_home$ bin/neo4j-admin import \
    --nodes person_nodes_bulk_loader.csv \
    --nodes city_nodes_bulk_loader.csv \
    --relationships rels_bulk_loader.csv


Slide 35

Slide 35 text

py2neo

Slide 36

Slide 36 text

Neo4j + Python interactions
py2neo

Toolkit to work with Neo4j within Python
Python 2.7-3.x
Neo4j 3.x
Supports pandas dataframe natively

$ pip install py2neo

Slide 37

Slide 37 text

Neo4j + Python interactions
py2neo

>>> from py2neo import Node, Relationship
>>> a = Node("Person", name="Fabio")
>>> b = Node("City", cityname="Firenze")
>>> c = Node("Event", name="PyCon9")
>>> ab = Relationship(a, "LOVES", b)
>>> ab
(Fabio)-[:LOVES]->(Firenze)
>>> ac = Relationship(a, "TALKS_AT", c)
>>> ac
(Fabio)-[:TALKS_AT]->(PyCon9)

Slide 38

Slide 38 text

Neo4j + Python interactions
py2neo - Subgraphs

# Union
>>> s = ab | ac
>>> s
{(fabio:Person {name:"Fabio"}),
 (firenze:City {name:"Firenze"}),
 (pycon9:Event {name:"PyCon9"}),
 (Fabio)-[:LOVES]->(Firenze),
 (Fabio)-[:TALKS_AT]->(PyCon9)}

# Intersection
>>> s = ab & ac
>>> s
{(fabio:Person {name:"Fabio"})}
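The union and intersection shown here behave like ordinary set operations applied to nodes and relationships separately. The sketch below models that idea with plain Python sets; it is not py2neo's implementation, and `subgraph`, `union` and `intersection` are hypothetical helpers.

```python
def subgraph(*relationships):
    """Collect the nodes and relationships of (start, type, end) triples."""
    nodes = set()
    for start, _, end in relationships:
        nodes.update((start, end))
    return {"nodes": nodes, "relationships": set(relationships)}


def union(a, b):
    return {
        "nodes": a["nodes"] | b["nodes"],
        "relationships": a["relationships"] | b["relationships"],
    }


def intersection(a, b):
    return {
        "nodes": a["nodes"] & b["nodes"],
        "relationships": a["relationships"] & b["relationships"],
    }


ab = subgraph(("Fabio", "LOVES", "Firenze"))
ac = subgraph(("Fabio", "TALKS_AT", "PyCon9"))
both = union(ab, ac)
common = intersection(ab, ac)
```

The two relationships share only the Fabio node, which is why the intersection keeps that node and no relationship.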

Slide 39

Slide 39 text

Neo4j + Python interactions
py2neo - Walkable Type

>>> w = ab + Relationship(b, "LOVES", c)
>>> w
(Fabio)-[:LOVES]->(Firenze)-[:LOVES]->(PyCon9)
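Concatenating two walks only makes sense when the first one ends where the second one starts. A simplified model of that rule, using tuples that alternate node, relationship type, node, could look like this; `concat` is a hypothetical helper and does not reproduce py2neo's actual behaviour in every case.

```python
def concat(walk_a, walk_b):
    """Join two walks that share an endpoint into one longer walk."""
    if walk_a[-1] != walk_b[0]:
        raise ValueError("walks do not share an endpoint")
    # Skip the first node of walk_b so the shared node appears once.
    return walk_a + walk_b[1:]


ab = ("Fabio", "LOVES", "Firenze")
bc = ("Firenze", "LOVES", "PyCon9")
w = concat(ab, bc)
```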

Slide 40

Slide 40 text

Neo4j + Python interactions
py2neo + DataFrame

>>> from py2neo import Graph
>>> from pandas import DataFrame
>>> graph = Graph(password="pycon")
>>> graph.data("MATCH (a:Person) RETURN a.name, a.twitter")
[{'a.twitter': '@fblamanna', 'a.name': 'Fabio'}]
>>> DataFrame(graph.data("MATCH (a:Person) RETURN a.name, a.twitter"))
  a.name   a.twitter
0  Fabio  @fblamanna

Slide 41

Slide 41 text

Pypher

Slide 42

Slide 42 text

Neo4j + Python interactions
Pypher

Express Cypher queries in pure Python!

$ pip install python_cypher

Slide 43

Slide 43 text

Neo4j + Python interactions
Pypher - Cypher, but in Python

from pypher import Pypher

q = Pypher()
q.Match.node("a", labels="Person").WHERE.a.property("twitter") == "@fblamanna"
q.RETURN.a

MATCH (a:Person)
WHERE a.twitter = "@fblamanna"
RETURN a

Slide 44

Slide 44 text

Neo4j + Python interactions
From Python to Cypher, and back

from pypher import Pypher

p = Pypher()
p.Match.node('a').relationship('r').node('b').RETURN('a', 'b', 'r')

>>> print(p)
Cypher:
MATCH ('a')-['r']-('b') RETURN a, b, r
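The general idea of building a query through chained method calls can be shown with a toy class. This is not how Pypher is implemented and `QueryBuilder` is a made-up name; it only illustrates how each call appends a clause and returns the builder so calls can be chained.

```python
class QueryBuilder:
    """Toy fluent builder that assembles a Cypher string clause by clause."""

    def __init__(self):
        self._parts = []

    def match(self, pattern):
        self._parts.append("MATCH " + pattern)
        return self  # returning self is what allows chaining

    def where(self, condition):
        self._parts.append("WHERE " + condition)
        return self

    def returning(self, *names):
        self._parts.append("RETURN " + ", ".join(names))
        return self

    def __str__(self):
        return " ".join(self._parts)


q = QueryBuilder().match("(a:Person)").where("a.twitter = $twitter").returning("a")
cypher = str(q)
```

The `$twitter` placeholder is meant to be supplied as a query parameter when the string is run.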

Slide 45

Slide 45 text

Summing up
How to interact with Neo4j properly from Python
Data preparation
I/O operations
Official driver + packages (py2neo, Pypher and more to come…)
Discover the potential of Neo4j with your favourite snake!

Slide 46

Slide 46 text

(fabio)-[:THANKS]->(PyCon9) Fabio Lamanna
 @fblamanna