Discovering the Power of Graph Databases with Python and Neo4j

Fabio Lamanna  @fblamanna April 20th, 2018

Outline
    who am I?
    databases + graphs
    Neo4j
    Neo4j + Python Interactions

who am I?

    from py2neo import Graph

    def sponsoringPyCon():
        graph = Graph()
        return graph.data("MATCH (:LARUS)-[:SUPPORTS]->(:PyCon9) RETURN *")

databases

relational (?) databases

ways to store data: relational; NoSQL ({key:value}, column, document); NoSQL (graph)

graph databases: relational; NoSQL ({key:value}, column, document); NoSQL (graph)

graph databases: NoSQL (graph), this is where connections start to matter

Neo4j Neo4j reveals connections in your data

platform for connected data

glue for data

complex networks laws

matching patterns of information

is pretty fast!

Cypher

easy to build data models
    (:Person {name:"Fabio", twitter:"@fblamanna"})
    (:Event {name:"PyCon9", type:"Conference"})
    (:City {name:"Firenze"})
    relationship types: :LOVES, :TALKS_AT, :LOCATED_IN

Cypher
    CREATE (:Person {name:"Fabio", twitter:…})-[:LOVES]->(:City {name:"Firenze"})
    (:Person {…}) and (:City {…}) are NODES, each with a LABEL and PROPERTIES
    -[:LOVES]-> is a RELATIONSHIP with a TYPE

some applications

recommendations
    MATCH (you)-[:BOUGHT]->(something)<-[:BOUGHT]-(other)-[:BOUGHT]->(reco)
    WHERE you.name = "Fabio"
    RETURN reco
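
The pattern in this query (people who bought what you bought, and what else they bought) can be sketched in plain Python over an in-memory mapping. This is only an illustration of the traversal, not how Neo4j evaluates it: the `purchases` data and the `recommend` helper are hypothetical, and unlike the Cypher above the sketch drops items you already own and ranks the rest by how many similar buyers bought them.

```python
from collections import Counter

# Hypothetical purchase data: buyer -> set of items bought.
purchases = {
    "Fabio": {"book", "laptop"},
    "Anna": {"book", "headphones"},
    "Luca": {"laptop", "headphones", "mouse"},
    "Sara": {"tablet"},
}


def recommend(you, purchases):
    """Items bought by people who share at least one purchase with `you`."""
    mine = purchases[you]
    scores = Counter()
    for other, items in purchases.items():
        if other == you or not (items & mine):
            continue  # skip yourself and buyers with nothing in common
        for item in items - mine:
            scores[item] += 1  # one vote per similar buyer
    # Most frequently co-bought items first; ties broken alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))


print(recommend("Fabio", purchases))  # ['headphones', 'mouse']
```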

paradise papers
    MATCH p=(o:Officer {name: "The Duchy of Lancaster"})-[*..2]-()
    RETURN p
Queen Elizabeth II’s private estate and portfolio
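
The `[*..2]` part of the pattern means "at most two hops, in either direction". A rough way to picture it is a breadth-first search that stops expanding at depth 2. The sketch below assumes a small hypothetical adjacency mapping with generic node names; it collects the nodes reached and their distance, whereas the Cypher query returns every matching path.

```python
from collections import deque

# Hypothetical undirected graph: node -> set of neighbours.
graph = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A"},
    "D": {"B", "E"},
    "E": {"D"},
}


def within_hops(graph, start, max_hops=2):
    """Return {node: distance} for nodes reachable in 1..max_hops steps."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    del distances[start]  # the start node itself is not a result
    return distances


print(within_hops(graph, "A"))
```

Here "E" is three hops away from "A", so it is left out.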

Neo4j + Python interactions Official Neo4j Python Bolt Driver

Neo4j + Python interactions Official Neo4j Python Bolt Driver
    # neo4j.v1 is the 1.x driver namespace; later driver versions import from neo4j
    from neo4j.v1 import GraphDatabase, basic_auth
    driver = GraphDatabase.driver(
        "bolt://localhost:7687",
        auth=basic_auth("user", "pass")
    )
    with driver.session() as session:
        session.run(
            'CREATE (:Person {name:"Fabio"})-[:TALKS_AT]->(:Event {name:"PyCon"})'
        )
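
The statement above hard-codes its values in the query string. A common alternative is to pass them as query parameters, which `session.run` accepts as keyword arguments. The sketch below is a minimal illustration under that assumption: `create_talk` is a hypothetical helper, and `RecordingSession` is a stand-in that only records calls so the example runs without a database.

```python
CREATE_TALK = (
    "CREATE (:Person {name: $person})"
    "-[:TALKS_AT]->(:Event {name: $event})"
)


def create_talk(session, person, event):
    """Run the CREATE statement with values passed as query parameters."""
    return session.run(CREATE_TALK, person=person, event=event)


class RecordingSession:
    """Stand-in for a driver session (hypothetical); it only records calls."""

    def __init__(self):
        self.calls = []

    def run(self, query, **parameters):
        self.calls.append((query, parameters))


session = RecordingSession()
create_talk(session, "Fabio", "PyCon")
print(session.calls[0][1])
```

With a real driver, the same helper would be called inside `with driver.session() as session:`.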

Neo4j + Python interactions Official Neo4j Python Bolt Driver
    import pandas as pd
    from neo4j.v1 import GraphDatabase, basic_auth
    driver = GraphDatabase.driver(
        "bolt://localhost:7687",
        auth=basic_auth("user", "pass")
    )
    # a RETURN clause is needed for the query to yield records
    query = """
    MATCH (:Person {name:"Fabio"})-[:TALKS_AT]->(event)
    RETURN event
    """
    with driver.session() as session:
        df = pd.DataFrame([dict(record) for record in session.run(query)])

Data Import in Neo4j

Neo4j + Python interactions Data Import in Neo4j

Neo4j + Python interactions Data Import in Neo4j (up to 10M records)
    import pandas as pd
    df = pd.read_csv("bad_file_to_import_in_neo4j.csv")
    # Handle NaNs (replacement is a placeholder for the fill value)
    df = df.fillna(replacement)
    # Clean/replace bad chars in column names (the regex is one possible choice)
    df.columns = df.columns.str.replace(r"[^\w\s]", "", regex=True)
    df.columns = df.columns.str.replace(" ", "_", regex=False)
    # Set Types - recommended
    df["column"] = df["column"].astype("int64")
    # Save (index=False keeps the row index out of the file)
    df.to_csv("good_file_to_import_in_neo4j.csv", index=False)
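
The column-name cleanup step can also be written as a small standalone function, which makes the rules explicit and easy to test. This is a sketch using only the standard library: `clean_column_name` is a hypothetical helper, and lower-casing the names is an assumption, not something the slide prescribes.

```python
import re


def clean_column_name(name):
    """Make a CSV header safe to use as a property name in Cypher."""
    name = name.strip()
    name = re.sub(r"[^\w\s]", "", name)  # drop punctuation
    name = re.sub(r"\s+", "_", name)     # whitespace runs -> underscore
    return name.lower()                  # assumption: lower-case names


headers = ["Name", " Twitter handle ", "city-name (IT)"]
print([clean_column_name(h) for h in headers])
# ['name', 'twitter_handle', 'cityname_it']
```

With pandas, the same function could be applied via `df.rename(columns=clean_column_name)`.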

Neo4j + Python interactions Data Import in Neo4j (up to 10M records)
Cypher’s LOAD CSV
    LOAD CSV WITH HEADERS FROM "file:///good_file_to_import_in_neo4j.csv" AS row
    CREATE (a:Person {name:row.name, twitter:row.twitter})
    CREATE (b:City {name:row.cityname})
    CREATE (a)-[:LOVES]->(b)
sample rows:
    name,twitter,cityname
    fabio,@fblamanna,firenze
    …

Neo4j + Python interactions Data Import in Neo4j (range 100B records)
    import pandas as pd
    df = pd.read_csv("nodes_bulk_loader.csv")
    # Handle NaNs (replacement is a placeholder for the fill value)
    df = df.fillna(replacement)
    # Format headers for Nodes (template: one entry per column)
    df.columns = ["<column 0>:ID", "<property 0>:int", …, "<property n>:float", ":LABEL"]
    # Save
    df.to_csv("nodes_bulk_loader.csv", index=False)

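Neo4j + Python interactions Data Import in Neo4j (range 100B records)
    import pandas as pd
    df = pd.read_csv("rels_bulk_loader.csv")
    # Handle NaNs (replacement is a placeholder for the fill value)
    df = df.fillna(replacement)
    # Format headers for Relationships (template: one entry per column)
    df.columns = ["<column 0>:START_ID", "<column n>:END_ID", …, ":TYPE"]
    # Save
    df.to_csv("rels_bulk_loader.csv", index=False)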

Neo4j + Python interactions Data Import in Neo4j (range 100B records)
person_nodes_bulk_loader.csv
    :ID(Person-ID),name,twitter,:LABEL
    01,Fabio,@fblamanna,Person
    …
city_nodes_bulk_loader.csv
    :ID(City-ID),cityname,:LABEL
    01,firenze,City
    …
rels_bulk_loader.csv
    :START_ID(Person-ID),:END_ID(City-ID),:TYPE
    01,01,LOVES
    …
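
Files in this layout can also be produced without pandas. The sketch below uses the standard `csv` module to write the three header formats shown on the slide; `to_csv_text` and the in-memory records are hypothetical, and the text is built in memory here rather than written to the `*_bulk_loader.csv` files.

```python
import csv
import io


def to_csv_text(header, rows):
    """Serialise a header and rows to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical in-memory records, mirroring the sample rows on the slide.
people = [("01", "Fabio", "@fblamanna")]
cities = [("01", "firenze")]
loves = [("01", "01")]  # (person id, city id)

person_csv = to_csv_text(
    [":ID(Person-ID)", "name", "twitter", ":LABEL"],
    [row + ("Person",) for row in people],
)
city_csv = to_csv_text(
    [":ID(City-ID)", "cityname", ":LABEL"],
    [row + ("City",) for row in cities],
)
rels_csv = to_csv_text(
    [":START_ID(Person-ID)", ":END_ID(City-ID)", ":TYPE"],
    [row + ("LOVES",) for row in loves],
)
print(person_csv, city_csv, rels_csv, sep="\n")
```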

Neo4j + Python interactions Data Import in Neo4j (range 100B records)
    neo4j_home$ bin/neo4j-admin import \
        --nodes person_nodes_bulk_loader.csv \
        --nodes city_nodes_bulk_loader.csv \
        --relationships rels_bulk_loader.csv

py2neo

Neo4j + Python interactions py2neo
    Python 2.7-3.x
    Neo4j 3.x
    $ pip install py2neo
    Toolkit to work with Neo4j within Python
    Supports pandas dataframe natively

Neo4j + Python interactions py2neo
    >>> from py2neo import Node, Relationship
    >>> a = Node("Person", name="Fabio")
    >>> b = Node("City", cityname="Firenze")
    >>> c = Node("Event", name="PyCon9")
    >>> ab = Relationship(a, "LOVES", b)
    >>> ab
    (Fabio)-[:LOVES]->(Firenze)
    >>> ac = Relationship(a, "TALKS_AT", c)
    >>> ac
    (Fabio)-[:TALKS_AT]->(PyCon9)

Neo4j + Python interactions py2neo - Subgraphs
    # Union
    >>> s = ab | ac
    {(fabio:Person {name:"Fabio"}),
     (firenze:City {name:"Firenze"}),
     (pycon9:Event {name:"PyCon9"}),
     (Fabio)-[:LOVES]->(Firenze),
     (Fabio)-[:TALKS_AT]->(PyCon9)}
    # Intersection
    >>> s = ab & ac
    {(fabio:Person {name:"Fabio"})}
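
The union and intersection results above follow ordinary set semantics applied to nodes and relationships separately. The sketch below models that idea with plain frozensets; `SimpleSubgraph` and `relationship` are hypothetical stand-ins for illustration and are not py2neo classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleSubgraph:
    """A set of nodes plus a set of (start, type, end) relationships."""

    nodes: frozenset
    relationships: frozenset

    def __or__(self, other):
        return SimpleSubgraph(
            self.nodes | other.nodes,
            self.relationships | other.relationships,
        )

    def __and__(self, other):
        return SimpleSubgraph(
            self.nodes & other.nodes,
            self.relationships & other.relationships,
        )


def relationship(start, rel_type, end):
    """Build the subgraph made of one relationship and its two end nodes."""
    return SimpleSubgraph(
        frozenset({start, end}),
        frozenset({(start, rel_type, end)}),
    )


ab = relationship("Fabio", "LOVES", "Firenze")
ac = relationship("Fabio", "TALKS_AT", "PyCon9")

union = ab | ac    # three nodes, two relationships
common = ab & ac   # only the shared node, no relationships
print(sorted(union.nodes), sorted(common.nodes))
```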

Neo4j + Python interactions py2neo - Walkable Type
    >>> w = ab + Relationship(b, "LOVES", c)
    (Fabio)-[:LOVES]->(Firenze)-[:LOVES]->(PyCon9)

Neo4j + Python interactions py2neo + DataFrame
    >>> from py2neo import Graph
    >>> from pandas import DataFrame
    >>> graph = Graph(password="pycon")
    >>> graph.data("MATCH (a:Person) RETURN a.name, a.twitter")
    [{'a.twitter': '@fblamanna', 'a.name': 'Fabio'}]
    >>> DataFrame(graph.data("MATCH (a:Person) RETURN a.name, a.twitter"))
      a.name   a.twitter
    0  Fabio  @fblamanna

Pypher

Neo4j + Python interactions Pypher
    Express Cypher queries in pure Python!
    $ pip install python_cypher

Neo4j + Python interactions Pypher - Cypher, but in Python
    from pypher import Pypher
    q = Pypher()
    q.Match.node("a", labels="Person").WHERE.a.property("twitter") == "@fblamanna"
    q.RETURN.a
Cypher:
    MATCH (a:Person)
    WHERE a.twitter = "@fblamanna"
    RETURN a

Neo4j + Python interactions From Python to Cypher, and back
    from pypher import Pypher
    p = Pypher()
    p.Match.node('a').relationship('r').node('b').RETURN('a', 'b', 'r')
    >>> print(p)
Cypher:
    MATCH (a)-[r]-(b) RETURN a, b, r

Summing up
    How to interact with Neo4j properly within Python
    Data preparation
    I/O operations
    Official driver + Packages (py2neo, Pypher and more to come…)
    Discover the potential of Neo4j with your favourite snake!

(fabio)-[:THANKS]->(PyCon9) Fabio Lamanna  @fblamanna
