Discovering the Power of Graph Databases with Python and Neo4j

Talk at PyCon Nove, April 20th, 2018

Fabio Lamanna

April 21, 2018

Transcript

  1. Discovering the power of graph databases with Python and Neo4j

    Fabio Lamanna
 @fblamanna April 20th, 2018
  2. from py2neo import Graph

     def sponsoringPyCon():
         graph = Graph()
         return graph.data("MATCH (:LARUS)-[:SUPPORTS]->(:PyCon9) RETURN *")
  3. Easy to build data models

    Nodes:
        (:Person {name: "Fabio", twitter: "@fblamanna"})
        (:Event {name: "PyCon9", type: "Conference"})
        (:City {name: "Firenze"})
    Relationships: :LOVES, :TALKS_AT, :LOCATED_IN
  4. Cypher

        CREATE (:Person {name: "Fabio", twitter: …})-[:LOVES]->(:City {name: "Firenze"})

    Anatomy of the statement:
        (:Person {…}) and (:City {…}) are nodes; Person and City are their labels
        {name: "Fabio", twitter: …} are properties
        -[:LOVES]-> is a relationship; LOVES is its type
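The CREATE statement on slide 4 writes its property values directly into the query text. A minimal sketch of the same pattern with the values passed as Cypher parameters (`$person`, `$twitter`, `$city`) instead; the helper name `build_create_loves` and the parameter names are assumptions made for illustration, not part of the talk:

```python
def build_create_loves(person_name, twitter, city_name):
    """Return (query, params) for the Person-LOVES-City pattern.

    Hypothetical helper: values travel as Cypher parameters instead of
    being pasted into the query string.
    """
    query = (
        "CREATE (:Person {name: $person, twitter: $twitter})"
        "-[:LOVES]->(:City {name: $city})"
    )
    params = {"person": person_name, "twitter": twitter, "city": city_name}
    return query, params


query, params = build_create_loves("Fabio", "@fblamanna", "Firenze")
print(query)
print(params)
```

With the official driver, the pair would typically be handed over as `session.run(query, params)`.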
  5. Neo4j + Python interactions

    Official Neo4j Python Bolt Driver

        from neo4j.v1 import GraphDatabase, basic_auth

        driver = GraphDatabase.driver(
            "bolt://localhost:7687",
            auth=basic_auth("user", "pass")
        )

        with driver.session() as session:
            session.run(
                'CREATE (:Person {name:"Fabio"})-[:TALKS_AT]->(:Event {name:"PyCon"})'
            )
  6. Neo4j + Python interactions

    Official Neo4j Python Bolt Driver

        import pandas as pd
        from neo4j.v1 import GraphDatabase, basic_auth

        driver = GraphDatabase.driver(
            "bolt://localhost:7687",
            auth=basic_auth("user", "pass")
        )

        # A MATCH needs a RETURN clause to produce records
        query = """
        MATCH (:Person {name:"Fabio"})-[:TALKS_AT]->(event)
        RETURN event
        """

        with driver.session() as session:
            df = pd.DataFrame([dict(record) for record in session.run(query)])
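The list comprehension on slide 6 turns every result record into a plain dict before the rows go to pandas. A small sketch of that step in isolation, assuming only that the session object has a `run(query)` method yielding mapping-like records; `fetch_rows` and `FakeSession` are hypothetical names, and the fake session returns a canned row so the example runs without a database:

```python
def fetch_rows(session, query):
    """Run a query and return one plain dict per result record."""
    return [dict(record) for record in session.run(query)]


class FakeSession:
    """Hypothetical stand-in for a driver session (offline illustration only)."""

    def run(self, query):
        # A real session would send the query to Neo4j; this returns a canned row.
        return [{"event": "PyCon9"}]


rows = fetch_rows(
    FakeSession(),
    "MATCH (:Person {name: 'Fabio'})-[:TALKS_AT]->(e) RETURN e.name AS event",
)
print(rows)
```

A list of dicts like `rows` is the shape `pd.DataFrame(...)` accepts directly, with one column per key.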
  7. Neo4j + Python interactions

    Data Import in Neo4j (up to 10M records)

        import pandas as pd

        df = pd.read_csv("bad_file_to_import_in_neo4j.csv")

        # Handle NaNs (replacement is a placeholder for the value you choose)
        df = df.fillna(replacement)

        # Clean/replace bad chars in column names: drop punctuation, then spaces
        df.columns = df.columns.str.replace(r"[^\w\s]", "", regex=True)
        df.columns = df.columns.str.replace(" ", "_")

        # Set types - recommended
        df["column"] = df["column"].astype("int64")

        # Save (without the pandas index column)
        df.to_csv("good_file_to_import_in_neo4j.csv", index=False)
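The column-name cleanup on slide 7 can also be tried on its own, without pandas. A minimal sketch using only the standard library; the function name `clean_column_name`, the rule set (drop punctuation, collapse whitespace into underscores), and the sample headers are assumptions chosen for illustration:

```python
import re


def clean_column_name(name):
    """Drop punctuation, then turn runs of whitespace into single underscores."""
    no_punct = re.sub(r"[^\w\s]", "", name)
    return re.sub(r"\s+", "_", no_punct.strip())


headers = ["name", "twitter handle!", " city (name) "]
cleaned = [clean_column_name(h) for h in headers]
print(cleaned)
```

Names cleaned this way can be used as property keys in Cypher (for example `row.city_name`) without backtick quoting.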
  8. Neo4j + Python interactions

    Data Import in Neo4j (up to 10M records): Cypher's LOAD CSV

        LOAD CSV WITH HEADERS FROM "file:///good_file_to_import_in_neo4j.csv" AS row
        CREATE (a:Person {name: row.name, twitter: row.twitter})
        CREATE (b:City {name: row.cityname})
        CREATE (a)-[:LOVES]->(b)

    good_file_to_import_in_neo4j.csv:

        name,twitter,cityname
        fabio,@fblamanna,firenze
        …
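With `WITH HEADERS`, LOAD CSV exposes each line as a map keyed by the header row, which is why the statement can refer to `row.name` and `row.cityname`. Python's `csv.DictReader` behaves in a comparable way, so the sample file from slide 8 can be previewed as follows (an analogy in plain Python, not a Neo4j call):

```python
import csv
import io

# The sample rows from the slide, held in memory instead of a file.
SAMPLE = "name,twitter,cityname\nfabio,@fblamanna,firenze\n"

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for row in rows:
    # Each row is keyed by the header line, like `row` in LOAD CSV WITH HEADERS.
    print(row["name"], row["twitter"], row["cityname"])
```

As in LOAD CSV, every value arrives as a string, so numeric columns need an explicit conversion.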
  9. Neo4j + Python interactions

    Data Import in Neo4j (range 100B records)

        import pandas as pd

        df = pd.read_csv("nodes_bulk_loader.csv")

        # Handle NaNs (fillna needs a value; replacement is a placeholder)
        df = df.fillna(replacement)

        # Format headers for Nodes
        df.columns = ["<column 0>:ID", "<property 0>:int", …, "<property n>:float", ":LABEL"]

        # Save (without the pandas index column)
        df.to_csv("nodes_bulk_loader.csv", index=False)
  10. Neo4j + Python interactions

    Data Import in Neo4j (range 100B records)

        import pandas as pd

        df = pd.read_csv("rels_bulk_loader.csv")

        # Handle NaNs (fillna needs a value; replacement is a placeholder)
        df = df.fillna(replacement)

        # Format headers for Relationships
        df.columns = ["<column 0>:START_ID", "<column n>:END_ID", …, ":TYPE"]

        # Save (without the pandas index column)
        df.to_csv("rels_bulk_loader.csv", index=False)
  11. Neo4j + Python interactions

    Data Import in Neo4j (range 100B records)

    person_nodes_bulk_loader.csv:

        :ID(Person-ID),name,twitter,:LABEL
        01,Fabio,@fblamanna,Person
        …

    city_nodes_bulk_loader.csv:

        :ID(City-ID),cityname,:LABEL
        01,firenze,City
        …

    rels_bulk_loader.csv:

        :START_ID(Person-ID),:END_ID(City-ID),:TYPE
        01,01,LOVES
        …
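The node files on slide 11 share one layout: an `:ID(<id space>)` column, the property columns, and a trailing `:LABEL` column. A minimal sketch that produces this layout with the standard `csv` module; the function name `write_nodes_csv` and its arguments are hypothetical, and the output goes to a string rather than a file so it can be inspected:

```python
import csv
import io


def write_nodes_csv(id_space, label, property_names, records):
    """Serialize node records as CSV text with an import-style header.

    records: iterable of tuples (node_id, prop_1, ..., prop_n).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header row: ID column with its ID space, properties, then the label column.
    writer.writerow([":ID(%s)" % id_space] + list(property_names) + [":LABEL"])
    for record in records:
        writer.writerow(list(record) + [label])
    return buffer.getvalue()


person_csv = write_nodes_csv(
    "Person-ID", "Person", ["name", "twitter"], [("01", "Fabio", "@fblamanna")]
)
print(person_csv)
```

Keeping the IDs as strings preserves the leading zero in `01`, which must match the `:START_ID` and `:END_ID` values in the relationships file.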
  12. Neo4j + Python interactions

    Data Import in Neo4j (range 100B records)

        neo4j_home$ bin/neo4j-admin import \
            --nodes person_nodes_bulk_loader.csv \
            --nodes city_nodes_bulk_loader.csv \
            --relationships rels_bulk_loader.csv
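When the CSV files are produced from Python, the import call on slide 12 can be assembled in the same script. A hedged sketch that only builds the argument list, using the `--nodes` and `--relationships` flags shown on the slide; `build_import_command` is a hypothetical helper and nothing is executed here:

```python
def build_import_command(node_files, relationship_files):
    """Return the neo4j-admin import call as an argument list."""
    command = ["bin/neo4j-admin", "import"]
    for path in node_files:
        command += ["--nodes", path]
    for path in relationship_files:
        command += ["--relationships", path]
    return command


command = build_import_command(
    ["person_nodes_bulk_loader.csv", "city_nodes_bulk_loader.csv"],
    ["rels_bulk_loader.csv"],
)
print(" ".join(command))
```

The list could then be passed to `subprocess.run(command, cwd=neo4j_home, check=True)`, where `neo4j_home` stands for the Neo4j installation directory on your machine.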

  13. Neo4j + Python interactions

    py2neo

    Python 2.7-3.x, Neo4j 3.x

        $ pip install py2neo

    Toolkit to work with Neo4j within Python
    Supports pandas dataframe natively
  14. Neo4j + Python interactions

    py2neo

        >>> from py2neo import Node, Relationship
        >>> a = Node("Person", name="Fabio")
        >>> b = Node("City", cityname="Firenze")
        >>> c = Node("Event", name="PyCon9")
        >>> ab = Relationship(a, "LOVES", b)
        >>> ab
        (Fabio)-[:LOVES]->(Firenze)
        >>> ac = Relationship(a, "TALKS_AT", c)
        >>> ac
        (Fabio)-[:TALKS_AT]->(PyCon9)
  15. Neo4j + Python interactions

    py2neo - Subgraphs

        # Union
        >>> s = ab | ac
        {(fabio:Person {name:"Fabio"}), (firenze:City {name:"Firenze"}), (pycon9:Event {name:"PyCon9"}), (Fabio)-[:LOVES]->(Firenze), (Fabio)-[:TALKS_AT]->(PyCon9)}

        # Intersection
        >>> s = ab & ac
        {(fabio:Person {name:"Fabio"})}
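The results on slide 15 follow ordinary set semantics: the union keeps every node and relationship from both sides, while the intersection keeps only what both share, here the single node Fabio. A plain-Python analogy with built-in sets (this models the idea only; it is not py2neo code, and the helper names are hypothetical):

```python
def as_subgraph(start, rel_type, end):
    """Model one relationship as a set of nodes plus a set of relationships."""
    return {"nodes": {start, end}, "relationships": {(start, rel_type, end)}}


def union(x, y):
    return {
        "nodes": x["nodes"] | y["nodes"],
        "relationships": x["relationships"] | y["relationships"],
    }


def intersection(x, y):
    return {
        "nodes": x["nodes"] & y["nodes"],
        "relationships": x["relationships"] & y["relationships"],
    }


ab = as_subgraph("Fabio", "LOVES", "Firenze")
ac = as_subgraph("Fabio", "TALKS_AT", "PyCon9")
print(union(ab, ac))
print(intersection(ab, ac))
```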
  16. Neo4j + Python interactions

    py2neo - Walkable Type

        >>> w = ab + Relationship(b, "LOVES", c)
        (Fabio)-[:LOVES]->(Firenze)-[:LOVES]->(PyCon9)
  17. Neo4j + Python interactions

    py2neo + DataFrame

        >>> from py2neo import Graph
        >>> from pandas import DataFrame
        >>> graph = Graph(password="pycon")
        >>> graph.data("MATCH (a:Person) RETURN a.name, a.twitter")
        [{'a.twitter': '@fblamanna', 'a.name': 'Fabio'}]
        >>> DataFrame(graph.data("MATCH (a:Person) RETURN a.name, a.twitter"))
          a.name   a.twitter
        0  Fabio  @fblamanna
  18. Neo4j + Python interactions

    Pypher - Cypher, but in Python

        from pypher import Pypher

        q = Pypher()
        q.Match.node("a", labels="Person").WHERE.a.property("twitter") == "@fblamanna"
        q.RETURN.a

    Resulting Cypher:

        MATCH (a:Person)
        WHERE a.twitter = "@fblamanna"
        RETURN a
  19. Neo4j + Python interactions

    From Python to Cypher, and back

        from pypher import Pypher

        p = Pypher()
        p.Match.node('a').relationship('r').node('b').RETURN('a', 'b', 'r')

        >>> print(p)

    Cypher:

        MATCH ('a')-['r']-('b') RETURN a, b, r
  20. Summing up

    How to interact with Neo4j properly from Python:
        Data preparation
        I/O operations
        Official driver + packages (py2neo, Pypher and more to come…)

    Discover the potential of Neo4j with your favourite snake!