Discovering the Power of Graph Databases with Python and Neo4j

Talk at PyCon Nove, April 20th, 2018

Fabio Lamanna

April 21, 2018

Transcript

  1. Discovering the power of graph databases with Python and Neo4j

    Fabio Lamanna
 @fblamanna April 20th, 2018
  2. Outline

    who am I?
    databases + graphs
    Neo4j
    Neo4j + Python interactions
  3. who am I?

  4. from py2neo import Graph

    def sponsoringPyCon():
        graph = Graph()
        return graph.data("MATCH (:LARUS)-[:SUPPORTS]->(:PyCon9) RETURN *")
  5. databases

  6. relational (?) databases

  7. ways to store data: relational; NoSQL ({key:value}, column, document); NoSQL (graph)

  8. graph databases: relational; NoSQL ({key:value}, column, document); NoSQL (graph)

  9. graph databases: NoSQL (graph)

    this is where connections start to matter
  11. Neo4j: Neo4j reveals connections in your data

  12. platform for connected data

  13. glue for data

  14. complex networks laws

  15. matching patterns of information

  16. is pretty fast!

  17. Cypher

  18. easy to build data models

    :Person  name:"Fabio", twitter: @fblamanna
    :Event   name:"PyCon9", type:"Conference"
    :City    name:"Firenze"
    relationship types: :LOVES, :TALKS_AT, :LOCATED_IN
  19. Cypher

    CREATE (:Person {name:"Fabio", twitter:…})-[:LOVES]->(:City {name:"Firenze"})

    NODE: (:Person {…}) and (:City {…})
    LABEL: :Person, :City
    PROPERTIES: {name:"Fabio", twitter:…}; each key:value pair is a PROPERTY
    RELATIONSHIP: -[:LOVES]->, with TYPE :LOVES
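The node, label, property and relationship vocabulary above can be mirrored in a few lines of plain Python. This is only a sketch: `node_pattern` and `create_statement` are hypothetical helper names, not part of Neo4j or of any driver, and building queries by string formatting is unsafe for untrusted input (real code would pass values as query parameters).

```python
def node_pattern(label, **properties):
    """Render a Cypher node pattern such as (:City {name:"Firenze"})."""
    props = ", ".join(f'{key}:"{value}"' for key, value in properties.items())
    return f"(:{label} {{{props}}})" if props else f"(:{label})"


def create_statement(start, rel_type, end):
    """Join two node patterns with a directed relationship of the given type."""
    return f"CREATE {start}-[:{rel_type}]->{end}"


# Hypothetical helpers applied to the nodes shown on the slide.
fabio = node_pattern("Person", name="Fabio", twitter="@fblamanna")
firenze = node_pattern("City", name="Firenze")
statement = create_statement(fabio, "LOVES", firenze)
print(statement)
```

The printed statement has the same shape as the `CREATE` on the slide: two labelled nodes with properties, joined by one typed, directed relationship.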
  20. some applications

  21. recommendations

    MATCH (you)-[:BOUGHT]->(something)<-[:BOUGHT]-(other)-[:BOUGHT]->(reco)
    WHERE you.name = "Fabio"
    RETURN reco
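To see what the recommendation pattern matches, the same traversal can be sketched over an in-memory dictionary. The purchase data and the `recommend` function are hypothetical, and the sketch deviates from the Cypher in two stated ways: it skips items you already own and it counts one vote per other customer rather than one per matched path.

```python
from collections import Counter

# Hypothetical purchase data: customer -> set of purchased items.
purchases = {
    "Fabio": {"espresso machine", "coffee beans"},
    "Anna": {"coffee beans", "milk frother"},
    "Luca": {"coffee beans", "grinder", "milk frother"},
    "Sara": {"tea pot"},
}


def recommend(you, purchases):
    """Rank items bought by customers who share a purchase with `you`."""
    mine = purchases[you]
    scores = Counter()
    for other, theirs in purchases.items():
        if other == you or not (mine & theirs):
            continue  # no shared purchase, so the pattern does not match
        for item in theirs - mine:
            scores[item] += 1  # one vote per other customer
    return [item for item, _ in scores.most_common()]


print(recommend("Fabio", purchases))
```

In a graph database the shared-purchase step is a relationship traversal rather than a scan over every customer, which is the point the slide makes.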
  22. paradise papers

    MATCH p=(o:Officer {name: "The Duchy of Lancaster"})-[*..2]-()
    RETURN p

    Queen Elizabeth II's private estate and portfolio
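The `[*..2]` part of the pattern means "any relationship, in either direction, up to two hops away". A breadth-first search with a hop limit gives the same reach. The sketch below is an assumption-laden illustration: the edge list is made-up placeholder data, `within_hops` is a hypothetical name, and it returns reachable nodes with their distance rather than full paths as Cypher would.

```python
from collections import deque

# Hypothetical undirected edges; the names are placeholders, not real data.
edges = [
    ("Officer A", "Entity 1"),
    ("Entity 1", "Intermediary X"),
    ("Intermediary X", "Entity 2"),
    ("Officer A", "Address 9"),
]


def within_hops(start, edges, max_hops=2):
    """Return every node reachable from `start` in 1..max_hops hops."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)  # the pattern is undirected

    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbours.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    return {node: hops for node, hops in distance.items() if hops > 0}


print(within_hops("Officer A", edges))
```

"Entity 2" is three hops away from the start node, so it is left out of the result.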
  24. Neo4j + Python interactions: Official Neo4j Python Bolt Driver

  25. Neo4j + Python interactions: Official Neo4j Python Bolt Driver

    from neo4j.v1 import GraphDatabase, basic_auth

    driver = GraphDatabase.driver(
        "bolt://localhost:7687",
        auth=basic_auth("user", "pass")
    )

    # Create two nodes and the relationship between them in one statement
    with driver.session() as session:
        session.run(
            'CREATE (:Person {name:"Fabio"})-[:TALKS_AT]->(:Event {name:"PyCon"})'
        )
  26. Neo4j + Python interactions: Official Neo4j Python Bolt Driver

    import pandas as pd
    from neo4j.v1 import GraphDatabase, basic_auth

    driver = GraphDatabase.driver(
        "bolt://localhost:7687",
        auth=basic_auth("user", "pass")
    )

    query = """
    MATCH (:Person {name:"Fabio"})-[:TALKS_AT]->(event)
    RETURN event
    """

    # Each record becomes one row of the DataFrame
    with driver.session() as session:
        df = pd.DataFrame([dict(record) for record in session.run(query)])
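The query on this slide hard-codes the person's name. A common variation is to pass the name as a query parameter and turn each record into a dictionary, as the slide does before building the DataFrame. The sketch below assumes only that the session object exposes `run(query, **parameters)`, as the official driver's session does. `events_for` is a hypothetical helper, and `FakeSession` is a stand-in so the example runs without a database.

```python
def events_for(session, name):
    """Return the events `name` talks at, one dict per record."""
    query = (
        "MATCH (:Person {name: $name})-[:TALKS_AT]->(event) "
        "RETURN event.name AS event"
    )
    # The value is passed as a parameter, not pasted into the query string.
    return [dict(record) for record in session.run(query, name=name)]


class FakeSession:
    """Hypothetical stand-in for a driver session; returns canned records."""

    def run(self, query, **parameters):
        self.last_call = (query, parameters)
        if parameters.get("name") == "Fabio":
            return [{"event": "PyCon9"}]
        return []


session = FakeSession()
print(events_for(session, "Fabio"))
```

With a real driver, the `session` argument would come from `driver.session()`, and the resulting list of dictionaries can be handed to `pd.DataFrame` unchanged.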
  27. Data Import in Neo4j

  28. Neo4j + Python interactions: Data Import in Neo4j

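  29. Neo4j + Python interactions: Data Import in Neo4j (up to 10M records)

    import pandas as pd

    df = pd.read_csv("bad_file_to_import_in_neo4j.csv")

    # Handle NaNs (`replacement` is a placeholder for the fill value;
    # fillna returns a new DataFrame, so reassign)
    df = df.fillna(replacement)

    # Clean/replace bad chars in column names
    df.columns = df.columns.str.replace(r"[^\w\s]", "", regex=True)  # no punctuation
    df.columns = df.columns.str.replace(" ", "_", regex=False)       # no spaces

    # Set types - recommended
    df["column"] = df["column"].astype("int64")

    # Save (index=False keeps the row index out of the file)
    df.to_csv("good_file_to_import_in_neo4j.csv", index=False)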
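The same cleaning steps (fill missing values, normalise column names, save) can be tried without any files on disk. The sketch below swaps pandas for the standard-library `csv` module so that it runs anywhere. The input rows, the `"unknown"` fill value and the `clean_header` helper are all assumptions made for illustration.

```python
import csv
import io
import re

# Hypothetical "bad" input: punctuation and spaces in headers, one empty cell.
raw = io.StringIO(
    "Full Name!,Twitter (handle),City-Name\n"
    "fabio,@fblamanna,firenze\n"
    "anna,,roma\n"
)


def clean_header(name):
    """Drop punctuation, then replace runs of whitespace with underscores."""
    name = re.sub(r"[^\w\s]", "", name).strip()
    return re.sub(r"\s+", "_", name).lower()


reader = csv.reader(raw)
header = [clean_header(column) for column in next(reader)]
rows = [
    [value if value != "" else "unknown" for value in row]  # fill empty cells
    for row in reader
]

cleaned = io.StringIO()
writer = csv.writer(cleaned, lineterminator="\n")
writer.writerow(header)
writer.writerows(rows)
print(cleaned.getvalue())
```

Clean, predictable header names matter because `LOAD CSV WITH HEADERS` exposes each column as `row.<header>`, so a header containing spaces or punctuation is awkward to reference from Cypher.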
  30. Neo4j + Python interactions: Data Import in Neo4j (up to 10M records)

    Cypher's LOAD CSV

    LOAD CSV WITH HEADERS FROM "file:///good_file_to_import_in_neo4j.csv" AS row
    CREATE (a:Person {name:row.name, twitter:row.twitter})
    CREATE (b:City {name:row.cityname})
    CREATE (a)-[:LOVES]->(b)

    good_file_to_import_in_neo4j.csv:

    name,twitter,cityname
    fabio,@fblamanna,firenze
    …
  31. Neo4j + Python interactions: Data Import in Neo4j (range 100B records)

    import pandas as pd

    df = pd.read_csv("nodes_bulk_loader.csv")

    # Handle NaNs (`replacement` is a placeholder for the fill value)
    df = df.fillna(replacement)

    # Format headers for Nodes
    df.columns = ["<column 0>:ID", "<property 0>:int", …, "<property n>:float", ":LABEL"]

    # Save
    df.to_csv("nodes_bulk_loader.csv", index=False)
  32. Neo4j + Python interactions: Data Import in Neo4j (range 100B records)

    import pandas as pd

    df = pd.read_csv("rels_bulk_loader.csv")

    # Handle NaNs (`replacement` is a placeholder for the fill value)
    df = df.fillna(replacement)

    # Format headers for Relationships
    df.columns = ["<column 0>:START_ID", "<column n>:END_ID", …, ":TYPE"]

    # Save
    df.to_csv("rels_bulk_loader.csv", index=False)
  33. Neo4j + Python interactions: Data Import in Neo4j (range 100B records)

    person_nodes_bulk_loader.csv

    :ID(Person-ID),name,twitter,:LABEL
    01,Fabio,@fblamanna,Person
    …

    city_nodes_bulk_loader.csv

    :ID(City-ID),cityname,:LABEL
    01,firenze,City
    …

    rels_bulk_loader.csv

    :START_ID(Person-ID),:END_ID(City-ID),:TYPE
    01,01,LOVES
    …
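The three files above share one idea: every node gets an ID that is unique within its ID space (`Person-ID`, `City-ID`), and the relationship file refers to those IDs. A minimal sketch of producing them from flat records follows. The input records, the two-digit ID scheme and the `to_csv` helper are assumptions for illustration, and the output goes to strings rather than files so that it can be inspected directly.

```python
import csv
import io

# Hypothetical input records: (person name, twitter handle, city name).
records = [("Fabio", "@fblamanna", "firenze"), ("Anna", "@anna", "firenze")]


def to_csv(header, rows):
    """Serialise a header and rows to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Assign one ID per distinct person and city, in first-seen order.
person_ids, city_ids = {}, {}
for name, twitter, city in records:
    person_ids.setdefault((name, twitter), f"{len(person_ids) + 1:02d}")
    city_ids.setdefault(city, f"{len(city_ids) + 1:02d}")

persons = to_csv(
    [":ID(Person-ID)", "name", "twitter", ":LABEL"],
    [(pid, n, t, "Person") for (n, t), pid in person_ids.items()],
)
cities = to_csv(
    [":ID(City-ID)", "cityname", ":LABEL"],
    [(cid, city, "City") for city, cid in city_ids.items()],
)
rels = to_csv(
    [":START_ID(Person-ID)", ":END_ID(City-ID)", ":TYPE"],
    [(person_ids[(n, t)], city_ids[c], "LOVES") for n, t, c in records],
)
print(persons, cities, rels, sep="\n")
```

Both people point at the same city ID, so the import would create one `City` node with two incoming `LOVES` relationships instead of a duplicate city per row.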
  34. Neo4j + Python interactions: Data Import in Neo4j (range 100B records)

    neo4j_home$ bin/neo4j-admin import \
        --nodes person_nodes_bulk_loader.csv \
        --nodes city_nodes_bulk_loader.csv \
        --relationships rels_bulk_loader.csv

  35. py2neo

  36. Neo4j + Python interactions: py2neo

    Toolkit to work with Neo4j within Python
    Python 2.7-3.x, Neo4j 3.x
    Supports pandas DataFrames natively

    $ pip install py2neo
  37. Neo4j + Python interactions: py2neo

    >>> from py2neo import Node, Relationship
    >>> a = Node("Person", name="Fabio")
    >>> b = Node("City", cityname="Firenze")
    >>> c = Node("Event", name="PyCon9")
    >>> ab = Relationship(a, "LOVES", b)
    >>> ab
    (Fabio)-[:LOVES]->(Firenze)
    >>> ac = Relationship(a, "TALKS_AT", c)
    >>> ac
    (Fabio)-[:TALKS_AT]->(PyCon9)
  38. Neo4j + Python interactions: py2neo - Subgraphs

    # Union
    >>> s = ab | ac
    {(fabio:Person {name:"Fabio"}), (firenze:City {name:"Firenze"}), (pycon9:Event {name:"PyCon9"}), (Fabio)-[:LOVES]->(Firenze), (Fabio)-[:TALKS_AT]->(PyCon9)}

    # Intersection
    >>> s = ab & ac
    {(fabio:Person {name:"Fabio"})}
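The union and intersection results follow ordinary set semantics applied to nodes and relationships separately. The sketch below models that with plain Python sets. It is not py2neo's implementation: nodes are bare strings, relationships are `(start, type, end)` tuples, and `subgraph`, `union` and `intersection` are hypothetical names.

```python
def subgraph(*relationships):
    """Build a (nodes, relationships) pair from (start, type, end) tuples."""
    nodes = frozenset(n for start, _, end in relationships for n in (start, end))
    return nodes, frozenset(relationships)


def union(a, b):
    """Everything that appears in either subgraph."""
    return a[0] | b[0], a[1] | b[1]


def intersection(a, b):
    """Only what appears in both subgraphs."""
    return a[0] & b[0], a[1] & b[1]


ab = subgraph(("Fabio", "LOVES", "Firenze"))
ac = subgraph(("Fabio", "TALKS_AT", "PyCon9"))

print(union(ab, ac))
print(intersection(ab, ac))
```

The intersection keeps only the shared node and no relationships, which matches the single `Person` node shown on the slide.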
  39. Neo4j + Python interactions: py2neo - Walkable Type

    >>> w = ab + Relationship(b, "LOVES", c)
    (Fabio)-[:LOVES]->(Firenze)-[:LOVES]->(PyCon9)
  40. Neo4j + Python interactions: py2neo + DataFrame

    >>> from py2neo import Graph
    >>> from pandas import DataFrame
    >>> graph = Graph(password="pycon")
    >>> graph.data("MATCH (a:Person) RETURN a.name, a.twitter")
    [{'a.twitter': '@fblamanna', 'a.name': 'Fabio'}]
    >>> DataFrame(graph.data("MATCH (a:Person) RETURN a.name, a.twitter"))
      a.name   a.twitter
    0  Fabio  @fblamanna
  41. Pypher

  42. Neo4j + Python interactions: Pypher

    Express Cypher queries in pure Python!

    $ pip install python_cypher
  43. Neo4j + Python interactions: Pypher - Cypher, but in Python

    from pypher import Pypher

    q = Pypher()
    q.Match.node("a", labels="Person").WHERE.a.property("twitter") == "@fblamanna"
    q.RETURN.a

    Cypher:
    MATCH (a:Person)
    WHERE a.twitter = "@fblamanna"
    RETURN a
  44. Neo4j + Python interactions: From Python to Cypher, and back

    from pypher import Pypher

    p = Pypher()
    p.Match.node('a').relationship('r').node('b').RETURN('a', 'b', 'r')

    >>> print(p)

    Cypher:
    MATCH ('a')-['r']-('b') RETURN a, b, r
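The idea behind a query builder of this kind is method chaining: each call records one clause and returns the builder, and converting the builder to a string yields the Cypher text. The toy class below illustrates only that idea. `MiniQuery` and its methods are hypothetical, it is not how Pypher is implemented, and it quotes values naively instead of binding them as parameters.

```python
class MiniQuery:
    """Hypothetical chainable builder; not Pypher's implementation."""

    def __init__(self):
        self._parts = []

    def match(self, variable, label=None):
        pattern = f"({variable}:{label})" if label else f"({variable})"
        self._parts.append(f"MATCH {pattern}")
        return self  # returning self is what enables chaining

    def where(self, variable, prop, value):
        self._parts.append(f'WHERE {variable}.{prop} = "{value}"')
        return self

    def returns(self, *variables):
        self._parts.append("RETURN " + ", ".join(variables))
        return self

    def __str__(self):
        return " ".join(self._parts)


q = MiniQuery().match("a", "Person").where("a", "twitter", "@fblamanna").returns("a")
print(q)
```

The printed text matches the Cypher shown for the Pypher example that filters a `Person` by Twitter handle.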
  45. Summing up

    How to interact properly with Neo4j from Python:
    Data preparation
    I/O operations
    Official driver + packages (py2neo, Pypher and more to come…)

    Discover the potential of Neo4j with your favourite snake!
  46. (fabio)-[:THANKS]->(PyCon9) Fabio Lamanna
 @fblamanna