Adopting Neo4j @ Enterprise scale

Dmitrijs Vrublevskis

October 14, 2016

Transcript

  1. Adopting Neo4j @ Enterprise scale

  2. Dmitry Vrublevsky Software developer @ ƀ dmitry@vrublevsky.me @FylmTM Ambassador @

  3. Agenda 1. Why graph databases? 2. Why Neo4j? 3. Neo4j internals 4. Use cases
  4. GRAPH

  5. NOT GRAPH

  6. Graphs 101 Circle - node Arrow - relationship

  7. Proof-of-Concept: evaluate the Neo4j graph database as a replacement for the existing RDBMS solution.
  8. Small dataset Medium dataset Large dataset

  9. Why graph databases? Domain Use case

  10. Telecommunication domain

  11. Use case: Validate network 5+ is OK

  12. Use case: Validate network 5+ is OK

  13. Use case: Validate network :( 5+ is OK

  14. Why Neo4j? Highly scalable native graph database that leverages data relationships as first-class entities. by Neo Technology, Inc.
  15. http://db-engines.com/en/ranking

  16. Features: Native Processing & Storage, ACID, Cypher (Graph Query Language), REST & Native API, Optional schema, Lock Manager, High-performance cache, Clustering, Backups, Monitoring. Editions: Community, Enterprise
  17. First-class: Everything is an entity. Entities have properties. Entities have a type.
  18. First-class (diagram): nodes :DMITRY and :HighLoadStrategy joined by a :LIKES relationship, annotated with {details: —}, {works_with: Neo4j}, {day: 14.10.2016}. Legend: Properties, Labels, Type
  19. Neo4j internals 1. Native storage 2. Native processing

  20. Native storage Specifically designed to store and manage graphs.

  21. http://neo4j.com/developer/graph-db-vs-rdbms/

  22. http://neo4j.com/developer/graph-db-vs-rdbms/

  23. http://neo4j.com/developer/graph-db-vs-rdbms/

  24. Native processing: an efficient way of processing graph data, since connected nodes physically “point” to each other (a.k.a. “index-free adjacency”).
  25. $ ls -1 data/databases/graph.db | column -c 100
    index                                   neostore.propertystore.db.index.id
    index.db                                neostore.propertystore.db.index.keys
    messages.log                            neostore.propertystore.db.index.keys.id
    neostore                                neostore.propertystore.db.strings
    neostore.counts.db.a                    neostore.propertystore.db.strings.id
    neostore.counts.db.b                    neostore.relationshipgroupstore.db
    neostore.id                             neostore.relationshipgroupstore.db.id
    neostore.labeltokenstore.db             neostore.relationshipstore.db
    neostore.labeltokenstore.db.id          neostore.relationshipstore.db.id
    neostore.labeltokenstore.db.names       neostore.relationshiptypestore.db
    neostore.labeltokenstore.db.names.id    neostore.relationshiptypestore.db.id
    neostore.nodestore.db                   neostore.relationshiptypestore.db.names
    neostore.nodestore.db.id                neostore.relationshiptypestore.db.names.id
    neostore.nodestore.db.labels            neostore.schemastore.db
    neostore.nodestore.db.labels.id         neostore.schemastore.db.id
    neostore.propertystore.db               neostore.transaction.db.0
    neostore.propertystore.db.arrays        neostore.transaction.db.1
    neostore.propertystore.db.arrays.id     schema
    neostore.propertystore.db.id            store_lock
    neostore.propertystore.db.index
  27. Storage layout
    Node (15 bytes): in_use, next_rel_id, next_prop_id, labels, extra
    Relationship (34 bytes): directed | in_use, first_node, second_node, rel_type, first_prev_rel_id, first_next_rel_id, second_prev_rel_id, second_next_rel_id, next_prop_id, first_in_chain_markers

  28. Storage layout
    Node (15 bytes): next_rel_id
    Relationship (34 bytes): first_node, second_node, first_prev_rel_id, first_next_rel_id
  29. Storage math: Node offset = RecordSize * ID; Relationship offset = RecordSize * ID
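The storage math on this slide can be read as a byte-offset calculation: every record in a store file has the same size, so a record's position follows directly from its ID. A minimal Python sketch of that arithmetic; `record_offset` and the constant names are illustrative and not part of any Neo4j API, and the record sizes are the ones quoted on the slides.

```python
NODE_RECORD_SIZE = 15          # bytes per node record, as quoted on the slides
RELATIONSHIP_RECORD_SIZE = 34  # bytes per relationship record, as quoted on the slides


def record_offset(record_id: int, record_size: int) -> int:
    """Return the byte offset of a fixed-size record inside its store file."""
    if record_id < 0:
        raise ValueError("record IDs are non-negative")
    return record_id * record_size


if __name__ == "__main__":
    # Relationship 2 and node 4 are the IDs used in the traversal slides.
    print(record_offset(2, RELATIONSHIP_RECORD_SIZE))
    print(record_offset(4, NODE_RECORD_SIZE))
```

Because the offset is a single multiplication, locating a record needs no index lookup, whatever the size of the store.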
  30. Traversal (Node -> Relationship)
    Node (15 bytes): next_rel_id=2
    Relationships (34 bytes each), record offsets: 0B 34B 68B 102B 136B 170B
    Offset of relationship 2: 2 * 34 = 68

  31. Traversal (Relationship -> Node)
    Relationship (34 bytes): first_node=1, second_node=4
    Nodes (15 bytes each), record offsets: 0B 15B 30B 45B 60B 75B
    Offset of node 1: 1 * 15 = 15; offset of node 4: 4 * 15 = 60
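The two traversal slides can be combined into one small simulation: read a node record, jump to its first relationship record by offset, then follow the chain pointers. The sketch below is only an illustration under stated assumptions. The field order and widths inside each record are invented and padded to the 15 and 34 byte sizes from the slides, so this is not Neo4j's actual on-disk format. `NO_REL`, `read_record`, `neighbours` and `build_demo_stores` are hypothetical names.

```python
import struct

# Simplified record layouts, padded to the sizes quoted on the slides.
# Assumption: field order and widths are illustrative, not Neo4j's real format.
NODE = struct.Struct("<Bi10x")    # in_use, next_rel_id, padding -> 15 bytes
REL = struct.Struct("<BIIii17x")  # in_use, first_node, second_node,
                                  # first_next_rel_id, second_next_rel_id -> 34 bytes
NO_REL = -1                       # hypothetical end-of-chain marker


def read_record(store: bytes, layout: struct.Struct, record_id: int) -> tuple:
    """Jump straight to a record: offset = record size * ID, no index lookup."""
    return layout.unpack_from(store, record_id * layout.size)


def neighbours(nodes: bytes, rels: bytes, node_id: int) -> list:
    """Walk a node's relationship chain and collect the nodes at the other end."""
    in_use, rel_id = read_record(nodes, NODE, node_id)
    if not in_use:
        raise LookupError(f"node {node_id} is not in use")
    found = []
    while rel_id != NO_REL:
        _, first, second, first_next, second_next = read_record(rels, REL, rel_id)
        if first == node_id:
            found.append(second)
            rel_id = first_next   # continue along the start node's chain
        else:
            found.append(first)
            rel_id = second_next  # continue along the end node's chain
    return found


def build_demo_stores() -> tuple:
    """Tiny graph: node 1 -> node 4 (relationship 2), node 1 -> node 3 (relationship 0)."""
    nodes = bytearray(NODE.size * 5)
    rels = bytearray(REL.size * 3)
    for node_id in range(5):                       # mark every slot as unused first
        NODE.pack_into(nodes, node_id * NODE.size, 0, NO_REL)
    for node_id, first_rel in {1: 2, 3: 0, 4: 2}.items():
        NODE.pack_into(nodes, node_id * NODE.size, 1, first_rel)
    # Relationship 2: 1 -> 4; next in node 1's chain is relationship 0.
    REL.pack_into(rels, 2 * REL.size, 1, 1, 4, 0, NO_REL)
    # Relationship 0: 1 -> 3; last entry in both chains.
    REL.pack_into(rels, 0 * REL.size, 1, 1, 3, NO_REL, NO_REL)
    return bytes(nodes), bytes(rels)


if __name__ == "__main__":
    node_store, rel_store = build_demo_stores()
    print(neighbours(node_store, rel_store, 1))
```

Each hop is one constant-time offset computation, while listing all neighbours of a node costs one record read per relationship in its chain.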
  32. Native summary: O(1) traversal hops. Avoid super nodes!

  33. Cypher: Cypher is a declarative graph query language that allows for expressive and efficient querying. https://github.com/opencypher/openCypher
  34. Cypher 101. ASCII art: ( ) is a node, --> is a relationship. Keywords: MATCH, CREATE, WHERE, RETURN
  35. Cypher example (1) MATCH (root)-->(children) RETURN *

  36. Cypher example (2) MATCH (t:Towers)-[:CHILDREN]->(n:NetworkPiece)-[:CHILDREN]->(e:Function) WHERE NOT (t)-[:CHILDREN]->(:CellJCA) RETURN t
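To make the meaning of the second Cypher example concrete, the same pattern can be evaluated by hand over a tiny in-memory graph. This is a sketch of the query's semantics in plain Python, not of how Neo4j executes it. The sample nodes and edges are invented for illustration, and `kids` and `towers_without_cell_jca` are hypothetical helper names.

```python
from collections import defaultdict

# Hypothetical sample data: node name -> label, plus directed CHILDREN edges.
labels = {
    "t1": "Towers", "t2": "Towers",
    "n1": "NetworkPiece", "n2": "NetworkPiece",
    "f1": "Function", "f2": "Function",
    "c1": "CellJCA",
}
children_edges = [
    ("t1", "n1"), ("n1", "f1"),                # t1: full path, no CellJCA child
    ("t2", "n2"), ("n2", "f2"), ("t2", "c1"),  # t2: full path, but has a CellJCA child
]

children = defaultdict(list)
for parent, child in children_edges:
    children[parent].append(child)


def kids(node: str, label: str) -> list:
    """Children of `node` reached via CHILDREN that carry the given label."""
    return [c for c in children[node] if labels[c] == label]


def towers_without_cell_jca() -> list:
    """One result row per matched (t, n, e) path, as a MATCH clause would produce."""
    rows = []
    for t, label in labels.items():
        if label != "Towers":
            continue
        if kids(t, "CellJCA"):      # WHERE NOT (t)-[:CHILDREN]->(:CellJCA)
            continue
        for n in kids(t, "NetworkPiece"):
            for _e in kids(n, "Function"):
                rows.append(t)      # RETURN t
    return rows


if __name__ == "__main__":
    print(towers_without_cell_jca())
```

In this sample only `t1` survives the filter, because `t2` has a direct `CellJCA` child.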
  37. Neo4j adoption

  38. Diagram labels: Application, Persistence layer, Neo4j driver, Neo4j, Performance, Fast, Slow, Persistence service

  39. Diagram labels: Application, Persistence layer, Neo4j driver, Neo4j, Performance, Fast, Slow, Persistence service
  40. Use cases Measurement average, 98% Resource usage ~ same

  41. UC: Sync. Before: ~90m. Neo4j: ~35m.
                           Count      Per second
    Node count             80.32M     37498
    Relationship count     80.30M     37488
    Properties count       257.78M    120345
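As a sanity check, the totals and per-second rates on this slide can be converted back into a duration. The sketch below assumes the rates are averages over the whole sync run, which the slide does not state explicitly. `implied_minutes` is an illustrative helper name.

```python
# Figures as reported on the slide: (total count, items per second).
sync_stats = {
    "nodes": (80.32e6, 37498),
    "relationships": (80.30e6, 37488),
    "properties": (257.78e6, 120345),
}


def implied_minutes(total: float, per_second: float) -> float:
    """Duration in minutes implied by a total count and a sustained rate."""
    return total / per_second / 60


durations = {name: implied_minutes(*stats) for name, stats in sync_stats.items()}

if __name__ == "__main__":
    for name, minutes in durations.items():
        print(f"{name}: {minutes:.1f} min")
```

All three rows imply a run of between 35 and 36 minutes, in line with the ~35m reported for Neo4j.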
  42. UC: Single node. Before: 3ms. Neo4j: 2ms.
    MATCH (n) WHERE n.id = {id} RETURN n

  43. UC: Subgraph. Before: 88ms. Neo4j: 14ms.
    MATCH (n)-[r*]->(c) WHERE n.id = {id} RETURN *

  44. UC: By type. Before: 235ms. Neo4j: 194ms.
    MATCH (t:Tower) RETURN t

  45. UC: Count. Before: 32ms. Neo4j: 16ms.
    MATCH (n)-[r*]->(c) WHERE n.id = {id} RETURN count(*)

  46. UC: Traversal (diagram with nodes labelled 1 to 8). Before: 112ms. Neo4j: 39ms.
    MATCH (n)-[r]->(c) WHERE n.id = {id} RETURN *
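The before/after timings of the use cases are easier to compare as ratios. A small sketch that derives the speedup factors from the figures on the slides above; the dictionary layout is just one convenient way to hold them.

```python
# (before_ms, neo4j_ms) per use case, copied from the slides above.
timings_ms = {
    "Single node": (3, 2),
    "Subgraph": (88, 14),
    "By type": (235, 194),
    "Count": (32, 16),
    "Traversal": (112, 39),
}

# Speedup factor = time before / time with Neo4j.
speedups = {name: before / after for name, (before, after) in timings_ms.items()}

if __name__ == "__main__":
    for name, factor in sorted(speedups.items(), key=lambda item: -item[1]):
        print(f"{name}: {factor:.1f}x")
```

The variable-length subgraph query gains the most and the plain by-label scan the least, which fits the point that the benefit comes from relationship traversal.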
  47. Future
    • Real graph API for application
    • Rewrite manual traversals to Cypher queries

  48. Deployment
    • Implemented in Java
    • Works everywhere
    • Writes: vertical scaling
    • Reads: horizontal scaling
    • Extensions & Stored procedures

  49. Stability
    • High load on DB
    • Kill Slave/master
    • Rolling upgrade
    • Split-brain
    • Server power-off