
Neo4j Database and Platform Overview

This is a high-level presentation on why Neo4j was built and how it is designed. It covers the key components that set Neo4j apart in the general database market as well as in the graph market specifically. We include the latest release notes and improvements in performance, data types and indexes, and clustering options.
The presentation then covers how the core database is just one aspect of Neo4j's capabilities and what the company is doing to address needs in other areas of business, including expansion of product offerings and functionality.

Jennifer Reif

July 12, 2018

Transcript

  1. Data Management in 1979: Paper Forms, Tiny RAM, Spinning Platters (Low Capacity / Slow, Sequential IO). RDBMS. Relational Model. The RDBMS Era.
  2. Data Management Today: Dynamic Real-World Systems, Abundant RAM, Flash & IO Co-Processors (High-Capacity Storage & Ultra-Fast Random I/O). A New Graph Era Emerging. Graph Database. Property Graph Model. Real-Time Connected Data.
  3. An IT Portfolio View of Data Technologies: RDBMS & Aggregate-Oriented NoSQL; Hadoop / MapReduce; Graph Database & Graph Compute Engine. Illustration by David Somerville based on the original by Hugh MacLeod (@gapingvoid).

  4. Recap: RDBMS & NoSQL Databases: Store & Retrieve (Real-Time Storage & Retrieval). Hadoop/Spark: Aggregates & Filters (Long-Running Queries, Aggregation & Filtering). Neo4j: Reveals Connections (Real-Time Connected Insights).
  5. The Whiteboard Model Is the Physical Model. A unified view for ultimate agility • Easily understood • Easily evolved • Easy collaboration between business and IT. Project Agility.
  6. Key Architecture Components: (1) Index-Free Adjacency: in memory and on flash/disk. (2) ACID Foundation: required for safe writes. (3) Full-Stack Clustering: causal consistency. (4) Language, Drivers, Tooling: developer experience, graph efficiency, type safety. (5) Graph Engine: Cost-Based Optimizer, Graph Statistics, Cypher Runtime, … (6) Hardware Optimizations: for next-gen infrastructure.
  7. Index-Free Adjacency: At Write Time: data is connected as it is stored. At Read Time: lightning-fast retrieval of data and relationships via pointer chasing.
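
A minimal Python sketch of the idea behind index-free adjacency, assuming an in-memory toy model: each node keeps direct references to its neighbours, so a traversal follows pointers instead of consulting a global index. The `Node` class and the `connect` and `reachable` helpers are hypothetical names for illustration only; they are not Neo4j's storage format or API, and node names are assumed to be unique.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """Hypothetical toy node; not Neo4j's on-disk record layout."""
    name: str
    # Direct references to neighbouring nodes, set at write time.
    out: list = field(default_factory=list)


def connect(src: Node, dst: Node) -> None:
    """Write time: store the relationship as a reference on the source node."""
    src.out.append(dst)


def reachable(start: Node, max_hops: int) -> set:
    """Read time: collect names reachable within max_hops by pointer chasing."""
    seen = set()
    frontier = [start]
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for neighbour in node.out:  # follow stored references, no index lookup
                if neighbour.name not in seen:
                    seen.add(neighbour.name)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return seen
```

In this toy model the cost of each hop depends on how many relationships the visited nodes have, not on the total number of nodes stored.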
  8. Real-Time Query Performance: “Minutes to Milliseconds”. Chart of Response Time against Connectedness and Size of Data Set. Relational and Other NoSQL Databases: 0 to 2 hops, 0 to 3 degrees, thousands of connections. Neo4j: tens to hundreds of hops, thousands of degrees, billions of connections. 1000x Advantage.
  9. Cypher Query Language. Example HR Query in SQL. The Same Query using Cypher: MATCH (boss)-[:MANAGES*0..3]->(sub), (sub)-[:MANAGES*1..3]->(report) WHERE boss.name = "John Doe" RETURN sub.name AS Subordinate, count(report) AS Total. Project Impact • Less time writing queries • Less time debugging queries • Code that's easier to read. Find all direct reports and how many people they manage, up to three levels down.
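
To make the query's semantics concrete, here is a small Python sketch that mimics it over an in-memory dictionary, assuming the management structure is a tree (so each person is reached by exactly one path). The `manages` mapping, every sample name other than John Doe, and both helper functions are made up for illustration; a real application would send the Cypher text to Neo4j through a driver instead.

```python
def within_hops(manages: dict, person: str, min_hops: int, max_hops: int) -> list:
    """People reached from `person` via min_hops..max_hops MANAGES steps (tree assumed)."""
    result = []
    frontier = [person]
    for depth in range(max_hops + 1):
        if depth >= min_hops:
            result.extend(frontier)
        # Move one level further down the management tree.
        frontier = [r for p in frontier for r in manages.get(p, [])]
    return result


def report_counts(manages: dict, boss: str) -> dict:
    """Mirror of the slide's query: sub is 0..3 hops below boss, report is 1..3 hops below sub."""
    counts = {}
    for sub in within_hops(manages, boss, 0, 3):
        total = len(within_hops(manages, sub, 1, 3))
        if total:  # a MATCH with no matching report yields no row for that sub
            counts[sub] = total
    return counts


# Hypothetical sample data.
manages = {"John Doe": ["Ann", "Bob"], "Ann": ["Cy", "Di"], "Cy": ["Ed"]}
print(report_counts(manages, "John Doe"))
```

Because the first pattern as written allows zero hops, the boss appears in the result alongside the subordinates, and anyone who manages nobody is left out.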
  10. ACID Graph Writes: Required for Graph Transactions. ACID Consistency: Maintains Integrity Over Time; Guaranteed Graph Consistency. Non-ACID Graph DBMSs: Becomes Corrupt Over Time; Not Good Enough for Graphs.
  11. Neo4j 3.4 Release Highlights: Performance Improvements; Data Types; Enterprise Scaling, Admin, & More. Date/Time; Geospatial; Native String Index; Fast Backups; Resumable Data Import; Multi-clustering; Faster Cypher Runtime; Rolling Upgrades.
  12. Data Types - Date/Time & Spatial • Date/Time new use cases: • Time trees • Change logs • Temporal incentives (coupon expirations) • Complements to spatial queries • Spatial use cases: • Location searches • Logistics/area planning • Mapping
  13. Performance Improvements • Native String Index - writes up to 5x faster! • Bulk imports handle > 100 billion nodes and relationships • Fast backups - 2x faster • Also: • Kernel API streamlines internal instructions • Transaction states consume less memory
  14. Enterprise Improvements • Cypher runtime up to 70% faster! • Multi-clustering - partition graph into independent parts • Automatic cache pre-warming • Rolling upgrades • Resumable copy/restore • Diagnostic metrics • Property blacklisting
  15. Neo4j Graph Algorithm Library: Finds the optimal path or evaluates route availability and quality. Evaluates how a graph is clustered or partitioned. Determines the importance of distinct nodes in the network.
  16. Neo4j Graph Algorithm Library - latest updates • Use new 3.4 kernel API • Doc changes (new location!) • Yen's k-shortest paths algorithm • Find alternative shortest paths from A to B • Shortest path, then list all other shortest paths • A* algorithm • Find shortest paths between single pairs of locations • Lowest cost route for each pair
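
As a rough illustration of the A* idea mentioned above (lowest-cost route between a single pair of locations), here is a self-contained Python sketch. It is not the Neo4j Graph Algorithm Library implementation or its API: the `a_star` function, the adjacency-list format, and the sample coordinates are assumptions made for this example, and the straight-line heuristic assumes every edge cost is at least the straight-line distance between its endpoints.

```python
import heapq
import math


def a_star(graph: dict, coords: dict, start: str, goal: str):
    """Return (cost, path) for the lowest-cost route, or (inf, []) if none exists.

    graph:  {node: [(neighbour, edge_cost), ...]}  (hypothetical format)
    coords: {node: (x, y)} used for the straight-line heuristic
    """
    def heuristic(node):
        (x1, y1), (x2, y2) = coords[node], coords[goal]
        return math.hypot(x2 - x1, y2 - y1)

    best = {start: 0.0}          # cheapest known cost from start to each node
    parent = {}                  # back-pointers for path reconstruction
    heap = [(heuristic(start), start)]
    done = set()
    while heap:
        _, node = heapq.heappop(heap)
        if node in done:
            continue
        if node == goal:
            path = [node]
            while node in parent:
                node = parent[node]
                path.append(node)
            return best[goal], path[::-1]
        done.add(node)
        for neighbour, cost in graph.get(node, []):
            candidate = best[node] + cost
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                parent[neighbour] = node
                # Priority = cost so far + optimistic estimate of the remainder.
                heapq.heappush(heap, (candidate + heuristic(neighbour), neighbour))
    return math.inf, []


# Hypothetical sample locations and directed routes.
coords = {"A": (0, 0), "B": (1, 0), "C": (2, 0), "D": (1, 1)}
graph = {"A": [("B", 1.0), ("D", 1.5)], "B": [("C", 1.0)], "D": [("C", 1.5)]}
print(a_star(graph, coords, "A", "C"))
```

The heuristic steers the search toward the goal, which is why A* suits single-pair queries.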
  17. Graph-boosted Artificial Intelligence. Knowledge Graphs: Provide Rich Context for AI. AI Visibility: Human-Friendly Graph Visualization. Graph-Enhanced AI Models Learning: Faster, More Accurate Development. Graph-Enhanced AI Models Execution: Operationalize Real-Time OLAP and Monitoring. Graph Analytics: Enrich AI Inputs with Graph Algorithms. Graph System of Record: Maintain a Source of Connected AI Truth.
  18. Common Integration Patterns Inside the Enterprise: From Disparate Silos to Cross-Silo Connections. From Tabular Data to Connected Data. From Data Lake Analytics to Real-Time Operations.
  19. The Neo4j Graph Platform Vision: AI; Graph Transactions; Graph Analytics; Data Integration; Drivers & APIs (APPLICATIONS, DEVELOPERS).
  20. Drivers & APIs • Native Language Drivers • Java • C# (any .NET language) • Python • JavaScript • more to come… • Massive Community Support • Go • Ruby • R • Perl • Clojure • C/C++ • Partners: • GraphAware - PHP Client • Larus - JDBC Driver
  21. Extension Libraries • APOC • ETL Tool • GraphQL • ElasticSearch • Versioned graphs • ML/AI (GraphAware) • GRAND stack
  22. The Neo4j Graph Platform Vision: AI; Graph Transactions; Graph Analytics; Data Integration; Drivers & APIs (APPLICATIONS, DEVELOPERS); Discovery & Visualization (DATA ANALYSTS, BUSINESS USERS).
  23. The Neo4j Graph Platform Vision: AI; Graph Transactions; Graph Analytics; Data Integration; Drivers & APIs (APPLICATIONS, DEVELOPERS); Discovery & Visualization (DATA ANALYSTS, BUSINESS USERS); Development & Administration (ADMINS).
  24. The Neo4j Graph Platform Vision: AI; Graph Transactions; Graph Analytics; Data Integration; Drivers & APIs (APPLICATIONS, DEVELOPERS); Discovery & Visualization (DATA ANALYSTS, BUSINESS USERS); Development & Administration (ADMINS); Analytics Tooling (DATA SCIENTISTS).
  25. Neo4j Graph Platform Means: Better Access to Technology Partner Software • Improved visibility, provisioning, and integration of partners' software. New Neo4j-Provided Capabilities • Products and add-ons that satisfy basic Neo4j project needs across the software lifecycle.