Stefan Armbruster on Introduction into Neo4J

More Decks by Enterprise Java User Group Austria

Transcript

  1. Neo4j: Concepts, Use Cases, and Live Demo
    e: [email protected]
    t: darthvader42

  2. Example: Logical Model of a Logistics Process

  3. Relational Schema
    (“squeezing the world into tables”)

  4. Graph Model

  5. The Whiteboard Model Is the Physical Model

  6. An intuitive approach to data problems

  7. (graphs)-[:ARE]->(everywhere)

  8. Use the Right Database for the Right Job
    Neo4j is designed for data relationships
    Discrete data, minimally connected data: Other NoSQL, Relational DBMS
    Connected data, focused on data relationships: Neo4j Graph DB
    Development Benefits
    • Easy model maintenance
    • Easy query
    Deployment Benefits
    • Ultra high performance
    • Minimal resource usage

  9. Relational DBMSs Can’t Handle Relationships Well
    • Cannot model or store data and relationships without complexity
    • Performance degrades with number and levels of relationships, and database size
    • Query complexity grows with need for JOINs
    • Adding new types of data and relationships requires schema redesign, increasing time to market
    … making traditional databases inappropriate when data relationships are valuable in real-time
    Slow development
    Poor performance
    Low scalability
    Hard to maintain

  10. NoSQL Databases Don’t Handle Relationships
    • No data structures to model or store relationships
    • No query constructs to support data relationships
    • Relating data requires “JOIN logic” in the application
    • No ACID support for transactions
    … making NoSQL databases inappropriate when data relationships are valuable in real-time

  11. High Business Value in Data Relationships
    Data is increasing in volume…
    • New digital processes
    • More online transactions
    • New social networks
    • More devices
    … and is getting more connected
    Customers, products, processes, devices interact and relate to each other
    Using Data Relationships unlocks value
    • Real-time recommendations
    • Fraud detection
    • Master data management
    • Network and IT operations
    • Identity and access management
    • Graph-based search
    Early adopters became industry leaders

  12. “Forrester estimates that over 25% of enterprises will be
    using graph databases by 2017”
    Neo4j Leads the Graph Database Revolution
    “Neo4j is the current market leader in graph databases.”
    “Graph analysis is possibly the single most effective
    competitive differentiator for organizations pursuing data-
    driven operations and decisions after the design of data
    capture.”
    IT Market Clock for Database Management Systems, 2014
    https://www.gartner.com/doc/2852717/it-market-clock-database-management
    TechRadar™: Enterprise DBMS, Q1 2014
    http://www.forrester.com/TechRadar+Enterprise+DBMS+Q1+2014/fulltext/-/E-RES106801
    Graph Databases – and Their Potential to Transform How We Capture Interdependencies (Enterprise Management Associates)
    http://blogs.enterprisemanagement.com/dennisdrogseth/2013/11/06/graph-databasesand-potential-transform-capture-interdependencies/

  13. 2012  2015

  14. Neo4j: The Graph Database Leader
    Timeline: 2000, 2003, 2007, 2009, 2011, 2012, 2013, 2014, 2015
    Tracks: Technical Leadership, Commercial Leadership, Funding
    Milestones:
    • Invented property graph model
    • First native graph DB in 24/7 production
    • Contributed first graph DB to open source
    • Introduced first and only declarative query language for property graph
    • Extended graph data model to labeled property graph
    • First Global 2000 customer
    • GraphConnect, first conference for graph DBs
    • Published O’Reilly book on Graph Databases
    • 150+ customers
    • 50K+ monthly downloads
    • 500+ graph DB events worldwide
    • $2.5M Seed Round from Sunstone and Conor
    • $11M Series A from Fidelity, Sunstone and Conor
    • $11M Series B from Fidelity, Sunstone and Conor
    • $20M Series C led by Creandum, with Dawn and existing investors

  15. Largest Ecosystem of Graph Enthusiasts
    • 1,000,000+ downloads
    • 20,000+ education registrants
    • 18,000+ Meetup members
    • 100+ technology and service partners
    • 200 enterprise subscription customers, including 50+ Global 2000 companies

  16. Neo4j Adoption by Selected Verticals
    • Financial Services
    • Communications
    • Health & Life Sciences
    • HR & Recruiting
    • Media & Publishing
    • Social Web
    • Industry & Logistics
    • Entertainment
    • Consumer Retail
    • Information Services
    • Business Services

  17. How Customers Use Neo4j
    • Network & Data Center
    • Master Data Management
    • Social
    • Recommendations
    • Identity & Access
    • Search & Discovery
    • GEO

  18. Background
    • One of the world’s largest logistics carriers
    • Projected to outgrow capacity of old system
    • New parcel routing system
    • Single source of truth for entire network
    • B2C & B2B parcel tracking
    • Real-time routing: up to 8M parcels per day
    Business problem
    • 24x7 availability, year round
    • Peak loads of 3000+ parcels per second
    • Complex and diverse software stack
    • Need predictable performance & linear scalability
    • Daily changes to logistics network: route from any point, to any point
    Solution & Benefits
    • Neo4j provides the ideal domain fit: a logistics network is a graph
    • Extreme availability & performance with Neo4j clustering
    • Hugely simplified queries, vs. relational for complex routing
    • Flexible data model can reflect real-world data variance much better than relational
    • “Whiteboard friendly” model easy to understand
    Industry: Logistics
    Use case: Real-time Recommendations for Routing
    Germany
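
The "route from any point, to any point" requirement can be pictured as a shortest-path search over the network graph. The slide does not say which algorithm the carrier uses; the sketch below swaps in Dijkstra's algorithm as a common choice, and the hub names and transit times are made-up sample data.

```python
import heapq

# Hypothetical network: hub -> {neighbouring hub: transit hours}.
NETWORK = {
    "Hamburg": {"Hannover": 2, "Berlin": 3},
    "Hannover": {"Frankfurt": 4, "Berlin": 3},
    "Berlin": {"Munich": 6},
    "Frankfurt": {"Munich": 4},
    "Munich": {},
}


def fastest_route(source, target):
    """Dijkstra's algorithm: return (total_hours, [hubs...]) or None if unreachable."""
    queue = [(0, source, [source])]
    settled = set()
    while queue:
        hours, hub, path = heapq.heappop(queue)
        if hub == target:
            return hours, path
        if hub in settled:
            continue  # a cheaper path to this hub was already processed
        settled.add(hub)
        for neighbour, cost in NETWORK.get(hub, {}).items():
            if neighbour not in settled:
                heapq.heappush(queue, (hours + cost, neighbour, path + [neighbour]))
    return None
```

Calling `fastest_route("Hamburg", "Munich")` on this sample data picks the route via Berlin. A daily change to the network is just an update to the edge data; the search itself stays the same.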

  19. Adidas: Shared Metadata Service

  20. Adidas: Shared Metadata Service

  21. Lufthansa: Content/Digital Asset Management

  22. Background
    • German mid-size insurance company
    • Founded in 1858
    • Project executed by delvin GmbH, a 100% subsidiary of die Bayerische Versicherung a.G. and an IT service specialist in the insurance business
    Business problem
    • Field sales unit needed easy access to policies and customer data, in an increasing variety of ways
    • Needed to support a growing business
    • Existing IBM DB2 system not able to meet performance requirements as the system scaled
    • 24/7 available system for sales unit outside the company needed
    Solution & Benefits
    • Enable field sales unit to flexibly search for insurance policies and associated personal data, single source of truth
    • Raising the bar with respect to insurance industry practices
    • Support the business as it scales, with a high level of performance
    • Easy port of existing metadata into Neo4j
    Industry: Insurance
    Use case: Master Data Management
    Germany

  23. Neo Technology, Inc Confidential
    Industry: Retail
    Use case: Real-Time Recommendations
    Bentonville, Arkansas
    Background
    • Founded in 1962, Walmart has more than 11,000 brick and mortar stores in 27 countries
    • Plus more than 2 million employees and $470 billion in annual revenues
    • Needs to provide optimal online customer experience on its walmart.com site to compete
    Business problem
    • In the drive to provide the best customer web experience on its walmart.com site, Walmart sought to use data products that connect masses of complex buyer and product data to gain super-fast insight into customer needs and product trends
    • Existing relational database couldn’t handle the complexity of the system’s queries
    Solution & Benefits
    • Substituted complex batch process with Neo4j for its online real-time recommendations
    • Built a simple, real-time recommendation system with low latency queries
    • Serves up better and faster recommendations, by combining historical and session data

  24. Industry: Retail
    Use case: Routing Recommendations
    San Francisco & London
    Background
    • eBay seeks to expand global retail presence
    • Quick & predictable delivery is an important competitive cornerstone
    • To counter & upstage Amazon Prime, eBay acquired U.K.-based Shutl to form the core of a new delivery service, launching eBay Now (www.ebay.com/now) prior to Christmas 2013
    • Founded in 2009, Shutl was the U.K. leader in same-day delivery, with 70% of the market
    Business problem
    • Enable customer-selected delivery within 90 minutes
    • Maintain a large network of routes covering many carriers and couriers. Calculate multiple routing operations simultaneously, in real time, across all possible routes
    • Scale to enable a variety of services, including same-day delivery, consumer-to-consumer shipping (www.shutl.it) and more predictable delivery times
    Solution & Benefits
    • Neo4j calculates all possible routes in real time for every order
    • The Neo4j-based solution is thousands of times faster than the prior RDBMS-based solution
    • Queries require 10-100 times less code, improving time-to-market & code quality
    • Neo4j lets the team add functionality that was not previously possible

  25. Industry: Communications
    Use case: Real-Time Recommendations
    San Jose CA
    Background
    • Cisco.com serves customer and business customers with Support Services
    • Needed real-time recommendations, to encourage use of online knowledge base
    • Cisco had been successfully using Neo4j for its internal master data management solution
    • Identified a strong fit for online recommendations
    Business problem
    • Call center volumes needed to be lowered by improving the efficacy of online self service
    • Leverage large amounts of knowledge stored in service cases, solutions, articles, forums, etc.
    • Problem resolution times, as well as support costs, needed to be lowered
    Solution & Benefits
    • Cases, solutions, articles, etc. continuously scraped for cross-reference links, and represented in Neo4j
    • Real-time reading recommendations via Neo4j
    • Neo4j Enterprise with HA cluster
    • The result: customers obtain help faster, with decreased reliance on customer support
    Diagram: graph of Support Case, Solution, Knowledge Base Article and Message nodes

  26. Industry: Communications
    Use case: Network & IT Ops
    Paris
    Background
    • Second largest communications company in France
    • Part of Vivendi Group, partnering with Vodafone
    Business problem
    • Infrastructure maintenance took one full week to plan, because of the need to model network impacts
    • Needed rapid, automated “what if” analysis to ensure resilience during unplanned network outages
    • Identify weaknesses in the network to uncover the need for additional redundancy
    • Network information spread across > 30 systems, with daily changes to network infrastructure
    • Business needs sometimes changed very rapidly
    Solution & Benefits
    • Flexible network inventory management system, to support modeling, aggregation & troubleshooting
    • Single source of truth (Neo4j) representing the entire network
    • Dynamic system loads data from 30+ systems, and allows new applications to access network data
    • Modeling efforts greatly reduced because of the near 1:1 mapping between the real world and the graph
    • Flexible schema highly adaptable to changing business requirements
    Diagram: Service, Router, Switch, Fiber Link and Oceanfloor Cable nodes connected by DEPENDS_ON and LINKED relationships
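
The "what if" analysis on this slide comes down to following DEPENDS_ON relationships backwards from a failed component. Here is a minimal Python sketch of that idea. The component names and the dependency list are invented sample data, and the edge direction (dependent DEPENDS_ON dependency) is an assumption, since the diagram itself did not survive extraction.

```python
from collections import defaultdict

# Hypothetical inventory: (dependent, dependency) pairs,
# read as dependent -[:DEPENDS_ON]-> dependency.
DEPENDS_ON = [
    ("Service A", "Router 1"),
    ("Service B", "Router 2"),
    ("Router 1", "Switch 1"),
    ("Router 2", "Switch 2"),
    ("Switch 1", "Fiber Link 1"),
    ("Switch 2", "Fiber Link 2"),
    ("Fiber Link 1", "Oceanfloor Cable"),
]


def impacted_by(failed):
    """Return every component that directly or transitively depends on `failed`."""
    # Reverse index: dependency -> components that depend on it.
    dependents = defaultdict(list)
    for dependent, dependency in DEPENDS_ON:
        dependents[dependency].append(dependent)

    impacted, stack = set(), [failed]
    while stack:
        node = stack.pop()
        for dependent in dependents[node]:
            if dependent not in impacted:
                impacted.add(dependent)
                stack.append(dependent)
    return impacted
```

With this sample data, `impacted_by("Oceanfloor Cable")` reaches Service A through the fiber link, switch and router, while Service B is unaffected.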

  27. Background
    • One of the world’s oldest and largest banks
    • More than 100 years old and includes more than 1000 predecessor institutions
    • 500,000 employees and contractors
    • Most processing is done on UNIX. Needed to manage & visualize the approximately 50,000 UNIX servers
    Business problem
    • Improve performance on company-wide network configuration
    • Combine log data from Splunk into an application that plays events over a visualization of the network, detect incidents
    • Leverage M&A legacy systems, with no room for error
    Solution & Benefits
    • Use Neo4j to store UNIX server & network configuration companywide
    • Original RDBMS solution could handle only 5000 servers. Neo4j introduced for performance
    • New applications also were built much more rapidly using Neo4j than possible with SQL
    Industry: Financial Services
    Use case: Network & IT Operations
    Global
    Large Investment Bank

  28. Industry: Communications
    Use case: ID & Access Management
    Oslo
    Background
    • 10th largest Telco provider in the world, leading in the Nordics
    • Online self-serve system where large business admins manage employee subscriptions and plans
    • Mission-critical system whose availability and responsiveness is critical to customer satisfaction
    Business problem
    • Degrading relational performance. User login taking minutes while system retrieved access rights
    • Millions of plans, customers, admins, groups
    • Highly interconnected data set w/massive joins
    • Nightly batch workaround solved the performance problem, but led to outdated data
    • Primary system was Sybase. Batch pre-compute workaround projected to reach 9 hours by 2014: longer than the nightly batch window
    Solution & Benefits
    • Moved authorization functionality from Sybase to Neo4j
    • Modeling the resource graph in Neo4j was straightforward, as the domain is inherently a graph
    • Able to retire the batch process, and move to real-time responses: measured in milliseconds
    • Users able to see fresh data, not yesterday’s snapshot
    • Customer retention risks fully mitigated
    • Performance (minutes to milliseconds), simplicity, understandable business rules, scale
    Diagram: User, Customer, Account and Subscription nodes connected by USER_ACCESS, PART_OF, CONTROLLED_BY and SUBSCRIBED_BY relationships
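
Resolving access rights in a resource graph like this one is a traversal from the user to the subscriptions they can reach. The Python sketch below shows the idea under loud assumptions: the relationship names come from the slide, but the edge directions, the rule that access is inherited down the customer hierarchy, and all sample names are illustrative guesses, not the operator's actual model.

```python
# All names and edge directions below are illustrative assumptions.
USER_ACCESS = {"admin1": ["Acme Corp"]}  # user -> customers they administer
PART_OF = {"Acme Sales": "Acme Corp", "Acme Nordic": "Acme Sales"}  # child -> parent customer
CONTROLLED_BY = {"acct-1": "Acme Corp", "acct-2": "Acme Nordic", "acct-3": "Other Ltd"}  # account -> customer
SUBSCRIBED_BY = {"sub-100": "acct-1", "sub-200": "acct-2", "sub-300": "acct-3"}  # subscription -> account


def customers_in_scope(user):
    """Customers the user administers, plus everything below them via PART_OF."""
    scope = set(USER_ACCESS.get(user, []))
    changed = True
    while changed:  # repeat until no new child customers are found
        changed = False
        for child, parent in PART_OF.items():
            if parent in scope and child not in scope:
                scope.add(child)
                changed = True
    return scope


def accessible_subscriptions(user):
    """Subscriptions whose account is controlled by a customer in the user's scope."""
    scope = customers_in_scope(user)
    accounts = {acct for acct, cust in CONTROLLED_BY.items() if cust in scope}
    return sorted(sub for sub, acct in SUBSCRIBED_BY.items() if acct in accounts)
```

In a relational schema each hop here would be a join, which is the "massive joins" problem the slide describes; in a graph database the same lookup is a single traversal.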

  29. Background
    • Top investment bank, headquartered in Switzerland
    • Using a relational database coupled with Gemfire for managing employee permissions to research resources (documents and application services)
    Business problem
    • When a new investment manager was onboarded, permissions were provisioned via a complex manual process. Traders lost an average of 7 days of trading, waiting for the permissions to be granted
    • Competitor had implemented a project to accelerate the onboarding process. Needed to respond quickly
    • High stakes: regulations leave no room for error
    • High complexity: granular permissions mean each trader needed access to hundreds of resources
    Solution & Benefits
    • Organizational model, groups, and entitlements stored in Neo4j
    • Meets & exceeds performance requirements
    • Significant productivity advantage due to domain fit
    • Graph visualization makes it easier for the business to provision permissions themselves
    • Moving to Neo4j meant “fewer compromises” than a relational data store
    • Now using Neo4j for authorization behind online brokerage business
    Industry: Financial Services
    Use case: ID & Access Management
    London
    Large Investment Bank

  30. Background
    • The global cost of fraud and identity theft is estimated to be over $200 billion per year
    • Global financial services firm: trillions of dollars in total assets
    • Varying compliance & governance considerations
    • Incredibly complex transaction systems, with ever-growing opportunities for fraud
    Business problem
    • Needed to spot and prevent fraud in real time, especially in payments that fall within “normal” behavior metrics
    • Needed more accurate and faster credit risk analysis for payment transactions
    • Needed to dramatically reduce chargebacks
    Solution & Benefits
    • Neo4j helped them simplify both the credit risk analysis and fraud detection processes, lowering TCO
    • Uniquely identify entities and connections
    • Chargebacks and fraud greatly reduced, huge savings
    • Empower business-unit teams to build Neo4j applications for real-time use, and easily evolve them to include non-uniform data, avoiding sparse tables and frequent schema changes
    Industry: Financial Services
    Use case: Fraud Detection
    London & New York
    Large Financial Services Co.

  31. Background
    • Tre is part of Hutchison Whampoa, one of the world’s largest telecommunications conglomerates
    • Operates in the Nordics and U.K.
    Business problem
    • New business requirement to give customers more insight into their own usage patterns
    • Changing the data model was slow and painful
    • New queries were difficult to write
    • Very large data sets creating serious performance problems in RDBMS for connected queries (>L2)
    • Tre saw value in moving towards real-time customer profiling and real-time analytics
    Solution & Benefits
    • A Neo4j cluster, containing a graph of customer billing information, is accessed by customer-facing applications
    • Neo4j’s graph-based model enables timely & insightful profiling of customers to support customer service
    • New applications & enhancements are developed faster
    • Queries running much faster thanks to Neo4j
    Industry: Telecommunications
    Use case: Master Data Management (Customer Data)
    Stockholm, Sweden

  32. Neo4j: technical overview

  33. Labeled Property Graph Model

  34. demo time ...

  35. Express Complex Queries Easily with Cypher
    Find all direct reports and how many people they manage, up to 3 levels down
    Cypher Query
    MATCH (boss)-[:MANAGES*0..3]->(sub),
          (sub)-[:MANAGES*1..3]->(report)
    WHERE boss.name = "John Doe"
    RETURN sub.name AS Subordinate, count(report) AS Total
    SQL Query
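
To make the variable-length patterns in the Cypher query concrete, here is a rough Python sketch of what it computes on a small in-memory hierarchy. It is not how Neo4j executes the query. It assumes a tree-shaped hierarchy (one manager per person), so counting reachable people matches counting matched paths, and apart from "John Doe" the names are made-up sample data.

```python
from collections import deque

# Hypothetical sample data: manager -> direct reports (a tree, one manager per person).
MANAGES = {
    "John Doe": ["Alice", "Bob"],
    "Alice": ["Carol", "Dave"],
    "Carol": ["Erin"],
}


def within_hops(start, min_hops, max_hops):
    """People reachable from `start` via MANAGES in min_hops..max_hops steps (BFS order)."""
    found = []
    queue = deque([(start, 0)])
    while queue:
        person, depth = queue.popleft()
        if depth >= min_hops:
            found.append(person)
        if depth < max_hops:
            for report in MANAGES.get(person, []):
                queue.append((report, depth + 1))
    return found


def reports_summary(boss, levels=3):
    """Each subordinate within 0..levels hops, with their report count within 1..levels hops."""
    summary = {}
    for sub in within_hops(boss, 0, levels):  # like [:MANAGES*0..3], includes the boss
        total = len(within_hops(sub, 1, levels))  # like [:MANAGES*1..3]
        if total:  # MATCH drops subordinates that have no reports at all
            summary[sub] = total
    return summary
```

Note that the `*0..3` lower bound of zero means the boss appears in the result as well, and people without any reports drop out because the second pattern finds no match for them.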

  36. Background
    • World’s largest provider of IT infrastructure, software & services
    • HP’s Unified Correlation Analyzer (UCA) application is a key application inside HP’s OSS Assurance portfolio
    • Carrier-class resource & service management, problem determination, root cause & service impact analysis
    • Helps communications operators manage large, complex and fast changing networks
    Business problem
    • Use network topology information to identify root causes of problems on the network
    • Simplify alarm handling by human operators
    • Automate handling of certain types of alarms
    • Help operators respond rapidly to network issues
    • Filter/group/eliminate redundant Network Management System alarms by event correlation
    Solution & Benefits
    • Accelerated product development time
    • Extremely fast querying of network topology
    • Graph representation a perfect domain fit
    • 24x7 carrier-grade reliability with Neo4j HA clustering
    • Met objective in under 6 months
    Industry: Web/ISV, Communications
    Use case: Network & IT Ops
    Global (U.S., France)
