Stefan Armbruster on Introduction into Neo4J

More Decks by Enterprise Java User Group Austria

Transcript

  1. Discrete Data Minimally connected data Neo4j is designed for data

    relationships Use the Right Database for the Right Job Other NoSQL Relational DBMS Neo4j Graph DB Connected Data Focused on Data Relationships Development Benefits Easy model maintenance Easy query Deployment Benefits Ultra high performance Minimal resource usage
  2. Relational DBMSs Can’t Handle Relationships Well • Cannot model or

    store data and relationships without complexity • Performance degrades with number and levels of relationships, and database size • Query complexity grows with need for JOINs • Adding new types of data and relationships requires schema redesign, increasing time to market … making traditional databases inappropriate when data relationships are valuable in real-time Slow development Poor performance Low scalability Hard to maintain
  3. NoSQL Databases Don’t Handle Relationships • No data structures to

    model or store relationships • No query constructs to support data relationships • Relating data requires “JOIN logic” in the application • No ACID support for transactions … making NoSQL databases inappropriate when data relationships are valuable in real-time
  4. High Business Value in Data Relationships Data is increasing in

    volume… • New digital processes • More online transactions • New social networks • More devices Using Data Relationships unlocks value • Real-time recommendations • Fraud detection • Master data management • Network and IT operations • Identity and access management • Graph-based search … and is getting more connected Customers, products, processes, devices interact and relate to each other Early adopters became industry leaders
  5. “Forrester estimates that over 25% of enterprises will be using

    graph databases by 2017” Neo4j Leads the Graph Database Revolution “Neo4j is the current market leader in graph databases.” “Graph analysis is possibly the single most effective competitive differentiator for organizations pursuing data-driven operations and decisions after the design of data capture.” IT Market Clock for Database Management Systems, 2014 https://www.gartner.com/doc/2852717/it-market-clock-database-management TechRadar™: Enterprise DBMS, Q1 2014 http://www.forrester.com/TechRadar+Enterprise+DBMS+Q1+2014/fulltext/-/E-RES106801 Graph Databases – and Their Potential to Transform How We Capture Interdependencies (Enterprise Management Associates) http://blogs.enterprisemanagement.com/dennisdrogseth/2013/11/06/graph-databasesand-potential-transform-capture-interdependencies/
  6. Neo4j: The Graph Database Leader Timeline 2000 2003 2007 2009 2011 2012 2013 2014 2015

    Tracks: Commercial Leadership, Funding, Technical Leadership • GraphConnect, first conference for graph DBs • First Global 2000 Customer • Introduced first and only declarative query language for property graph • Published O’Reilly book on Graph Databases • $11M Series A from Fidelity, Sunstone and Conor • $11M Series B from Fidelity, Sunstone and Conor • First native graph DB in 24/7 production • Invented property graph model • Contributed first graph DB to open source • $2.5M Seed Round from Sunstone and Conor • Extended graph data model to labeled property graph • 150+ customers • 50K+ monthly downloads • 500+ graph DB events worldwide • $20M Series C led by Creandum, with Dawn and existing investors
  7. Largest Ecosystem of Graph Enthusiasts • 1,000,000+ downloads • 20,000+

    education registrants • 18,000+ Meetup members • 100+ technology and service partners • 200 enterprise subscription customers • including 50+ Global 2000 companies
  8. Neo4j Adoption by Selected Verticals Financial Services Communications Health

    & Life Sciences HR & Recruiting Media & Publishing Social Web Industry & Logistics Entertainment Consumer Retail Information Services Business Services
  9. How Customers Use Neo4j Network & Data Center Master Data

    Management Social Recommendations Identity & Access Search & Discovery GEO
  10. Background • One of the world’s largest logistics carriers

    • Projected to outgrow capacity of old system • New parcel routing system • Single source of truth for entire network • B2C & B2B parcel tracking • Real-time routing: up to 8M parcels per day Business problem • 24x7 availability, year round • Peak loads of 3000+ parcels per second • Complex and diverse software stack • Need predictable performance & linear scalability • Daily changes to logistics network: route from any point, to any point Solution & Benefits • Neo4j provides the ideal domain fit: a logistics network is a graph • Extreme availability & performance with Neo4j clustering • Hugely simplified queries, vs. relational for complex routing • Flexible data model can reflect real-world data variance much better than relational • “Whiteboard friendly” model easy to understand Industry: Logistics Use case: Real-time Recommendations for Routing Germany
  11. Background Business problem Solution & Benefits • German mid-size insurance

    company • Founded in 1858 • Project executed by delvin GmbH - a 100% subsidiary of die Bayerische Versicherung a.G. and an IT service specialist in the insurance business • Field sales unit needed easy access to policies and customer data, in an increasing variety of ways • Needed to support a growing business • Existing IBM DB2 system not able to meet performance requirements as the system scaled • 24/7 available system for sales unit outside the company needed • Enable field sales unit to flexibly search for insurance policies and associated personal data, single source of truth • Raising the bar with respect to insurance industry practices • Support the business as it scales, with a high level of performance • Easy port of existing metadata into Neo4j Industry: Insurance Use case: Master Data Management Germany
  12. Neo Technology, Inc Confidential Background Business problem • In the

    drive to provide the best customer web experience on its walmart.com site, Walmart sought to use data products that connect masses of complex buyer and product data to gain super-fast insight into customer needs and product trends • Existing relational database couldn’t handle the complexity of the system’s queries Solution & Benefits • Replaced complex batch process with Neo4j for its online real-time recommendations • Built a simple, real-time recommendation system with low latency queries • Serves up better and faster recommendations, by combining historical and session data Industry: Retail Use case: Real-Time Recommendations Bentonville, Arkansas • Founded in 1962, Walmart has more than 11,000 brick and mortar stores in 27 countries • Plus more than 2 million employees and $470 billion in annual revenues • Needs to provide optimal online customer experience on its walmart.com site to compete
  13. Background Business problem • Enable customer-selected

    delivery inside 90 min • Maintain a large network of routes covering many carriers and couriers. Calculate multiple routing operations simultaneously, in real time, across all possible routes • Scale to enable a variety of services, including same-day delivery, consumer-to-consumer shipping (www.shutl.it) and more predictable delivery times Solution & Benefits • Neo4j calculates all possible routes in real time for every order • The Neo4j-based solution is thousands of times faster than the prior RDBMS-based solution • Queries require 10-100 times less code, improving time-to-market & code quality • Neo4j lets the team add functionality that was not previously possible Industry: Retail Use case: Routing Recommendations San Francisco & London • eBay seeks to expand global retail presence • Quick & predictable delivery is an important competitive cornerstone • To counter & upstage Amazon Prime, eBay acquired U.K.-based Shutl to form the core of a new delivery service, launching eBay Now (www.ebay.com/now) prior to Christmas 2013 • Founded in 2009, Shutl was the U.K. leader in same-day delivery, with 70% of the market
  14. Industry: Communications Use case: Real-Time Recommendations San Jose CA

    • Cisco.com serves customer and business customers with Support Services • Needed real-time recommendations, to encourage use of online knowledge base • Cisco had been successfully using Neo4j for its internal master data management solution • Identified a strong fit for online recommendations Solution & Benefits • Cases, solutions, articles, etc. continuously scraped for cross-reference links, and represented in Neo4j • Real-time reading recommendations via Neo4j • Neo4j Enterprise with HA cluster • The result: customers obtain help faster, with decreased reliance on customer support Background Business problem • Call center volumes needed to be lowered by improving the efficacy of online self service • Leverage large amounts of knowledge stored in service cases, solutions, articles, forums, etc. • Problem resolution times, as well as support costs, needed to be lowered Diagram nodes: Support Case, Knowledge Base Article, Solution, Message
  15. Industry: Communications Use case: Network & IT Ops Paris Background

    • Second largest communications company in France • Part of Vivendi Group, partnering with Vodafone Business problem • Infrastructure maintenance took one full week to plan, because of the need to model network impacts • Needed rapid, automated “what if” analysis to ensure resilience during unplanned network outages • Identify weaknesses in the network to uncover the need for additional redundancy • Network information spread across > 30 systems, with daily changes to network infrastructure • Business needs sometimes changed very rapidly Solution & Benefits • Flexible network inventory management system, to support modeling, aggregation & troubleshooting • Single source of truth (Neo4j) representing the entire network • Dynamic system loads data from 30+ systems, and allows new applications to access network data • Modeling efforts greatly reduced because of the near 1:1 mapping between the real world and the graph • Flexible schema highly adaptable to changing business requirements Diagram: Service, Router, Switch, Fiber Link and Oceanfloor Cable nodes connected by DEPENDS_ON and LINKED relationships
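The “what if” analysis on this slide amounts to walking DEPENDS_ON relationships backwards from a failed element. The following is a minimal Python sketch of that idea, not the customer's implementation: the node names, the edge list and the `impacted_by` helper are all hypothetical, and a plain dictionary stands in for the graph database.

```python
# Hypothetical DEPENDS_ON edges: element -> elements it depends on.
# Names are illustrative only; they are not taken from the real network.
DEPENDS_ON = {
    "Service": ["Router A", "Router B"],
    "Router A": ["Switch 1"],
    "Router B": ["Switch 2"],
    "Switch 1": ["Fiber Link 1"],
    "Switch 2": ["Fiber Link 2"],
    "Fiber Link 1": ["Oceanfloor Cable"],
    "Fiber Link 2": [],
    "Oceanfloor Cable": [],
}


def impacted_by(failed):
    """Return every element that depends, directly or transitively, on `failed`."""
    # Invert the edges once so we can walk from a dependency to its dependents.
    dependents = {}
    for element, dependencies in DEPENDS_ON.items():
        for dependency in dependencies:
            dependents.setdefault(dependency, set()).add(element)

    impacted = set()
    stack = [failed]
    while stack:
        current = stack.pop()
        for dependent in dependents.get(current, ()):
            if dependent not in impacted:
                impacted.add(dependent)
                stack.append(dependent)
    return impacted


if __name__ == "__main__":
    print(sorted(impacted_by("Oceanfloor Cable")))
```

Note that the result lists elements touched by the failure, not elements that are necessarily down: in this toy data "Service" also has a second path through "Router B", which is the kind of redundancy question the slide mentions.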
  16. Background • One of the world’s oldest and largest banks

    • More than 100 years old and includes more than 1000 predecessor institutions • 500,000 employees and contractors • Most processing is done on UNIX. Needed to manage & visualize the approximately 50,000 UNIX servers Business problem • Improve performance on company-wide network configuration • Combine log data from Splunk into an application that plays events over a visualization of the network, detect incidents • Leverage M&A legacy systems, with no room for error Solution & Benefits • Use Neo4j to store UNIX server & network configuration companywide • Original RDBMS solution could handle only 5000 servers. Neo4j introduced for performance • New applications also were built much more rapidly using Neo4j than possible with SQL Industry: Financial Services Use case: Network & IT Operations Global Large Investment Bank
  17. Industry: Communications Use case: ID & Access Management Oslo Background

    • 10th largest Telco provider in the world, leading in the Nordics • Online self-serve system where large business admins manage employee subscriptions and plans • Mission-critical system whose availability and responsiveness are critical to customer satisfaction Business problem • Degrading relational performance. User login taking minutes while system retrieved access rights • Millions of plans, customers, admins, groups • Highly interconnected data set with massive joins • Nightly batch workaround solved the performance problem, but led to outdated data • Primary system was Sybase. Batch pre-compute workaround projected to reach 9 hours by 2014: longer than the nightly batch window Solution & Benefits • Moved authorization functionality from Sybase to Neo4j • Modeling the resource graph in Neo4j was straightforward, as the domain is inherently a graph • Able to retire the batch process, and move to real-time responses: measured in milliseconds • Users able to see fresh data, not yesterday’s snapshot • Customer retention risks fully mitigated • Performance (minutes to milliseconds), simplicity, understandable business rules, scale Diagram: User, Customer, Account and Subscription nodes connected by USER_ACCESS, PART_OF, CONTROLLED_BY and SUBSCRIBED_BY relationships
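Resolving access rights over a graph like the one on this slide can be pictured as a short traversal rather than a chain of joins. The Python sketch below is illustrative only: the relationship types come from the slide's diagram, but their directions, the sample data and the `accessible_subscriptions` helper are assumptions made for this example.

```python
# Hypothetical edges as (source, relationship type, target).
# Relationship names follow the slide's diagram; directions are assumed.
EDGES = [
    ("alice", "USER_ACCESS", "Acme Group"),
    ("bob", "USER_ACCESS", "Acme Norway"),
    ("Acme Norway", "PART_OF", "Acme Group"),
    ("Acme Sweden", "PART_OF", "Acme Group"),
    ("account-1", "CONTROLLED_BY", "Acme Norway"),
    ("account-2", "CONTROLLED_BY", "Acme Sweden"),
    ("subscription-1", "SUBSCRIBED_BY", "account-1"),
    ("subscription-2", "SUBSCRIBED_BY", "account-1"),
    ("subscription-3", "SUBSCRIBED_BY", "account-2"),
]


def targets(source, rel_type):
    """Nodes reached by following rel_type edges out of `source`."""
    return {t for s, r, t in EDGES if s == source and r == rel_type}


def sources(target, rel_type):
    """Nodes that have a rel_type edge pointing at `target`."""
    return {s for s, r, t in EDGES if t == target and r == rel_type}


def accessible_subscriptions(user):
    """Subscriptions a user can manage through customers, sub-customers and accounts."""
    # Customers the user has access to, plus everything that is PART_OF them.
    customers = set()
    frontier = list(targets(user, "USER_ACCESS"))
    while frontier:
        customer = frontier.pop()
        if customer not in customers:
            customers.add(customer)
            frontier.extend(sources(customer, "PART_OF"))

    accounts = set()
    for customer in customers:
        accounts |= sources(customer, "CONTROLLED_BY")

    subscriptions = set()
    for account in accounts:
        subscriptions |= sources(account, "SUBSCRIBED_BY")
    return subscriptions


if __name__ == "__main__":
    print(sorted(accessible_subscriptions("alice")))
    print(sorted(accessible_subscriptions("bob")))
```

In this toy data a user with access to the parent customer sees the subscriptions of every subsidiary, while a user scoped to one subsidiary sees only that subsidiary's subscriptions.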
  18. Background • Top investment bank, headquarters Switzerland • Using a

    relational database coupled with Gemfire for managing employee permissions to research resources (documents and application services) Business problem • When a new investment manager was onboarded, permissions were provisioned via a complex manual process. Traders lost an average of 7 days of trading, waiting for the permissions to be granted • Competitor had implemented a project to accelerate the onboarding process. Needed to respond quickly • High stakes: Regulations leave no room for error • High complexity: Granular permissions mean each trader needed access to hundreds of resources Solution & Benefits • Organizational model, groups, and entitlements stored in Neo4j • Meets & exceeds performance requirements • Significant productivity advantage due to domain fit • Graph visualization makes it easier for the business to provision permissions themselves • Moving to Neo4j meant “fewer compromises” than a relational data store • Now using Neo4j for authorization behind online brokerage business Industry: Financial Services Use case: ID & Access Management London Large Investment Bank
  19. Background • The global cost of fraud and identity theft

    is estimated to be over $200 billion per year • Global financial services firm: trillions of dollars in total assets • Varying compliance & governance considerations • Incredibly complex transaction systems, with ever-growing opportunities for fraud Business problem • Needed to spot and prevent fraud in real time, especially in payments that fall within “normal” behavior metrics • Needed more accurate and faster credit risk analysis for payment transactions • Needed to dramatically reduce chargebacks Solution & Benefits • Neo4j helped them simplify both the credit risk analysis and fraud detection processes, lowering TCO • Uniquely identify entities and connections • Chargebacks and fraud greatly reduced, huge savings • Empower business-unit teams to build Neo4j applications for real-time use, and easily evolve them to include non-uniform data, avoiding sparse tables and frequent schema changes Industry: Financial Services Use case: Fraud Detection London & New York Large Financial Services Co.
  20. Background Business problem Solution & Benefits • Tre is part

    of Hutchison Whampoa, one of the world’s largest telecommunications conglomerates • Operates in the Nordics and U.K. • A Neo4j cluster, containing a graph of customer billing information, is accessed by customer-facing applications • Neo4j’s graph-based model enables timely & insightful profiling of customers to support customer service • New applications & enhancements are developed faster • Queries running much faster thanks to Neo4j Industry: Telecommunications Use case: Master Data Management (Customer Data) Stockholm, Sweden • New business requirement to give customers more insight into their own usage patterns • Changing the data model was slow and painful • New queries were difficult to write • Very large data sets creating serious performance problems in RDBMS for connected queries (>L2) • Tre saw value in moving towards real-time customer profiling and real-time analytics
  21. MATCH (boss)-[:MANAGES*0..3]->(sub),

    (sub)-[:MANAGES*1..3]->(report) WHERE boss.name = "John Doe" RETURN sub.name AS Subordinate, count(report) AS Total Express Complex Queries Easily with Cypher Find all direct reports and how many people they manage, up to 3 levels down Cypher Query SQL Query
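To make the intent of the Cypher query on this slide concrete, here is a minimal Python sketch of the same two-step traversal over an in-memory hierarchy. It is an illustration under stated assumptions, not how Neo4j executes the query: the sample people, the `MANAGES` dictionary and both helper functions are hypothetical, and the hierarchy is assumed to be a tree so that counting reachable people matches the path counting that `count(report)` performs.

```python
from collections import deque

# Hypothetical MANAGES edges: manager -> direct reports (assumed to form a tree).
MANAGES = {
    "John Doe": ["Ann", "Bob"],
    "Ann": ["Cid", "Dee"],
    "Cid": ["Eve"],
    "Bob": [],
    "Dee": [],
    "Eve": [],
}


def descendants(start, min_depth, max_depth):
    """People reachable from `start` in min_depth..max_depth MANAGES hops."""
    found = []
    queue = deque([(start, 0)])
    while queue:
        person, depth = queue.popleft()
        if depth >= min_depth:
            found.append(person)
        if depth < max_depth:
            for report in MANAGES.get(person, []):
                queue.append((report, depth + 1))
    return found


def subordinate_totals(boss):
    """Mirror of the query: each sub within 0..3 hops, with a count of reports 1..3 hops below."""
    totals = {}
    for sub in descendants(boss, 0, 3):
        total = len(descendants(sub, 1, 3))
        if total:  # a plain MATCH yields no row for a sub without reports
            totals[sub] = total
    return totals


if __name__ == "__main__":
    for name, total in subordinate_totals("John Doe").items():
        print(name, total)
```

Because the first pattern allows zero hops, the boss appears in the result as well, which is also how the `*0..3` range behaves in the query.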
  22. Background • World’s largest provider of IT infrastructure, software &

    services • HP’s Unified Correlation Analyzer (UCA) application is a key application inside HP’s OSS Assurance portfolio • Carrier-class resource & service management, problem determination, root cause & service impact analysis • Helps communications operators manage large, complex and fast changing networks Business problem • Use network topology information to identify root causes of problems on the network • Simplify alarm handling by human operators • Automate handling of certain types of alarms • Help operators respond rapidly to network issues • Filter/group/eliminate redundant Network Management System alarms by event correlation Solution & Benefits • Accelerated product development time • Extremely fast querying of network topology • Graph representation a perfect domain fit • 24x7 carrier-grade reliability with Neo4j HA clustering • Met objective in under 6 months Industry: Web/ISV, Communications Use case: Network & IT Ops Global (U.S., France)