Introduction to Graph Databases

harjinder-hari

February 25, 2017

Transcript

  1. Introduction to Graph Databases
    Harjindersingh Mistry, Red Hat

  2. Agenda
    ● What is a Graph Database?
    ● How is it different from an RDBMS?
    ● Applications of Graph Databases
    ● Popular Graph Database Systems
    ● Graph Database in AWS
    ● Demo: Graph based Recommendation

    View Slide

  3. What is a Graph Database?
    ● A graph is an ordered pair G = (V, E) where
    ○ V = the set of vertices (nodes), and
    ○ E = the set of edges, each a 2-element
    subset of V
    ● A graph database is a system that stores and
    retrieves data in a graph structure.
    ● It is optimized for datasets that exhibit a
    variety of relationships, where those
    relationships matter to the workload.
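
The definition G = (V, E) can be written down almost literally. The following is a minimal sketch, assuming an undirected graph held in plain Python sets; the vertex names and the helpers `is_valid_graph` and `neighbors` are hypothetical and chosen only for illustration.

```python
# Vertices and undirected edges, following G = (V, E).
# The names below are made-up sample data.
V = {"alice", "bob", "carol", "dave"}
E = {
    frozenset(pair)
    for pair in [("alice", "bob"), ("bob", "carol"), ("carol", "dave")]
}


def is_valid_graph(vertices, edges):
    """Check that every edge is a 2-element subset of the vertex set."""
    return all(len(edge) == 2 and edge <= vertices for edge in edges)


def neighbors(vertex, edges):
    """Return the vertices that share an edge with `vertex`."""
    return {
        other
        for edge in edges
        if vertex in edge
        for other in edge
        if other != vertex
    }
```

Using `frozenset` for an edge makes ("alice", "bob") and ("bob", "alice") the same edge, which matches the "2-element subset" wording. A directed graph would use ordered tuples instead.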

  4. How is it different from an RDBMS?
    ● An RDBMS also stores entities, and the relationships between entities,
    in relational tables. But the relationships are not stored explicitly;
    they are implied by matching key columns.
    ● So a costly JOIN operation is required to collect related items.
    Customer Table: ID, Name, ...
    Product Table: ID, Name, ...
    Sales Table: Cust-ID, Product-ID, ...
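
To make the JOIN cost concrete, here is a small sketch using Python's built-in `sqlite3` module. The three tables mirror the Customer, Product, and Sales tables above; the column names, sample rows, and the helpers `build_demo_db` and `products_bought_by` are assumptions made for the example.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE product  (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE sales (
            cust_id    INTEGER REFERENCES customer(id),
            product_id INTEGER REFERENCES product(id)
        );
        INSERT INTO customer VALUES (1, 'Asha'), (2, 'Ben');
        INSERT INTO product  VALUES (10, 'Laptop'), (11, 'Mouse'), (12, 'Monitor');
        INSERT INTO sales    VALUES (1, 10), (1, 11), (2, 12);
        """
    )
    return conn


def products_bought_by(conn, customer_name):
    """Collect a customer's products; this needs two JOINs via the sales table."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM customer AS c
        JOIN sales   AS s ON s.cust_id = c.id
        JOIN product AS p ON p.id = s.product_id
        WHERE c.name = ?
        ORDER BY p.name
        """,
        (customer_name,),
    ).fetchall()
    return [name for (name,) in rows]
```

The customer-to-product relationship exists only as matching ID values in the sales table, so the database has to reconstruct it at query time.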

  5. How is it different from an RDBMS?
    ● A graph database, on the other hand, stores relationships explicitly.
    ● Traversal from one data element to a related element follows a stored
    relationship, so it is easy and lightweight.
    ● No JOIN is required.
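
The idea of explicitly stored relationships can be sketched with an in-memory adjacency structure. This is only an illustration of the principle, not how any particular graph database is implemented; the class `TinyGraph`, the `BOUGHT` edge label, and the sample data are hypothetical.

```python
from collections import defaultdict


class TinyGraph:
    """In-memory graph with labeled, directed edges stored per vertex."""

    def __init__(self):
        # vertex -> edge label -> set of target vertices
        self._out = defaultdict(lambda: defaultdict(set))

    def add_edge(self, src, label, dst):
        """Store the relationship directly on the source vertex."""
        self._out[src][label].add(dst)

    def out(self, src, label):
        """Follow stored edges from `src`; no table scan or JOIN is involved."""
        return set(self._out.get(src, {}).get(label, ()))


# Made-up sample data for the usage example.
g = TinyGraph()
g.add_edge("customer:Asha", "BOUGHT", "product:Laptop")
g.add_edge("customer:Asha", "BOUGHT", "product:Mouse")
g.add_edge("customer:Ben", "BOUGHT", "product:Monitor")
```

Calling `g.out("customer:Asha", "BOUGHT")` returns the two products by looking at the edges attached to that one vertex, whatever the total size of the graph.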

  6. Applications of Graph Databases
    ● Google Knowledge Graph
    ○ The ‘knowledge’ that results from processing big data
    ○ It enhances a user's search experience
    ● Social networks such as Facebook, Twitter, and
    LinkedIn
    ○ Facebook uses Apache Giraph
    ○ Twitter wrote its own FlockDB
    ○ LinkedIn has its own graph database

  7. Applications of Graph Databases
    ● Protein matching
    ○ Useful in the pharmaceutical industry for drug discovery
    ○ The structure of a protein determines its behavior
    ○ Graph structure queries are useful for comparing proteins
    ● Analysis of Call Data Records (CDR)
    ○ CDR data mainly records who called whom
    ○ Taken together, the records form a single huge graph
    ○ It can be used to narrow down a list of crime suspects
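
As a rough sketch of the CDR idea, the records can be loaded into a graph and searched outward from one number. The example assumes each record is a simple (caller, callee) pair and treats a call as an undirected link; the helpers `build_call_graph` and `within_hops` and the sample records are hypothetical.

```python
from collections import defaultdict, deque


def build_call_graph(records):
    """Turn (caller, callee) pairs into an undirected adjacency map."""
    graph = defaultdict(set)
    for caller, callee in records:
        graph[caller].add(callee)
        graph[callee].add(caller)
    return graph


def within_hops(graph, start, max_hops):
    """Breadth-first search: map each reachable number to its hop distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in graph.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    del distance[start]  # report only the other numbers
    return distance


# Made-up sample records: A called B, B called C, and so on.
records = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F")]
calls = build_call_graph(records)
```

Here `within_hops(calls, "A", 2)` reaches B (one hop) and C (two hops) but not D, E, or F, which is the kind of narrowing the slide describes.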

  8. Popular Graph Database Systems
    ● Titan DB
    ● Neo4j
    ● Spark GraphX
    ● Apache Giraph

  9. Graph Database in AWS
    ● Titan with Gremlin Server
    ● DynamoDB as the storage backend

  10. Demo

  12. References
    ● Graph
    ○ https://en.wikipedia.org/wiki/Graph_theory#Graph
    ● Graph Database
    ○ https://en.wikipedia.org/wiki/Graph_database
    ● Graph Based Recommendation
    ○ https://developer.ibm.com/dwblog/2017/recommendation-engine-customer-insight-graph-database/
    ● Google Knowledge Graph
    ○ https://googleblog.blogspot.in/2012/05/introducing-knowledge-graph-things-not.html
    ○ https://en.wikipedia.org/wiki/Knowledge_Graph

  13. References
    ● LinkedIn Knowledge Graph
    ○ https://engineering.linkedin.com/blog/2016/10/building-the-linkedin-knowledge-graph
    ● Analyze CDR data to narrow down list of suspects
    ○ https://linkurio.us/how-to-use-phone-calls-and-network-analysis-to-identify-criminals/

  14. Images Courtesy
    ● Facebook, Twitter, LinkedIn icons
    ○ https://goo.gl/images/xgm18q
    ● Protein
    ○ http://www.rcsb.org/pdb/images/5UGX_bio_r_500.jpg?bioNum=1
    ● Analyze CDR data to narrow down on suspects
    ○ https://goo.gl/images/6IbYVF
    ● Titan DB logo
    ○ https://goo.gl/images/0SJIPs
    ● Neo4j logo
    ○ https://goo.gl/images/dbi6IV

  15. Images Courtesy
    ● Apache Giraph logo
    ○ https://goo.gl/images/Vf0yLc
    ● Spark GraphX logo
    ○ http://spark.apache.org/images/spark-logo-trademark.png
    ● Amazon DynamoDB logo
    ○ https://goo.gl/images/N4SYlY
    ● Google Knowledge Graph
    ○ https://goo.gl/images/cMIQ2s
    ● Titan Data Layout
    ○ http://s3.thinkaurelius.com/docs/titan/1.0.0/images/titanstoragelayout.png
