
Introduction to Graph Databases

harjinder-hari

February 25, 2017

Transcript

  1. Introduction to Graph Databases
     Harjindersingh Mistry, Red Hat

  2. Agenda
     • What is a Graph Database?
     • How is it different from an RDBMS?
     • Applications of Graph Databases
     • Popular Graph Database Systems
     • Graph Databases in AWS
     • Demo: Graph-based Recommendation
  3. What is a Graph Database?
     • A graph is an ordered pair G = (V, E), where
       ◦ V is a set of vertices (nodes), and
       ◦ E is a set of edges, each of which is a 2-element subset of V.
     • A graph database is a system that stores and retrieves data in graph form.
     • It is optimized for datasets that contain many kinds of relationships, where those relationships matter to the workload.
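
The definition on this slide can be written down directly. The following is a minimal Python sketch, not part of the original deck; the vertex names and the helper functions `is_valid_graph` and `neighbors` are made up for illustration.

```python
# A tiny undirected graph G = (V, E). Vertex names are hypothetical.
V = {"alice", "bob", "carol", "dave"}

# Each edge is a 2-element subset of V, so a frozenset models it naturally.
E = {
    frozenset({"alice", "bob"}),
    frozenset({"bob", "carol"}),
    frozenset({"carol", "dave"}),
}


def is_valid_graph(vertices, edges):
    """Check that every edge is a 2-element subset of the vertex set."""
    return all(len(edge) == 2 and edge <= vertices for edge in edges)


def neighbors(vertex, edges):
    """Return the vertices that share an edge with `vertex`."""
    return {other for edge in edges if vertex in edge
            for other in edge if other != vertex}


print(is_valid_graph(V, E))
print(sorted(neighbors("bob", E)))
```

A real graph database adds persistence, indexing, and a query language on top of this idea, but the underlying model is the same pair of sets.
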
  4. How is it different from an RDBMS?
     • An RDBMS stores entities in relational tables, and it can also record a relationship between two entities (here, in the Sales table). But the relationships are not stored as explicit links between records.
     • So a costly JOIN operation is required to collect related items.
     • Tables shown on the slide:
       ◦ Customer table: ID, Name, ...
       ◦ Product table: ID, Name, ...
       ◦ Sales table: Cust-ID, Product-ID, ...
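
To make the JOIN concrete, here is a small sketch using Python's built-in `sqlite3` module. The three tables mirror the ones on the slide; the column names, the sample rows, and the function `products_bought_by` are assumptions added for illustration only.

```python
import sqlite3

# In-memory database with the three tables from the slide (sample data is made up).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales    (cust_id    INTEGER REFERENCES customer(id),
                           product_id INTEGER REFERENCES product(id));
    INSERT INTO customer VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO product  VALUES (10, 'Laptop'), (11, 'Mouse');
    INSERT INTO sales    VALUES (1, 10), (1, 11), (2, 11);
""")


def products_bought_by(conn, customer_name):
    """Collect a customer's products; the relationship is rebuilt by two JOINs."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM customer AS c
        JOIN sales    AS s ON s.cust_id = c.id
        JOIN product  AS p ON p.id = s.product_id
        WHERE c.name = ?
        ORDER BY p.name
        """,
        (customer_name,),
    )
    return [name for (name,) in rows]


print(products_bought_by(conn, "Asha"))
```

The link between a customer and a product exists only as matching key values in `sales`, so every such question has to be answered by matching keys across tables at query time.
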
  5. How is it different from an RDBMS?
     • A graph database, on the other hand, stores the relationships explicitly.
     • Traversal from one data element to a related element is easy and lightweight.
     • No JOIN is required.
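
The idea of explicitly stored relationships can be sketched with a plain Python dictionary in which each node holds direct references to its related nodes. This is only an in-memory illustration, not how any particular graph database is implemented; the node names, the edge labels `BOUGHT` and `BOUGHT_BY`, and the function `traverse` are hypothetical.

```python
# Each node stores its outgoing relationships directly, grouped by edge label.
graph = {
    "customer:Asha": {"BOUGHT": ["product:Laptop", "product:Mouse"]},
    "customer:Ben": {"BOUGHT": ["product:Mouse"]},
    "product:Laptop": {"BOUGHT_BY": ["customer:Asha"]},
    "product:Mouse": {"BOUGHT_BY": ["customer:Asha", "customer:Ben"]},
}


def traverse(graph, start, *edge_labels):
    """Follow the given edge labels in order, starting from one node."""
    frontier = [start]
    for label in edge_labels:
        # Hop along stored references; no key matching across tables is needed.
        frontier = [nxt for node in frontier
                    for nxt in graph[node].get(label, [])]
    return frontier


# One hop: what did Asha buy?
print(traverse(graph, "customer:Asha", "BOUGHT"))
# Two hops: who bought the products that Ben bought?
print(traverse(graph, "customer:Ben", "BOUGHT", "BOUGHT_BY"))
```

Each hop only touches the nodes already reached, which is why a multi-step question stays cheap when relationships are stored with the data.
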
  6. Applications of Graph Databases
     • Google Knowledge Graph
       ◦ The 'knowledge' that results from processing big data
       ◦ It enhances the user's search experience
     • Social networks such as Facebook, Twitter, LinkedIn
       ◦ Facebook uses Apache Giraph
       ◦ Twitter has written its own FlockDB
       ◦ LinkedIn has its own graph database
  7. Applications of Graph Databases
     • Protein matching
       ◦ Useful in the pharmaceutical industry for drug discovery
       ◦ The structure of a protein determines its behavior
       ◦ Graph structure queries are useful in protein comparison
     • Analysis of Call Data Records (CDR)
       ◦ CDR data mainly tells who called whom
       ◦ It is a single huge graph
       ◦ It can be used to narrow down crime suspects
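
As a rough illustration of the CDR case, the sketch below builds a "who called whom" graph and finds people who were in contact with every member of a given group. It is one simple way such a graph could be queried, not the method described in the deck's references; the call records and the names `build_contact_graph` and `common_contacts` are invented for the example.

```python
from collections import defaultdict

# Hypothetical call records as (caller, callee) pairs.
calls = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("D", "C"), ("D", "E"), ("B", "A"),
]


def build_contact_graph(records):
    """Build an undirected contact graph: person -> set of people they spoke with."""
    contacts = defaultdict(set)
    for caller, callee in records:
        contacts[caller].add(callee)
        contacts[callee].add(caller)
    return contacts


def common_contacts(contacts, persons):
    """Return people who were in contact with every person in `persons`."""
    persons = list(persons)
    if not persons:
        return set()
    shared = set.intersection(*(set(contacts.get(p, ())) for p in persons))
    return shared - set(persons)


contacts = build_contact_graph(calls)
print(common_contacts(contacts, ["A", "D"]))
```

Here the people of interest are the starting nodes, and the candidates are the nodes reachable from all of them in one hop.
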
  8. Popular Graph Database Systems
     • Titan DB
     • Neo4J
     • Spark GraphX
     • Apache Giraph
  9. Graph Database in AWS
     • Titan
     • Gremlin Server
     • DynamoDB Storage Backend

  10. Demo

  11. Thanks! hmistry@redhat.com

  12. References
     • Graph
       ◦ https://en.wikipedia.org/wiki/Graph_theory#Graph
     • Graph Database
       ◦ https://en.wikipedia.org/wiki/Graph_database
     • Graph Based Recommendation
       ◦ https://developer.ibm.com/dwblog/2017/recommendation-engine-customer-insight-graph-database/
     • Google Knowledge Graph
       ◦ https://googleblog.blogspot.in/2012/05/introducing-knowledge-graph-things-not.html
       ◦ https://en.wikipedia.org/wiki/Knowledge_Graph
  13. References
     • LinkedIn Knowledge Graph
       ◦ https://engineering.linkedin.com/blog/2016/10/building-the-linkedin-knowledge-graph
     • Analyze CDR data to narrow down list of suspects
       ◦ https://linkurio.us/how-to-use-phone-calls-and-network-analysis-to-identify-criminals/
  14. Images Courtesy
     • Facebook, Twitter, LinkedIn icons
       ◦ https://goo.gl/images/xgm18q
     • Protein
       ◦ http://www.rcsb.org/pdb/images/5UGX_bio_r_500.jpg?bioNum=1
     • Analyze CDR data to narrow down on suspects
       ◦ https://goo.gl/images/6IbYVF
     • Titan DB logo
       ◦ https://goo.gl/images/0SJIPs
     • Neo4J logo
       ◦ https://goo.gl/images/dbi6IV
  15. Images Courtesy
     • Apache Giraph logo
       ◦ https://goo.gl/images/Vf0yLc
     • Spark GraphX logo
       ◦ http://spark.apache.org/images/spark-logo-trademark.png
     • Amazon DynamoDB logo
       ◦ https://goo.gl/images/N4SYlY
     • Google Knowledge Graph
       ◦ https://goo.gl/images/cMIQ2s
     • Titan Data Layout
       ◦ http://s3.thinkaurelius.com/docs/titan/1.0.0/images/titanstoragelayout.png