Introduction to Graph Databases

harjinder-hari

February 25, 2017

Transcript

  1. Agenda
    • What is a Graph Database?
    • How is it different from an RDBMS?
    • Applications of Graph Databases
    • Popular Graph Database Systems
    • Graph Databases in AWS
    • Demo: Graph-based Recommendation
  2. What is a Graph Database?
    • A graph is an ordered pair G = (V, E), where
      ◦ V is a set of vertices (nodes), and
      ◦ E is a set of edges, each of which is a 2-element subset of V
    • A graph database is a system that stores and retrieves data in a graph structure.
    • It is optimized for datasets that contain many kinds of relationships, where those relationships matter to the workload.
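The definition above can be sketched in a few lines of Python. This is only an illustration, not part of the original deck: the vertex names and the helper functions `is_valid_graph` and `neighbors` are made up for the example.

```python
# A small undirected graph G = (V, E). All names here are illustrative.
V = {"alice", "bob", "carol", "dave"}
E = {
    frozenset({"alice", "bob"}),
    frozenset({"bob", "carol"}),
    frozenset({"carol", "dave"}),
}


def is_valid_graph(vertices, edges):
    """Check that every edge is a 2-element subset of the vertex set."""
    return all(len(edge) == 2 and edge <= vertices for edge in edges)


def neighbors(vertex, edges):
    """Return the vertices that share an edge with `vertex`."""
    return {other for edge in edges if vertex in edge
            for other in edge if other != vertex}


G = (V, E)
print(is_valid_graph(*G))            # True
print(sorted(neighbors("bob", E)))   # ['alice', 'carol']
```

Storing each edge as a `frozenset` mirrors the "2-element subset" wording: the pair has no direction, so {alice, bob} and {bob, alice} are the same edge.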
  3. How is it different from an RDBMS?
    • An RDBMS also stores entities in relational tables, but it records a relationship between two entities only as matching ID values (for example, Cust-ID and Product-ID in the Sales table). The relationships are not stored explicitly!
    • So a costly JOIN operation is required to collect related items!
    • Example tables:
      ◦ Customer Table: ID | Name | ...
      ◦ Product Table: ID | Name | ...
      ◦ Sales Table: Cust-ID | Product-ID | ...
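To make the JOIN cost concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The three tables follow the slide; the lowercase table and column names, the sample rows, and the function names `build_db` and `products_bought_by` are assumptions made for the example.

```python
import sqlite3


def build_db():
    """Create an in-memory database shaped like the slide's three tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE product  (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE sales    (cust_id    INTEGER REFERENCES customer(id),
                               product_id INTEGER REFERENCES product(id));
        INSERT INTO customer VALUES (1, 'Asha'), (2, 'Ben');
        INSERT INTO product  VALUES (10, 'Laptop'), (11, 'Mouse'), (12, 'Monitor');
        INSERT INTO sales    VALUES (1, 10), (1, 11), (2, 11), (2, 12);
    """)
    return conn


def products_bought_by(conn, customer_name):
    """Collect a customer's products; this needs two JOINs through sales."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM customer AS c
        JOIN sales    AS s ON s.cust_id = c.id
        JOIN product  AS p ON p.id = s.product_id
        WHERE c.name = ?
        ORDER BY p.name
        """,
        (customer_name,),
    )
    return [name for (name,) in rows]


conn = build_db()
print(products_bought_by(conn, "Asha"))  # ['Laptop', 'Mouse']
```

The customer-to-product relationship exists only as matching ID values in `sales`, so every lookup has to rediscover it by joining the tables.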
  4. How is it different from an RDBMS?
    • A graph database, on the other hand, stores relationships explicitly.
    • Traversing from one data element to a related element is easy and lightweight!
    • No JOIN is required.
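A rough sketch of what "storing relationships explicitly" can look like, assuming a toy in-memory adjacency structure rather than any real graph database engine. The `TinyGraph` class, the `BOUGHT` label, and the sample data are hypothetical. The `recommend` function is one simple traversal-based recommendation and is not the demo from the deck.

```python
from collections import defaultdict


class TinyGraph:
    """Toy graph store: each labeled edge is kept in both directions."""

    def __init__(self):
        self._out = defaultdict(set)   # (source, label) -> targets
        self._in = defaultdict(set)    # (target, label) -> sources

    def add_edge(self, src, label, dst):
        self._out[(src, label)].add(dst)
        self._in[(dst, label)].add(src)

    def out(self, node, label):
        """Follow edges with `label` leaving `node`."""
        return set(self._out.get((node, label), ()))

    def into(self, node, label):
        """Follow edges with `label` arriving at `node`."""
        return set(self._in.get((node, label), ()))


def recommend(graph, customer):
    """Suggest products bought by customers who share a purchase."""
    owned = graph.out(customer, "BOUGHT")
    scores = defaultdict(int)
    for product in owned:
        for peer in graph.into(product, "BOUGHT"):
            if peer == customer:
                continue
            for candidate in graph.out(peer, "BOUGHT") - owned:
                scores[candidate] += 1
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda p: (-scores[p], p))


g = TinyGraph()
for who, what in [("Asha", "Laptop"), ("Asha", "Mouse"),
                  ("Ben", "Mouse"), ("Ben", "Monitor")]:
    g.add_edge(who, "BOUGHT", what)

print(recommend(g, "Asha"))  # ['Monitor']
```

Each hop is a direct lookup on the node being visited, so the work depends on how many edges that node has rather than on the size of whole tables.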
  5. Applications of Graph Databases
    • Google Knowledge Graph
      ◦ The ‘knowledge’ that results from processing big data
      ◦ It enhances the user's search experience
    • Social networks such as Facebook, Twitter, and LinkedIn
      ◦ Facebook uses Apache Giraph (a graph processing framework)
      ◦ Twitter wrote its own FlockDB
      ◦ LinkedIn has its own graph database
  6. Applications of Graph Databases
    • Protein Matching
      ◦ Useful in the pharmaceutical industry for drug discovery
      ◦ The structure of a protein determines its behavior
      ◦ Graph structure queries are useful for comparing proteins
    • Analysis of Call Detail Records (CDR)
      ◦ CDR data mainly tells who called whom
      ◦ It forms a single huge graph
      ◦ It can be used to narrow down crime suspects
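As a hedged illustration of the CDR idea, the sketch below turns "who called whom" records into a contact graph and keeps only the numbers that were in contact with every number already under investigation. The record format, the phone numbers, and the function names are invented for the example. Real investigations would use far richer data and methods.

```python
from collections import defaultdict


def build_contact_graph(records):
    """Build an undirected contact graph from (caller, callee) pairs."""
    contacts = defaultdict(set)
    for caller, callee in records:
        contacts[caller].add(callee)
        contacts[callee].add(caller)
    return contacts


def common_contacts(contacts, known_numbers):
    """Return numbers in contact with every number in `known_numbers`."""
    known = list(known_numbers)
    if not known:
        return set()
    shared = set(contacts.get(known[0], ()))
    for number in known[1:]:
        shared &= contacts.get(number, set())
    return shared - set(known)


# Hypothetical call records: (caller, callee).
records = [
    ("555-0101", "555-0199"),
    ("555-0102", "555-0199"),
    ("555-0101", "555-0150"),
    ("555-0103", "555-0102"),
    ("555-0199", "555-0103"),
]

graph = build_contact_graph(records)
print(common_contacts(graph, ["555-0101", "555-0102"]))  # {'555-0199'}
```

Intersecting neighbor sets is only one way to narrow a list. It works here because each number's contacts are stored with the number itself, so no table-wide join is needed.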
  7. References
    • Graph
      ◦ https://en.wikipedia.org/wiki/Graph_theory#Graph
    • Graph Database
      ◦ https://en.wikipedia.org/wiki/Graph_database
    • Graph-based Recommendation
      ◦ https://developer.ibm.com/dwblog/2017/recommendation-engine-customer-insight-graph-database/
    • Google Knowledge Graph
      ◦ https://googleblog.blogspot.in/2012/05/introducing-knowledge-graph-things-not.html
      ◦ https://en.wikipedia.org/wiki/Knowledge_Graph
  8. References
    • LinkedIn Knowledge Graph
      ◦ https://engineering.linkedin.com/blog/2016/10/building-the-linkedin-knowledge-graph
    • Analyzing CDR data to narrow down a list of suspects
      ◦ https://linkurio.us/how-to-use-phone-calls-and-network-analysis-to-identify-criminals/
  9. Images Courtesy
    • Facebook, Twitter, LinkedIn icons
      ◦ https://goo.gl/images/xgm18q
    • Protein
      ◦ http://www.rcsb.org/pdb/images/5UGX_bio_r_500.jpg?bioNum=1
    • Analyzing CDR data to narrow down suspects
      ◦ https://goo.gl/images/6IbYVF
    • Titan DB logo
      ◦ https://goo.gl/images/0SJIPs
    • Neo4j logo
      ◦ https://goo.gl/images/dbi6IV
  10. Images Courtesy
    • Apache Giraph logo
      ◦ https://goo.gl/images/Vf0yLc
    • Spark GraphX logo
      ◦ http://spark.apache.org/images/spark-logo-trademark.png
    • Amazon DynamoDB logo
      ◦ https://goo.gl/images/N4SYlY
    • Google Knowledge Graph
      ◦ https://goo.gl/images/cMIQ2s
    • Titan Data Layout
      ◦ http://s3.thinkaurelius.com/docs/titan/1.0.0/images/titanstoragelayout.png