Dgraph: Graph database for production environment

Slide 1

Slide 1 text

Dgraph: Graph database for production environment Manish R Jain, Dgraph Labs Gophercon India

Slide 2

Slide 2 text

Agenda
1. Introduction
2. Benchmarks
3. Design
4. Opinions on Go
5. Demo
6. Q & A

Slide 3

Slide 3 text

1. Introduction: Hi, Diggy! He's badass, and doesn't give a sh**!

Slide 4

Slide 4 text

What are graph databases?
Databases optimized for key-value lookups AND edge traversals (joins).
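A minimal sketch of how key-value lookups and edge traversals fit together, assuming an in-memory map keyed by (subject, predicate). The names `edgeKey`, `Graph`, `AddEdge` and `Traverse` are hypothetical illustrations, not Dgraph's storage layout or API.

```go
package main

import "fmt"

// edgeKey is a hypothetical key: one subject node plus one predicate (edge label).
type edgeKey struct {
	Subject   string
	Predicate string
}

// Graph maps each (subject, predicate) key to the list of destination node IDs.
// A single map lookup returns all outgoing edges for that predicate.
type Graph map[edgeKey][]string

// AddEdge appends dst to the list stored under (src, pred).
func (g Graph) AddEdge(src, pred, dst string) {
	k := edgeKey{Subject: src, Predicate: pred}
	g[k] = append(g[k], dst)
}

// Traverse follows the given predicates one hop at a time, starting from start.
// Each hop plays the role of a join: the result of one lookup becomes the
// input to the next. Duplicate destinations within a hop are dropped.
func (g Graph) Traverse(start string, preds ...string) []string {
	frontier := []string{start}
	for _, p := range preds {
		var next []string
		seen := make(map[string]bool)
		for _, node := range frontier {
			for _, dst := range g[edgeKey{Subject: node, Predicate: p}] {
				if !seen[dst] {
					seen[dst] = true
					next = append(next, dst)
				}
			}
		}
		frontier = next
	}
	return frontier
}

func main() {
	g := Graph{}
	g.AddEdge("alice", "friend", "bob")
	g.AddEdge("alice", "friend", "carol")
	g.AddEdge("bob", "friend", "dave")
	g.AddEdge("carol", "friend", "dave")

	// Friends of friends of alice: two edge traversals.
	fmt.Println(g.Traverse("alice", "friend", "friend"))
}
```

In a relational schema the same two-hop question would need a self-join on a friendship table; here each hop is one keyed lookup per node in the frontier.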

Slide 5

Slide 5 text

When should you use a graph database?
A relational schema with more than 5 foreign IDs is better handled by a graph database.
Ideal for newsfeeds, recommendation engines, context-based search, pattern and fraud detection, and more.

Slide 6

Slide 6 text

What is Dgraph?
Dgraph is an open source graph database built for web-scale production environments, written entirely in Go.
Sharded and distributed (distributed joins, filters and sorts)
Horizontally scalable
Automatic replication
Consistency (CP in CAP) via Raft
Highly available by design
Fault tolerant

Slide 7

Slide 7 text

Why build it?
Any company doing anything smart is using graphs, but no native, scalable solution exists.

Slide 8

Slide 8 text

2. Benchmarks

Slide 9

Slide 9 text

Neo4j data loading: Dgraph is 100x faster.

Slide 10

Slide 10 text

Neo4j query: Dgraph is 3x-6x faster on a read-write workload, and at least as fast on read-only.
Link to full benchmark code (https://github.com/dgraph-io/benchmarks/tree/master/data/neo4j)

Slide 11

Slide 11 text

Cayley (with Bolt on MacBook)
Cayley is a graph layer written in Go.
Loading 21M RDFs: Dgraph is 9.7x faster
Queries (Gremlin): Dgraph is 36.6x faster
Queries (MQL): Dgraph is 5x faster
Link to full benchmark code (https://github.com/ankurayadav/graphdb-benchmarks)

Slide 12

Slide 12 text

3. Design

Slide 13

Slide 13 text

Concepts

Slide 14

Slide 14 text

Concepts

Slide 15

Slide 15 text

Mutations

Slide 16

Slide 16 text

Atomic Consistency (aka Linearizability)
A read after a successful write is guaranteed to return that write (or a later one), irrespective of which replica is queried.

Slide 17

Slide 17 text

Life of a Query
Aka how we solve the problem of distributed joins, distributed filters and distributed sorts efficiently.

Slide 18

Slide 18 text

Life of a Branch

Slide 19

Slide 19 text

Life of a Branch

Slide 20

Slide 20 text

Life of a Branch

Slide 21

Slide 21 text

Life of a Branch

Slide 22

Slide 22 text

Life of a Query

Slide 23

Slide 23 text

Life of a Query
{
  # Find movies by Steven Spielberg named 'Indiana', sorted in desc order.
  steven(id: m.06pj8) {
    director.film @filter(anyof(name.en, "indiana")) (orderdesc: initial_release_date) {
      name.en
      initial_release_date
    }
  }
}

Slide 24

Slide 24 text

Life of a Query: Expand Out
{
  # Find movies by Steven Spielberg named 'Indiana', sorted in desc order.
  steven(id: m.06pj8) {
    director.film @filter(anyof(name.en, "indiana")) (orderdesc: initial_release_date) {
      name.en
      initial_release_date
    }
  }
}

Slide 25

Slide 25 text

Life of a Query: Apply Filters
{
  # Find movies by Steven Spielberg named 'Indiana', sorted in desc order.
  steven(id: m.06pj8) {
    director.film @filter(anyof(name.en, "indiana")) (orderdesc: initial_release_date) {
      name.en
      initial_release_date
    }
  }
}
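The filter step can be sketched as a term match. This is only an illustration under one assumption: that `anyof` keeps a value when at least one term of the query string appears among the value's terms, compared case-insensitively. The helpers `tokenize`, `anyOf` and `filterAnyOf` are hypothetical names, not Dgraph functions, and a real implementation would more likely consult an index than scan values.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// tokenize lowercases s and splits it into terms on any rune that is
// neither a letter nor a digit.
func tokenize(s string) []string {
	return strings.FieldsFunc(strings.ToLower(s), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// anyOf reports whether value contains at least one of the terms in query.
func anyOf(value, query string) bool {
	terms := make(map[string]bool)
	for _, t := range tokenize(value) {
		terms[t] = true
	}
	for _, q := range tokenize(query) {
		if terms[q] {
			return true
		}
	}
	return false
}

// filterAnyOf keeps only the names that match the query, preserving input order.
func filterAnyOf(names []string, query string) []string {
	var out []string
	for _, n := range names {
		if anyOf(n, query) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	// Sample values; "Jaws" is included only as a non-matching example.
	films := []string{
		"Jaws",
		"Indiana Jones and the Last Crusade",
		"Indiana Jones and the Temple of Doom",
	}
	for _, name := range filterAnyOf(films, "indiana") {
		fmt.Println(name)
	}
}
```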

Slide 26

Slide 26 text

Life of a Query: Sort and Paginate
{
  # Find movies by Steven Spielberg named 'Indiana', sorted in desc order.
  steven(id: m.06pj8) {
    director.film @filter(anyof(name.en, "indiana")) (orderdesc: initial_release_date) {
      name.en
      initial_release_date
    }
  }
}
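A rough sketch of the sort-and-paginate step on a single machine, assuming release dates are ISO 8601 strings so that string order matches date order. The `Film` type and the `offset` and `first` parameters are hypothetical; the query on the slide carries no pagination arguments, and a distributed sort involves more than this.

```go
package main

import (
	"fmt"
	"sort"
)

// Film holds the two fields the query asks for.
type Film struct {
	Name               string
	InitialReleaseDate string // YYYY-MM-DD, so lexical order equals date order
}

// sortDescByDate orders films newest first, in place.
func sortDescByDate(films []Film) {
	sort.SliceStable(films, func(i, j int) bool {
		return films[i].InitialReleaseDate > films[j].InitialReleaseDate
	})
}

// paginate returns at most first films starting at offset.
// It returns nil when the window falls outside the slice.
func paginate(films []Film, offset, first int) []Film {
	if offset < 0 {
		offset = 0
	}
	if offset >= len(films) || first <= 0 {
		return nil
	}
	end := offset + first
	if end > len(films) {
		end = len(films)
	}
	return films[offset:end]
}

func main() {
	films := []Film{
		{"Indiana Jones and the Temple of Doom", "1984-05-23"},
		{"Indiana Jones and the Kingdom of the Crystal Skull", "2008-05-18"},
		{"Indiana Jones and the Raiders of the Lost Ark", "1981-06-12"},
		{"Indiana Jones and the Last Crusade", "1989-05-24"},
	}
	sortDescByDate(films)
	// Show the first page of two results.
	for _, f := range paginate(films, 0, 2) {
		fmt.Println(f.InitialReleaseDate, f.Name)
	}
}
```

Sorting before paginating matters: slicing first would return an arbitrary window rather than the newest films.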

Slide 27

Slide 27 text

Life of a Query: Process Children
{
  # Find movies by Steven Spielberg named 'Indiana', sorted in desc order.
  steven(id: m.06pj8) {
    director.film @filter(anyof(name.en, "indiana")) (orderdesc: initial_release_date) {
      name.en
      initial_release_date
    }
  }
}

Slide 28

Slide 28 text

Life of a Query: ToJSON
{
  "steven": [
    {
      "director.film": [
        {
          "initial_release_date": "2008-05-18",
          "name.en": "Indiana Jones and the Kingdom of the Crystal Skull"
        },
        {
          "initial_release_date": "1989-05-24",
          "name.en": "Indiana Jones and the Last Crusade"
        },
        {
          "initial_release_date": "1984-05-23",
          "name.en": "Indiana Jones and the Temple of Doom"
        },
        {
          "initial_release_date": "1981-06-12",
          "name.en": "Indiana Jones and the Raiders of the Lost Ark"
        }
      ]
    }
  ]
}

Slide 29

Slide 29 text

4. Opinions on Go

Slide 30

Slide 30 text

Things we like about Go
Simplicity of code
Concurrency: Goroutines and Channels
Profiling via pprof
Benchmarks
RPC tracing via contexts
Advanced memory management using sync.Pool or Slice tricks
Gofmt
Compiler
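As one illustration of the sync.Pool point, here is a minimal sketch of reusing buffers across calls to reduce allocations. The `bufPool` variable and the `render` function are hypothetical examples, not code from Dgraph.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable *bytes.Buffer values. New is called only
// when the pool has nothing to offer.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// render builds "key=value" using a pooled buffer instead of allocating
// a fresh one on every call.
func render(key, value string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()            // a pooled buffer may still hold old contents
	defer bufPool.Put(buf) // return the buffer for later reuse

	buf.WriteString(key)
	buf.WriteByte('=')
	buf.WriteString(value)
	// String copies the bytes, so the result stays valid after Put.
	return buf.String()
}

func main() {
	var wg sync.WaitGroup
	results := make([]string, 4)
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = render(fmt.Sprintf("k%d", i), "v")
		}(i)
	}
	wg.Wait()
	fmt.Println(results)
}
```

The pool is safe for concurrent use, and the runtime may discard pooled objects at any time, so it suits scratch space rather than state that must persist.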

Slide 31

Slide 31 text

Things pushing us away from Go
Having to use Cgo, due to lack of good library support:
  CJK tokenizer, equivalent of ICU
  Well designed key-value store

Slide 32

Slide 32 text

Removing Cgo from Dgraph entirely
Elon Musk (@elonmusk), 12:05 AM - 18 Dec 2016: "Traffic is driving me nuts. Am going to build a tunnel boring machine and just start digging..."
Manish Rai Jain (@manishrjain), 11:39 AM - 26 Jan 2017: "Fed up with Cgo! Planning to write an LSM based KV store from scratch in #golang based on WiscKey paper and RocksDB. Got any advice? Ping me"

Slide 33

Slide 33 text

5. Demo
Try it out: curl https://get.dgraph.io -sSf | bash
Get Started with Dgraph (https://wiki.dgraph.io/Get_Started)

Slide 34

Slide 34 text

6. Q & A
Got questions? Find me and the Dgraph team during breaks, or knock on our doors. We're staying at the Hyatt.

Slide 35

Slide 35 text

Thank you
Manish R Jain, Dgraph Labs
Gophercon India
https://dgraph.io