Dgraph: Fast graph database written in Go

This talk was given at GopherCon Singapore.

Link to video: https://www.youtube.com/watch?v=cHXbYLNa0qQ

Manish R Jain

May 26, 2017

Transcript

  1. Dgraph: Fast graph database written in Go
    Manish R Jain
    Dgraph Labs

  2. 1. Introduction: Hi, Diggy!

  3. What are graph databases
    Optimized for key-value lookups AND edge traversals (joins).
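
To make the two access patterns concrete, here is a minimal sketch of a toy in-memory graph in Go. It is not how Dgraph is implemented; the `Node`, `Graph`, and `Traverse` names and the sample data are hypothetical and exist only for illustration. A property read is a plain key-value lookup, and a multi-hop traversal plays the role of a join.

```go
package main

import "fmt"

// Node is a hypothetical record in a toy in-memory graph.
type Node struct {
	Props map[string]string   // scalar properties, for key-value lookups
	Edges map[string][]string // predicate -> IDs of target nodes
}

// Graph maps node IDs to nodes.
type Graph map[string]*Node

// Traverse follows the given predicates hop by hop from start and returns
// the IDs reached after the last hop, deduplicated, in first-seen order.
func (g Graph) Traverse(start string, predicates ...string) []string {
	frontier := []string{start}
	for _, p := range predicates {
		seen := make(map[string]bool)
		var next []string
		for _, id := range frontier {
			node, ok := g[id]
			if !ok {
				continue // dangling edge: the target node is not stored
			}
			for _, target := range node.Edges[p] {
				if !seen[target] {
					seen[target] = true
					next = append(next, target)
				}
			}
		}
		frontier = next
	}
	return frontier
}

// exampleGraph builds made-up sample data.
func exampleGraph() Graph {
	return Graph{
		"alice": {
			Props: map[string]string{"name": "Alice"},
			Edges: map[string][]string{"friend": {"bob", "carol"}},
		},
		"bob": {
			Props: map[string]string{"name": "Bob"},
			Edges: map[string][]string{"lives_in": {"singapore"}},
		},
		"carol": {
			Props: map[string]string{"name": "Carol"},
			Edges: map[string][]string{"lives_in": {"singapore"}},
		},
		"singapore": {Props: map[string]string{"name": "Singapore"}},
	}
}

func main() {
	g := exampleGraph()
	fmt.Println(g["alice"].Props["name"])                  // key-value lookup
	fmt.Println(g.Traverse("alice", "friend", "lives_in")) // two-hop traversal (a join)
}
```

In a relational store the second call would typically be a self-join on a friendship table followed by a join on a location table; here each hop is a map lookup.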

  4. What is Dgraph?
    Dgraph is an open source graph database, written entirely in Go and built for
    web-scale production environments.
    Fast
    Sharded and Distributed (Distributed Joins, Filters and Sorts)
    Horizontally scalable
    Consistent Replication via Raft
    Highly Available by design
    Fault tolerant

  5. Why build it?
    Any company doing anything smart is using graphs.
    No native scalable solution.

  6. Concepts
    SUBJECT - predicate -> OBJECT
    // Facts
    .
    .
    // Half facts
    .
    SUBJECT - predicate -> Value
    "Singapore" .
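
The two shapes on this slide (subject to object, and subject to value) can be sketched as a small Go type. This is an illustrative model only: the `Triple` type, its rendering, and the IDs `gophercon`, `held_in`, `city1`, and `name` are invented here and are not taken from the slide; only the literal "Singapore" appears in the original.

```go
package main

import "fmt"

// Triple is one statement. Exactly one of ObjectID or Value is expected
// to be set (an assumption of this sketch, not enforced here).
type Triple struct {
	Subject   string
	Predicate string
	ObjectID  string // set when the object is another node
	Value     string // set when the object is a literal value
}

// IsEdge reports whether the triple points at another node.
func (t Triple) IsEdge() bool { return t.ObjectID != "" }

// String renders the triple in an N-Triples-like form, ending with " .".
func (t Triple) String() string {
	if t.IsEdge() {
		return fmt.Sprintf("<%s> <%s> <%s> .", t.Subject, t.Predicate, t.ObjectID)
	}
	return fmt.Sprintf("<%s> <%s> %q .", t.Subject, t.Predicate, t.Value)
}

func main() {
	// SUBJECT - predicate -> OBJECT
	fmt.Println(Triple{Subject: "gophercon", Predicate: "held_in", ObjectID: "city1"})
	// SUBJECT - predicate -> Value
	fmt.Println(Triple{Subject: "city1", Predicate: "name", Value: "Singapore"})
}
```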

  7. Reception
    v0.1 release in Dec '15
    3200 GitHub stars
    32 contributors
    Next release: v0.8
    First page of HN multiple times.
    https://news.ycombinator.com/item?id=11322444

  8. 2. Benchmarks

  9. Neo4j
    Neo4j data loading
    Dgraph is 100x faster.
    Neo4j query
    Dgraph is 3x-6x faster on a read-write workload. At least as fast on read-only.
    Link to full benchmark code (https://github.com/dgraph-io/benchmarks/tree/master/data/neo4j)

  10. Cayley (with Bolt on MacBook)
    Cayley is a graph layer written in Go.
    Loading 21M RDFs
    Dgraph is 9.7x faster
    Queries
    Dgraph is 5x - 36.6x faster
    Link to full benchmark code (https://github.com/ankurayadav/graphdb-benchmarks)

  11. 3. Using a graph database

  12. Graph-y questions (1/2)
    Are they overkill?
    Are they robust?
    Are they scalable?

  13. Graph-y questions (2/2)
    Are they overkill? No
    Are they robust? Yes
    Are they scalable? Yes

  14. Your SQL DB is slowing you down (1/3)
    SQL is like C; a graph query language is like Go or Python.
    Just as high-level languages let you express logic faster,
    databases that allow more complex queries let you iterate on your application faster.
    They achieve this by cutting down your application code by at least half.

  15. Your SQL DB is slowing you down (2/3)
    What SQL forces you to do:
    Because SQL does less, you need to do more.
    Maintain a strict schema.
    Duplicate data to avoid slow joins.
    Pre-compute counts for foreign tables to allow efficient sorting.

  16. Your SQL DB is slowing you down (3/3)
    What Dgraph does:
    Flexible sparse schema, with real-time modifiable data types.
    Join is a single lookup away, so no data duplication.
    Simplify schema.
    Any counts can be done cheaply in real time.

  17. 4. Demo: Run Stack Overflow on Dgraph

  18. User Schema
    SQL | Dgraph
    --- | ------
    Reputation | Reputation
    CreationDate | CreationDate
    DisplayName | DisplayName
    LastAccessDate | LastAccessDate
    Location | Location
    AboutMe | AboutMe
    Age | Age
    Views |
    UpVotes |
    DownVotes |

  19. Versioning (SQL, Dgraph)
    SQL | Dgraph
    --- | ------
    TypeId | Type
    PostId | Post (point to Post)
    CreationDate | CreationDate
    Author | Author
    Text | Text

  20. Post
    SQL | Dgraph
    --- | ------
    TypeId | Type
    ParentId | Has.Answer
    AcceptedAnswerId | Chosen.Answer
    Title (duplicate of versioning) | Title (point to Version)
    Body (duplicate of versioning) | Body (point to Version)
    Tags (duplicate of versioning) | Tags (point to Version)
    OwnerUserId | Owner (point to User)
    Score | Score
    LastEditorUserId |
    LastEditDate |
    LastActivityDate |
    CommentCount |
    AnswerCount |
    FavoriteCount |
    CreationDate |

  21. Comment, Vote
    SQL | Dgraph
    --- | ------
    PostId (Vote/Comment -> Post) | (New edge from Post -> Vote/Comment)
    Timestamp | Timestamp
    Author | Author
    VoteType (for Vote) | Score (for both Vote and Comment)
    Text (for Comment) | Text (for Comment)
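
The edge reversal in this table (Post -> Vote/Comment instead of a PostId foreign key) can be sketched with Go pointers. This is a simplified illustration, not the schema used in the demo: `Reaction`, `VoteTotal`, and `sampleQuestion` are hypothetical names, timestamps are omitted, and `Title` is stored inline rather than pointing to a Version.

```go
package main

import "fmt"

// User keeps only two of the fields listed on the User Schema slide.
type User struct {
	DisplayName string
	Reputation  int
}

// Reaction models both Vote and Comment, which share Author and Score
// in the Dgraph column; Text stays empty for votes.
type Reaction struct {
	Author *User
	Score  int
	Text   string
}

// Post owns its outgoing edges, so nothing points back with a PostId.
type Post struct {
	Title    string
	Owner    *User       // Owner (point to User)
	Answers  []*Post     // Has.Answer
	Chosen   *Post       // Chosen.Answer
	Comments []*Reaction // Post -> Comment
	Votes    []*Reaction // Post -> Vote
}

// VoteTotal sums vote scores by walking the Post -> Vote edges.
func (p *Post) VoteTotal() int {
	total := 0
	for _, v := range p.Votes {
		total += v.Score
	}
	return total
}

// sampleQuestion builds made-up data for the example.
func sampleQuestion() *Post {
	asker := &User{DisplayName: "asker", Reputation: 10}
	voter := &User{DisplayName: "voter", Reputation: 500}
	answer := &Post{Owner: voter}
	return &Post{
		Title:    "How do I traverse a graph?",
		Owner:    asker,
		Answers:  []*Post{answer},
		Chosen:   answer,
		Comments: []*Reaction{{Author: voter, Text: "Good question."}},
		Votes:    []*Reaction{{Author: voter, Score: 1}, {Author: asker, Score: 1}, {Author: voter, Score: -1}},
	}
}

func main() {
	q := sampleQuestion()
	fmt.Println(q.Owner.DisplayName, len(q.Answers), len(q.Comments), q.VoteTotal())
}
```

Columns such as CommentCount and AnswerCount, which have no Dgraph counterpart in the Post table, correspond here to `len(q.Comments)` and `len(q.Answers)`.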

  22. Dgraph Schema

  23. Live Demo

  24. Query 1: Top questions

  26. Query 2: Question Page

  27. 5. Final Thoughts

  28. When to use Dgraph, when SQL
    SQL has a great use case, which is to provide ACID transactions.
    It is a great solution for storing financial data.
    For everything else, I think a well-designed graph database is better suited.
    It cuts down on the amount of coding effort required, using the power of graphs to run
    complex queries.

  29. Thank you
    Manish R Jain
    Dgraph Labs
    [email protected]
    https://dgraph.io
