
It's getting faster: from relational databases to MongoDB


This is the English translation of a talk I gave at the JAX con 2012 (http://www.jax.de/2012).

At Fraunhofer IGD we have been using relational databases to store large 3D city models. Over time, we found that relational databases do not perform well enough for this kind of data. We therefore looked for better solutions and eventually found MongoDB.

However, like most NoSQL databases, MongoDB does not support transactions. So, we implemented the Multiversion Concurrency Control (MVCC) paradigm on top of MongoDB to fill that gap. Our solution works completely lock-free and is almost as fast as plain MongoDB.

This talk gives some details about our application and offers insights into MongoMVCC's implementation. The library is freely available on GitHub:
https://github.com/igd-geo/mongomvcc
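
To make the MVCC idea described above more concrete, here is a minimal in-memory sketch in Python. It is not MongoMVCC's code or API: the names `Store`, `Snapshot`, `Commit`, `ConflictError` and `compare_and_set_head` are all hypothetical, and a plain dictionary stands in for the database. It assumes that document versions are only ever appended, that each commit carries an index mapping document IDs to versions, and that a commit is published by moving the branch head only if the head is still where the client found it.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Commit:
    """A commit: a parent pointer plus an index of document versions."""
    cid: int
    parent: Optional[int]
    index: Dict[str, int]  # document id -> id of a stored document version


class ConflictError(Exception):
    """Raised when the branch head moved after the client took its snapshot."""


class Store:
    """Hypothetical in-memory stand-in for the database (illustration only)."""

    def __init__(self) -> None:
        self.versions: Dict[int, dict] = {}  # append-only document versions
        self.commits: Dict[int, Commit] = {0: Commit(0, None, {})}
        self.heads: Dict[str, int] = {"master": 0}
        self._next_id = 1

    def new_id(self) -> int:
        nid = self._next_id
        self._next_id += 1
        return nid

    def checkout(self, branch: str = "master") -> "Snapshot":
        return Snapshot(self, branch, self.commits[self.heads[branch]])

    def compare_and_set_head(self, branch: str, expected: int, new: int) -> bool:
        # Assumption: a real database would perform this check-and-update
        # as a single atomic conditional write instead of two Python steps.
        if self.heads[branch] != expected:
            return False
        self.heads[branch] = new
        return True


class Snapshot:
    """A client's private view: the base commit's index plus local changes."""

    def __init__(self, store: Store, branch: str, base: Commit) -> None:
        self.store, self.branch, self.base = store, branch, base
        self.index = dict(base.index)  # private copy, invisible to others

    def put(self, doc_id: str, doc: dict) -> None:
        vid = self.store.new_id()
        self.store.versions[vid] = dict(doc)  # old versions are never overwritten
        self.index[doc_id] = vid

    def get(self, doc_id: str) -> dict:
        return self.store.versions[self.index[doc_id]]

    def commit(self) -> Commit:
        new = Commit(self.store.new_id(), self.base.cid, dict(self.index))
        # Store the commit first so the head never points at a missing commit;
        # on conflict it simply stays unreferenced.
        self.store.commits[new.cid] = new
        if not self.store.compare_and_set_head(self.branch, self.base.cid, new.cid):
            raise ConflictError(f"{self.branch} has moved past commit {self.base.cid}")
        self.base = new
        return new


if __name__ == "__main__":
    store = Store()
    client1, client2 = store.checkout(), store.checkout()
    client1.put("A", {"street": "Main St", "number": 1})
    client1.commit()
    client2.put("A", {"street": "Main St", "number": 2})
    try:
        client2.commit()
    except ConflictError as err:
        print("conflict:", err)
```

In this sketch, the second client's uncommitted version of document A is stored but never becomes visible, because no published commit references it. The client would have to check out the new head and retry or merge.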

Michel Krämer

April 17, 2012



Transcript

  1. It's getting faster: moving from relational databases to MongoDB
    Michel Krämer


  2. 3D city models


  3. Owner
    Street
    Number
    Metadata
    3D geometry


  4. Heterogeneous
    data sources


  5. Municipality
    Urban planners
    Utility
    Environmental
    Emergency
    Citizens


  6. [Diagram: data model relating Layer, Feature, Geometry and Metadata through 1:n and n:m relations, ...]


  7. [Diagram: geometry data model relating Geometry, Mesh, Face, Vertex, Color and Texture coord. through 1:n relations, ...]


  8. [Diagram: the same geometry data model (Geometry, Mesh, Face, Vertex, Color, Texture coord., ...), now with the label "Blob" added]


  9. Throughput (relatively): MySQL, PostgreSQL, Oracle


  10. Downtime
    during backups


  11. Downtime
    when scaling


  12. Hibernate
    Pro Contra


  13. [Diagram: a series of Building entries, each grouping its own Geometry, Metadata, ...]


  14. [Slide filled with the word "NoSQL" repeated many times; MongoDB and CouchDB appear among the entries]


  15. MongoDB CouchDB


  16. MongoDB: fast, stable, community, scalable, ad-hoc queries, ...
    CouchDB: fast, stable, community, scalable, ad-hoc queries, ...


  17. [Bar chart: read throughput in objects/s, MySQL vs. MongoDB; axis ticks at 200, 400, 600 and 800]


  18. [Bar chart: write throughput in objects/s, MySQL vs. MongoDB; axis ticks at 125, 250, 375 and 500]


  19. MongoDB
    update A
    update B


  20. MongoDB
    update A
    update B
    update A
    update C


  21. MongoDB
    update A
    update B
    update A
    update C
    ?


  22. "MongoDB does not use [...] transactions with rollback, as it is designed to be lightweight and fast [...]. By keeping transaction support extremely simple, performance is enhanced [...]."
    (MongoDB Developer FAQ)


  23. snapshot
    client 1 ...
    snapshot
    client 2 ...


  24. CouchDB's approach
    doc A,
    rev 1
    client 1
    doc A,
    rev 2
    client 2


  25. Git's approach
    File A
    C1


  26. Git's approach
    File B
    File A
    C1
    C2


  27. Our approach
    snapshot (index)
    client 1 ...
    snapshot (index)
    client 2 ...
    C1 C2
    C1 C2


  28. doc A
    C1
    index
    client 1


  29. doc B
    doc A
    C1
    index
    client 1


  30. doc B
    doc A
    C1
    C2
    index
    client 1


  31. C1
    C2
    client 1 client 2


  32. C1
    C2
    client 1 client 2
    C3


  33. C1
    C2
    client 1 client 2
    C3
    conflict!


  34. C1
    C2
    client 1 client 2
    C3


  35. C1
    C2
    client 1 client 2
    C3 C4


  36. C1
    C2
    C4
    C3
    C5
    C6
    master
    project a
    project b
    Named branches


  37. C1
    C2
    C3
    Accessing the history
    client 1
    client 2


  38. Downside
    Performance
    Memory
    API


  39. MongoDB + MVCC = MongoMVCC
    https://github.com/igd-geo/mongomvcc


  40. Michel Krämer
    Fraunhofer IGD
    @michelkraemer
    +Michel Krämer
