It's getting faster: from relational databases to MongoDB

This is the English translation of a talk I gave at the JAX 2012 conference (http://www.jax.de/2012).

At Fraunhofer IGD we use relational databases to store large 3D city models. Over time, we found that the performance of relational databases is not good enough for this kind of data. We therefore looked for better solutions and eventually found MongoDB.

However, like most NoSQL databases, MongoDB does not support transactions. So, we implemented the Multiversion Concurrency Control (MVCC) paradigm on top of MongoDB to fill that gap. Our solution works completely lock-free and is almost as fast as plain MongoDB.
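To make the idea concrete, here is a minimal in-memory sketch of multiversion concurrency control in Python. It is not MongoMVCC's API or implementation: the names `Store`, `Snapshot`, and `ConflictError` are hypothetical, and a plain dictionary stands in for the database. Each client works on its own snapshot, writes create new document versions instead of overwriting old ones, and a commit is rejected if another client has committed a change to the same document in the meantime. The sketch is single-threaded, so it illustrates the versioning and conflict detection only, not the lock-free property.

```python
import itertools


class ConflictError(Exception):
    """Raised when two clients changed the same document concurrently."""


class Store:
    """Hypothetical stand-in for the database: keeps every committed version."""

    def __init__(self):
        self._versions = {}  # document id -> list of (commit id, value)
        self._commit_ids = itertools.count(1)
        self.head = 0        # id of the latest commit (0 = nothing committed)

    def snapshot(self):
        return Snapshot(self, self.head)

    def latest_commit_of(self, doc_id):
        versions = self._versions.get(doc_id, [])
        return versions[-1][0] if versions else 0

    def read(self, doc_id, at_commit):
        # Return the newest version that is not newer than the snapshot.
        for commit_id, value in reversed(self._versions.get(doc_id, [])):
            if commit_id <= at_commit:
                return value
        return None

    def apply(self, base_commit, changes):
        # Validate first, so that a rejected commit leaves no trace.
        for doc_id in changes:
            if self.latest_commit_of(doc_id) > base_commit:
                raise ConflictError(doc_id)
        commit_id = next(self._commit_ids)
        for doc_id, value in changes.items():
            self._versions.setdefault(doc_id, []).append((commit_id, value))
        self.head = commit_id
        return commit_id


class Snapshot:
    """A client's private view: reads are isolated, writes are buffered."""

    def __init__(self, store, base_commit):
        self._store = store
        self.base_commit = base_commit
        self._changes = {}

    def get(self, doc_id):
        if doc_id in self._changes:
            return self._changes[doc_id]
        return self._store.read(doc_id, self.base_commit)

    def put(self, doc_id, value):
        self._changes[doc_id] = value

    def commit(self):
        self.base_commit = self._store.apply(self.base_commit, self._changes)
        self._changes = {}
        return self.base_commit


store = Store()
setup = store.snapshot()
setup.put("A", 1)
setup.commit()

# Two clients start from the same state and both increment document A.
client1 = store.snapshot()
client2 = store.snapshot()
client1.put("A", client1.get("A") + 1)
client2.put("A", client2.get("A") + 1)
client1.commit()

try:
    client2.commit()
    conflict_detected = False
except ConflictError:
    conflict_detected = True
```

In this sketch the second client's commit is rejected instead of silently overwriting the first client's update, and the old version of document A remains readable through any snapshot taken before the change.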

This talk gives some details about our application as well as insights into MongoMVCC's implementation. The library is available for free on GitHub:
https://github.com/igd-geo/mongomvcc

Michel Krämer

April 17, 2012

Transcript

  1. It's getting faster: moving from relational databases to MongoDB. Michel Krämer

  2. What data?

  3. 3D city models

  4. Owner Street Number Metadata 3D geometry

  5. Heterogeneous data sources

  6. Municipality Urban planners Utility Environmental Emergency Citizens

  7. Why?

  8. Diagram: Layer, Feature, Geometry, Metadata (linked by 1:n and n:m relationships) ...

  9. Diagram: Mesh, Face, Geometry, Vertex, Color, Texture coord. (linked by 1:n relationships) ...

  10. Diagram: Mesh, Face, Geometry, Vertex, Color, Texture coord. (linked by 1:n relationships) ... Blob

  11. Chart: relative throughput of MySQL, PostgreSQL, and Oracle

  12. Downtime during backups

  13. Downtime when scaling

  14. Hibernate: pros and cons

  15. Documents

  16. Geometry Metadata ... Building Geometry Metadata ... Building ...

  17. NoSQL (the word repeated across the whole slide), MongoDB, CouchDB

  18. MongoDB CouchDB

  19. MongoDB: fast, stable, community, scalable, ad-hoc queries, ... CouchDB: fast, stable, community, scalable, ad-hoc queries, ...

  20. Chart: read throughput in objects/s, MySQL vs. MongoDB (axis ticks 200, 400, 600, 800)

  21. Chart: write throughput in objects/s, MySQL vs. MongoDB (axis ticks 125, 250, 375, 500)

  22. MongoDB

  23. MongoDB update A update B

  24. MongoDB update A update B update A update C

  25. MongoDB update A update B update A update C ?

  26. 1 + 1 = 2

  27. "MongoDB does not use [...] transactions with rollback, as it is designed to be lightweight and fast [...]. By keeping transaction support extremely simple, performance is enhanced [...]." (MongoDB Developer FAQ)

  28. MVCC

  29. snapshot client 1 ... snapshot client 2 ...

  30. CouchDB's approach: doc A, rev 1, client 1; doc A, rev 2, client 2

  31. Git's approach File A C1

  32. Git's approach File B File A C1 C2

  33. Our approach: snapshot (index) client 1 ... snapshot (index) client 2 ... C1 C2 C1 C2

  34. doc A C1 index client 1

  35. doc B doc A C1 index client 1

  36. doc B doc A C1 C2 index client 1

  37. C1 C2 client 1 client 2

  38. C1 C2 client 1 client 2 C3

  39. C1 C2 client 1 client 2 C3 conflict!

  40. C1 C2 client 1 client 2 C3

  41. C1 C2 client 1 client 2 C3 C4

  42. Named branches: C1 C2 C4 C3 C5 C6, master, project a, project b

  43. C1 C2 C3 Accessing the history client 1 client 2

  44. Downsides: performance, memory, API

  45. MongoDB MongoMVCC MVCC + = https://github.com/igd-geo/mongomvcc

  46. Michel Krämer Fraunhofer IGD michel.kraemer@igd.fraunhofer.de @michelkraemer +Michel Krämer
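
The commit graph, named branches, and history access outlined on slides 31 to 43 can be illustrated with a small Python sketch. This is an assumption-laden illustration rather than MongoMVCC's actual data model or API: the names `Commit` and `Repository` are hypothetical, everything lives in memory, and each commit's index maps a document id directly to its content, whereas a real implementation would more likely store references to document versions.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Commit:
    cid: str
    parent: Optional["Commit"]
    index: Dict[str, str]  # document id -> document content at this commit


class Repository:
    """Hypothetical in-memory commit graph with named branches."""

    def __init__(self):
        root = Commit("C1", None, {})
        self.branches = {"master": root}
        self._counter = 1

    def commit(self, branch, changes):
        head = self.branches[branch]
        self._counter += 1
        # The new index is a copy of the parent's index plus the changes,
        # so older commits are never modified.
        new_index = {**head.index, **changes}
        new_head = Commit(f"C{self._counter}", head, new_index)
        self.branches[branch] = new_head
        return new_head

    def create_branch(self, name, source):
        # A branch is just a name pointing at an existing commit.
        self.branches[name] = self.branches[source]

    def history(self, branch):
        # Walk the parent pointers from the branch head back to the root.
        ids = []
        commit = self.branches[branch]
        while commit is not None:
            ids.append(commit.cid)
            commit = commit.parent
        return ids


repo = Repository()
repo.commit("master", {"A": "building v1"})     # C2
repo.create_branch("project a", "master")
repo.commit("project a", {"A": "building v2"})  # C3
repo.commit("master", {"B": "street v1"})       # C4
```

Because every commit keeps its own index and a pointer to its parent, `repo.history("project a")` walks back through C3, C2, and C1, while `master` continues independently with C4.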