Shortcuts Around the Mistakes I've Made Scaling MongoDB

Presentation given at MongoUK, September 2011

Theo Hultberg

April 14, 2012

Transcript

  1. What we do: We want to revolutionize the digital advertising industry by showing that there is more to ad analytics than click-through rates. (Wednesday, 21 September 2011)
  2. Assembling sessions: [diagram] exposure, pings and events ➔ session
  3. Crunching: [diagram] many sessions ➔ 42
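The two diagrams on slides 2 and 3 can be sketched in a few lines of Python. This is only an illustration: the field names (`session_id`, `type`, `ts`) and the metric (mean session duration) are assumptions, since the slides do not show the real document layout or what the crunching step computes.

```python
from collections import defaultdict


def assemble_sessions(fragments):
    """Group fragments (exposures, pings, events) by session, ordered by time."""
    sessions = defaultdict(list)
    for fragment in fragments:
        sessions[fragment["session_id"]].append(fragment)
    for parts in sessions.values():
        parts.sort(key=lambda f: f["ts"])
    return dict(sessions)


def mean_duration(sessions):
    """Crunch assembled sessions into one number (hypothetical metric)."""
    durations = [parts[-1]["ts"] - parts[0]["ts"] for parts in sessions.values()]
    return sum(durations) / len(durations)


# Hypothetical fragments arriving out of order from two sessions.
fragments = [
    {"session_id": "a", "type": "exposure", "ts": 0},
    {"session_id": "a", "type": "ping", "ts": 5},
    {"session_id": "b", "type": "exposure", "ts": 2},
    {"session_id": "a", "type": "event", "ts": 7},
    {"session_id": "b", "type": "ping", "ts": 12},
]
sessions = assemble_sessions(fragments)
```

The hard part named on the next slides is not visible here: in a live system nobody knows when the last fragment of a session has arrived.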
  4. That doesn’t sound so hard: We don’t know when sessions end. There’s a lot of data.
  5. That doesn’t sound so hard: We don’t know when sessions end. There’s a lot of data. It’s all done in (close to) real time.
  6. How we use MongoDB: “Virtual memory” to offload data while we wait for sessions to finish.
  7. How we use MongoDB: “Virtual memory” to offload data while we wait for sessions to finish. Short time storage (<48 hours) for batch jobs.
  8. How we use MongoDB: “Virtual memory” to offload data while we wait for sessions to finish. Short time storage (<48 hours) for batch jobs. Metrics storage.
  9. Why we use MongoDB: Schemalessness makes things so much easier; the data we collect changes as we come up with new ideas.
  10. Why we use MongoDB: Schemalessness makes things so much easier; the data we collect changes as we come up with new ideas. Sharding makes it possible to scale writes.
  11. Why we use MongoDB: Schemalessness makes things so much easier; the data we collect changes as we come up with new ideas. Sharding makes it possible to scale writes. Secondary indexes and rich query language are great features (for the metrics store).
  12. Why we use MongoDB: Schemalessness makes things so much easier; the data we collect changes as we come up with new ideas. Sharding makes it possible to scale writes. Secondary indexes and rich query language are great features (for the metrics store). It’s just… nice.
  13. 1st iteration (secondary indexes and updates): One document per session, update as new data comes along. Outcome: 1000% write lock.
  14. 2nd iteration (using scans for two-step assembling): Instead of updating, save each fragment, then scan over _id to assemble sessions.
  15. 2nd iteration (using scans for two-step assembling): Outcome: not as much lock, but still not great performance. We also realised we couldn’t remove data fast enough.
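One way to picture the second iteration: if every fragment's _id starts with its session id, all fragments of a session sit next to each other in _id order and can be read with a single range scan. The sketch below imitates that with a sorted list and `bisect`. The `session:sequence` id layout is an assumption, because the slides do not say how the _id values were built, and it only works if session ids never contain a colon.

```python
import bisect


def fragment_id(session_id, sequence):
    """Hypothetical _id layout: session id, a colon, a zero-padded sequence number."""
    return f"{session_id}:{sequence:06d}"


def scan_session(sorted_ids, session_id):
    """Return every id belonging to one session with a range scan.

    The range is ["<session>:", "<session>;"), where ';' is the character
    that follows ':' in ASCII, so it is the smallest exclusive upper bound.
    """
    low = bisect.bisect_left(sorted_ids, session_id + ":")
    high = bisect.bisect_left(sorted_ids, session_id + ";")
    return sorted_ids[low:high]


# Stand-in for an index ordered by _id.
ids = sorted(
    [
        fragment_id("s1", 0),
        fragment_id("s2", 0),
        fragment_id("s1", 1),
        fragment_id("s10", 0),
    ]
)
```

Against a real collection the same bounds would become a range condition on _id; that wiring is left out because it depends on the driver in use.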
  16. 3rd iteration (partitioning): We came up with the idea of partitioning the data by writing to a new collection every hour.
  17. 3rd iteration (partitioning): We came up with the idea of partitioning the data by writing to a new collection every hour. Outcome: lots of complicated code, lots of bugs, but we didn’t have to care about removing data.
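The hourly partitioning of the third iteration comes down to deriving a collection name from a timestamp, and then dropping whole collections once they are too old instead of removing documents one by one. A minimal sketch, assuming a `fragments_YYYYMMDDHH` naming scheme (the names are hypothetical) and the 48-hour retention mentioned on slide 7:

```python
from datetime import datetime, timedelta, timezone


def collection_for(ts, prefix="fragments"):
    """Name of the hourly collection a write at time `ts` should go to."""
    return f"{prefix}_{ts:%Y%m%d%H}"


def expired_collections(names, now, keep_hours=48, prefix="fragments"):
    """Collections that are entirely older than the retention window.

    The names are fixed-width and zero-padded, so string order equals
    chronological order and a plain comparison is enough.
    """
    cutoff = collection_for(now - timedelta(hours=keep_hours), prefix)
    return sorted(n for n in names if n.startswith(prefix + "_") and n < cutoff)


now = datetime(2011, 9, 21, 14, 30, tzinfo=timezone.utc)
existing = [
    "fragments_2011091913",
    "fragments_2011091914",
    "fragments_2011092114",
]
```

Each name returned by `expired_collections` could then be dropped in one operation, which is why the slide says removing data stopped being a concern.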
  18. 4th iteration (sharding): To get around the global write lock and get higher write performance we moved to a sharded cluster.
  19. 4th iteration (sharding): To get around the global write lock and get higher write performance we moved to a sharded cluster. Outcome: higher write performance, lots of problems, lots of ops time spent debugging.
  20. #6: SHARDING IS NOT A SILVER BULLET, and it’s buggy. If you can, avoid it.
  21. 5th iteration (moving things to separate clusters): We saw very different loads on the shards and realised we had databases with very different usage patterns, some that made autosharding not work. We moved these off the cluster.
  22. 5th iteration (moving things to separate clusters): We saw very different loads on the shards and realised we had databases with very different usage patterns, some that made autosharding not work. We moved these off the cluster. Outcome: a more balanced and stable cluster.
  23. 6th iteration (monster machines): We got new problems removing data and needed some room to breathe and think.
  24. 6th iteration (monster machines): We got new problems removing data and needed some room to breathe and think. Solution: upgraded the servers to High-Memory Quadruple Extra Large (with cheese).
  25. 6th iteration (monster machines): We got new problems removing data and needed some room to breathe and think. Solution: upgraded the servers to High-Memory Quadruple Extra Large (with cheese). — I
  26. 7th iteration (partitioning, again, and pre-chunking): We rewrote the database layer to write to a new database each day, and we created all chunks in advance. We also decreased the size of our documents by a lot.
  27. 7th iteration (partitioning, again, and pre-chunking): We rewrote the database layer to write to a new database each day, and we created all chunks in advance. We also decreased the size of our documents by a lot. Outcome: no more problems removing data.
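A rough sketch of what the seventh iteration needs on the client side: a database name per day, and a list of split points so that every chunk exists before the first write arrives. Both the `sessions_YYYYMMDD` naming and the assumption that the shard key starts with two hex digits are hypothetical; the slides give neither.

```python
from datetime import date


def database_for(day, prefix="sessions"):
    """Name of the database that holds all writes for one day (hypothetical scheme)."""
    return f"{prefix}_{day:%Y%m%d}"


def split_points(chunks):
    """Evenly spaced two-hex-digit boundaries dividing the key space into `chunks` ranges.

    Assumes the shard key begins with two uniformly distributed hex digits.
    """
    if not 1 <= chunks <= 256:
        raise ValueError("chunks must be between 1 and 256")
    step = 256 / chunks
    return [f"{int(i * step):02x}" for i in range(1, chunks)]


tomorrow_db = database_for(date(2011, 9, 22))
boundaries = split_points(4)
```

Each boundary would then be handed to the cluster's chunk-splitting command for the new day's collection ahead of time, so the balancer does not have to split and migrate chunks under write load.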
  28. #13: Smaller objects mean a smaller database, and a smaller database means LESS RAM NEEDED.
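The slides do not say how the documents were made smaller. One common lever is shortening field names, because every document stores its own field names. The sketch below uses compact JSON length as a rough stand-in for the stored size (the real on-disk format is BSON, so absolute numbers differ), and all field names are made up for the illustration.

```python
import json


def approx_size(doc):
    """Rough size proxy: byte length of the compact JSON encoding of `doc`."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


# The same hypothetical fragment with descriptive and with one-letter field names.
verbose = {
    "session_id": "abc123",
    "exposure_timestamp": 1316563200,
    "visible_duration_ms": 4200,
}
compact = {"s": "abc123", "e": 1316563200, "v": 4200}

saved_per_document = approx_size(verbose) - approx_size(compact)
```

The saving is paid once per document, so it grows with the number of fragments held in memory while sessions are waiting to finish.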
  29. Tips (safe mode): Run every Nth insert in safe mode. This will give you warnings when bad things happen, like failovers.
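The safe mode tip can be sketched as a small counter that decides, per insert, whether to ask the server for an acknowledgement. The class name and interface are hypothetical, and the actual insert call is left out because it depends on the driver and its version.

```python
import itertools


class SafeModeSampler:
    """Marks every Nth insert as one that should run in safe mode."""

    def __init__(self, every_nth):
        if every_nth < 1:
            raise ValueError("every_nth must be at least 1")
        self._every_nth = every_nth
        self._counter = itertools.count(1)

    def next_is_safe(self):
        """Return True for inserts number N, 2N, 3N, and so on."""
        return next(self._counter) % self._every_nth == 0


sampler = SafeModeSampler(every_nth=3)
decisions = [sampler.next_is_safe() for _ in range(6)]
# For each insert, the caller would pass the decision on as the driver's
# safe mode / acknowledged write setting (assumed wiring, not shown here).
```

Most writes stay fire-and-forget and fast, while the sampled ones surface errors such as a failover within at most N inserts.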
  30. Tips (avoid bulk inserts): Very dangerous if there’s a possibility of duplicate key errors.
  31. Tips (EC2): You have three copies of your data, do you really need EBS?
  32. Tips (EC2): You have three copies of your data, do you really need EBS? Instance store disks are included in the price and they have predictable performance.
  33. Tips (EC2): You have three copies of your data, do you really need EBS? Instance store disks are included in the price and they have predictable performance. m1.xlarge comes with 1.7 TB of storage.