Shortcuts Around the Mistakes I've Made Scaling MongoDB

Presentation given at MongoUK, September 2011

Theo Hultberg

April 14, 2012

Transcript

  1. SHORTCUTS AROUND THE MISTAKES I’VE MADE SCALING MONGODB
    Theo, Chief Architect at
    Wednesday, 21 September 2011
  2. What we do
    We want to revolutionize the digital advertising industry by showing that there is more to ad analytics than click-through rates.
  3. Ads
  4. Data
  5. Assembling sessions
    exposure, ping, ping, ping, ping, ping, event, event, ping ➔ session
  6. Crunching
    session (×13) ➔ 42
  7. Reports
  8. What we do
    Track ads, make pretty reports.
  9. That doesn’t sound so hard
  10. That doesn’t sound so hard
    We don’t know when sessions end
  11. That doesn’t sound so hard
    We don’t know when sessions end
    There’s a lot of data
  12. That doesn’t sound so hard
    We don’t know when sessions end
    There’s a lot of data
    It’s all done in (close to) real time
  13. Numbers
  14. Numbers
    40 GB of data
  15. Numbers
    40 GB of data
    50 million documents
  16. Numbers
    40 GB of data
    50 million documents
    per day
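
The figures on the "Numbers" slides imply a rough average document size and write rate. The calculation below is a back-of-the-envelope sketch: it assumes that "40 GB" means 40 × 10^9 bytes and that both figures are per day, neither of which the slides state explicitly.

```javascript
// Back-of-the-envelope figures derived from the slide numbers.
// Assumptions (not stated in the talk): 1 GB = 1e9 bytes, and both
// the data volume and the document count are per day.
const bytesPerDay = 40e9;
const docsPerDay = 50e6;
const secondsPerDay = 24 * 60 * 60;

const avgDocBytes = bytesPerDay / docsPerDay;
const docsPerSecond = docsPerDay / secondsPerDay;
const bytesPerSecond = bytesPerDay / secondsPerDay;

console.log(`average document size: ${avgDocBytes} bytes`);
console.log(`average write rate: ${Math.round(docsPerSecond)} documents/s`);
console.log(`average throughput: ${(bytesPerSecond / 1e6).toFixed(2)} MB/s`);
```

These are daily averages; the talk gives no peak figures, so real bursts may be considerably higher.
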
  17. How we use MongoDB
  18. How we use MongoDB
    “Virtual memory” to offload data while we wait for sessions to finish
  19. How we use MongoDB
    “Virtual memory” to offload data while we wait for sessions to finish
    Short-term storage (<48 hours) for batch jobs
  20. How we use MongoDB
    “Virtual memory” to offload data while we wait for sessions to finish
    Short-term storage (<48 hours) for batch jobs
    Metrics storage
  21. Why we use MongoDB
  22. Why we use MongoDB
    Schemalessness makes things so much easier: the data we collect changes as we come up with new ideas
  23. Why we use MongoDB
    Schemalessness makes things so much easier: the data we collect changes as we come up with new ideas
    Sharding makes it possible to scale writes
  24. Why we use MongoDB
    Schemalessness makes things so much easier: the data we collect changes as we come up with new ideas
    Sharding makes it possible to scale writes
    Secondary indexes and a rich query language are great features (for the metrics store)
  25. Why we use MongoDB
    Schemalessness makes things so much easier: the data we collect changes as we come up with new ideas
    Sharding makes it possible to scale writes
    Secondary indexes and a rich query language are great features (for the metrics store)
    It’s just… nice
  26. Btw.
  27. Btw.
    We use JRuby, it’s awesome
  28. A story in 7 iterations
  29. secondary indexes and updates (1st iteration)
  30. secondary indexes and updates (1st iteration)
    One document per session, update as new data comes along
    Outcome: 1000% write lock
  31. #1 Everything is about working around the GLOBAL WRITE LOCK
  32. MongoDB 2.0.0
    db.coll.update({_id: "xyz"}, {$inc: {x: 1}}, true)
    db.coll.update({_id: "abc"}, {$push: {x: "..."}}, true)
  33. MongoDB 1.8.1
    db.coll.update({_id: "xyz"}, {$inc: {x: 1}}, true)
    db.coll.update({_id: "abc"}, {$push: {x: "..."}}, true)
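
The two `update` calls on the slides above pass `true` as the third argument, which makes them upserts: the document is created if it does not exist yet. A minimal in-memory sketch of that behaviour follows. The `upsert` helper and the `Map` standing in for a collection are illustrative only, not MongoDB code.

```javascript
// Minimal in-memory model of the two upserts on the slides.
// `store` stands in for a collection keyed by _id; this illustrates
// the semantics only and does not talk to MongoDB.
function upsert(store, id, update) {
  // An upsert creates the document when nothing matches the query.
  const doc = store.get(id) || { _id: id };
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    doc[field] = (doc[field] || 0) + amount; // $inc treats a missing field as 0
  }
  for (const [field, value] of Object.entries(update.$push || {})) {
    if (doc[field] === undefined) doc[field] = []; // $push creates a missing array
    doc[field].push(value);
  }
  store.set(id, doc);
  return doc;
}

const sessions = new Map();
upsert(sessions, "xyz", { $inc: { x: 1 } });
upsert(sessions, "xyz", { $inc: { x: 1 } });
upsert(sessions, "abc", { $push: { x: "ping" } });
upsert(sessions, "abc", { $push: { x: "event" } });

console.log(JSON.stringify(sessions.get("xyz"))); // {"_id":"xyz","x":2}
console.log(JSON.stringify(sessions.get("abc"))); // {"_id":"abc","x":["ping","event"]}
```

With one document per session, every incoming fragment becomes one such update against an existing document, which is why this design is so exposed to write locking.
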
  34. using scans for two step assembling (2nd iteration)
    Instead of updating, save each fragment, then scan over _id to assemble sessions
  35. using scans for two step assembling (2nd iteration)
    Outcome: not as much lock, but still not great performance. We also realised we couldn’t remove data fast enough
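
The second iteration relies on the `_id` ordering to bring the fragments of one session next to each other. The talk does not show the key layout, so the sketch below assumes a hypothetical `"<sessionId>:<sequence>"` format and uses a sorted array in place of a scan over the `_id` index.

```javascript
// Sketch of "save each fragment, then scan over _id".
// Hypothetical key layout: "<sessionId>:<zero-padded sequence number>".
function fragmentId(sessionId, seq) {
  return `${sessionId}:${String(seq).padStart(6, "0")}`;
}

// Group fragments into sessions with one pass in _id order, which is
// the order a scan over the _id index would return them in.
function assembleSessions(fragments) {
  const sorted = [...fragments].sort((a, b) =>
    a._id < b._id ? -1 : a._id > b._id ? 1 : 0
  );
  const sessions = [];
  let current = null;
  for (const fragment of sorted) {
    const sessionId = fragment._id.split(":")[0];
    if (current === null || current.sessionId !== sessionId) {
      current = { sessionId, fragments: [] };
      sessions.push(current);
    }
    current.fragments.push(fragment.type);
  }
  return sessions;
}

const fragments = [
  { _id: fragmentId("s2", 0), type: "exposure" },
  { _id: fragmentId("s1", 1), type: "ping" },
  { _id: fragmentId("s1", 0), type: "exposure" },
  { _id: fragmentId("s1", 2), type: "event" },
  { _id: fragmentId("s2", 1), type: "ping" },
];

const assembled = assembleSessions(fragments);
console.log(JSON.stringify(assembled));
```

Because each fragment is a plain insert, nothing is updated in place; the cost moves to the scan and to deleting the fragments afterwards, which matches the outcome reported on the slide.
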
  36. #2 Everything is about working around the GLOBAL WRITE LOCK
  37. #3 Give a lot of thought to your PRIMARY KEY
  38. partitioning (3rd iteration)
  39. partitioning (3rd iteration)
    We came up with the idea of partitioning the data by writing to a new collection every hour
  40. partitioning (3rd iteration)
    We came up with the idea of partitioning the data by writing to a new collection every hour
    Outcome: lots of complicated code, lots of bugs, but we didn’t have to care about removing data
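
Writing to a new collection every hour means old data can be discarded by dropping whole collections instead of removing documents. A minimal sketch of the bookkeeping is given below; the `sessions_YYYYMMDDHH` naming scheme (in UTC) and the helper names are assumptions, not the authors' code.

```javascript
// Sketch of time-based partitioning: one collection per hour.
// The naming scheme "sessions_YYYYMMDDHH" (UTC) is an assumption.
function pad2(n) {
  return String(n).padStart(2, "0");
}

function collectionFor(date, prefix = "sessions") {
  return (
    `${prefix}_${date.getUTCFullYear()}${pad2(date.getUTCMonth() + 1)}` +
    `${pad2(date.getUTCDate())}${pad2(date.getUTCHours())}`
  );
}

// Collections that lie entirely before the retention window.
// The names sort chronologically, so a string comparison is enough.
function expiredCollections(names, now, retentionHours, prefix = "sessions") {
  const cutoff = new Date(now.getTime() - retentionHours * 3600 * 1000);
  const oldestKept = collectionFor(cutoff, prefix);
  return names.filter(
    (name) => name.startsWith(`${prefix}_`) && name < oldestKept
  );
}

const now = new Date(Date.UTC(2011, 8, 21, 14, 30));
console.log(collectionFor(now)); // sessions_2011092114

const existing = [
  "sessions_2011091913",
  "sessions_2011091914",
  "sessions_2011092114",
  "metrics",
];
const expired = expiredCollections(existing, now, 48);
console.log(expired.join(" ")); // sessions_2011091913
```

In the mongo shell each expired name could then be passed to `db[name].drop()`. The complexity mentioned on the slide comes from everything around this: readers have to know which collections to query, and sessions can span an hour boundary.
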
  41. #4 Make sure you can REMOVE OLD DATA
  42. sharding (4th iteration)
  43. sharding (4th iteration)
    To get around the global write lock and get higher write performance we moved to a sharded cluster.
  44. sharding (4th iteration)
    To get around the global write lock and get higher write performance we moved to a sharded cluster.
    Outcome: higher write performance, lots of problems, lots of ops time spent debugging
  45. #5 Everything is about working around the GLOBAL WRITE LOCK
  46. #6 SHARDING IS NOT A SILVER BULLET
    and it’s buggy; if you can, avoid it
  47.
  48. #7 IT WILL FAIL
    design for it
  49.
  50.
  51. moving things to separate clusters (5th iteration)
  52. moving things to separate clusters (5th iteration)
    We saw very different loads on the shards and realised we had databases with very different usage patterns, including some that made autosharding not work. We moved these off the cluster.
  53. moving things to separate clusters (5th iteration)
    We saw very different loads on the shards and realised we had databases with very different usage patterns, including some that made autosharding not work. We moved these off the cluster.
    Outcome: a more balanced and stable cluster
  54. #8 Everything is about working around the GLOBAL WRITE LOCK
  55. #9 ONE DATABASE with one usage pattern PER CLUSTER
  56. #10 MONITOR EVERYTHING
    look at your health graphs daily
  57. monster machines (6th iteration)
  58. monster machines (6th iteration)
    We got new problems removing data and needed some room to breathe and think
  59. monster machines (6th iteration)
    We got new problems removing data and needed some room to breathe and think
    Solution: upgraded the servers to High-Memory Quadruple Extra Large (with cheese).
  60. monster machines (6th iteration)
    We got new problems removing data and needed some room to breathe and think
    Solution: upgraded the servers to High-Memory Quadruple Extra Large (with cheese).
    — I
  61. #11 Don’t try to scale up
    SCALE OUT
  62. #12 When you’re out of ideas
    CALL THE EXPERTS
  63. partitioning (again) and pre-chunking (7th iteration)
  64. partitioning (again) and pre-chunking (7th iteration)
    We rewrote the database layer to write to a new database each day, and we created all chunks in advance. We also decreased the size of our documents by a lot.
  65. partitioning (again) and pre-chunking (7th iteration)
    We rewrote the database layer to write to a new database each day, and we created all chunks in advance. We also decreased the size of our documents by a lot.
    Outcome: no more problems removing data.
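
Creating all chunks in advance requires knowing the split points before any data is written. The talk does not describe the shard key, so the sketch below assumes a hypothetical string key that starts with two uniformly distributed hex digits, and a hypothetical `sessions_YYYYMMDD` name for the per-day database.

```javascript
// Sketch of pre-chunking: compute evenly spaced split points up front
// so that the chunks exist before the day's writes arrive.
// Assumption (hypothetical): the shard key is a string that starts
// with two uniformly distributed hex digits, "00" to "ff".
function splitPoints(chunkCount) {
  const keySpace = 256; // number of distinct two-digit hex prefixes
  if (!Number.isInteger(chunkCount) || chunkCount < 1 || chunkCount > keySpace) {
    throw new RangeError("chunkCount must be an integer from 1 to 256");
  }
  const points = [];
  for (let i = 1; i < chunkCount; i++) {
    const boundary = Math.floor((i * keySpace) / chunkCount);
    points.push(boundary.toString(16).padStart(2, "0"));
  }
  return points;
}

// Hypothetical naming scheme for the per-day database (UTC date).
function databaseFor(date) {
  return `sessions_${date.toISOString().slice(0, 10).replace(/-/g, "")}`;
}

console.log(databaseFor(new Date(Date.UTC(2011, 8, 21)))); // sessions_20110921
console.log(splitPoints(8).join(" ")); // 20 40 60 80 a0 c0 e0
```

Each split point would then be handed to MongoDB's `split` admin command as its `middle` value for the sharded collection in that day's database. Dropping a whole database once it is past retention is what makes removal cheap in this design.
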
  66. #13 Smaller objects mean a smaller database, and a smaller database means LESS RAM NEEDED
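
The talk does not say how the documents were made smaller. One common technique, shown here purely as an assumption, is shortening field names: MongoDB stores the field names inside every document, so long names are paid for on every insert. The sketch uses JSON length as a rough proxy for stored size, and all field names are made up.

```javascript
// Sketch: field names are stored in every document, so shorter names
// shrink every document. JSON length is only a rough proxy for the
// stored size, and the field names are invented for illustration.
const verbose = {
  session_id: "a1b2c3",
  exposure_timestamp: 1316615400,
  visible_duration_ms: 5300,
  interaction_count: 2,
};

const compact = { s: "a1b2c3", t: 1316615400, v: 5300, i: 2 };

const verboseSize = JSON.stringify(verbose).length;
const compactSize = JSON.stringify(compact).length;
const saved = 1 - compactSize / verboseSize;

console.log(`${verboseSize} -> ${compactSize} characters`);
console.log(`${Math.round(saved * 100)}% smaller`);
```

The values are identical in both documents; only the keys differ, so the saving scales with the number of documents rather than with the amount of information stored.
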
  67. #14 Give a lot of thought to your PRIMARY KEY
  68. #15 Everything is about working around the GLOBAL WRITE LOCK
  69. #16 Everything is about working around the GLOBAL WRITE LOCK
  70. KTHXBAI
    @iconara
    architecturalatrocities.com
    burtcorp.com
  71. Since we got time…
  72. Safe mode (Tips)
  73. Safe mode (Tips)
    Run every Nth insert in safe mode
  74. Safe mode (Tips)
    Run every Nth insert in safe mode
    This will give you warnings when bad things happen, like failovers
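
Running every Nth insert in safe mode can be done with a small counter around the insert call. In the sketch below, `insert(doc, options)` is a hypothetical driver function and `safe: true` stands for "wait for the server to acknowledge this write"; the real option name depends on the driver.

```javascript
// Sketch of "run every Nth insert in safe mode".
// `insert(doc, options)` is a hypothetical driver call; `safe: true`
// stands for "wait for the server to acknowledge this write".
function sampledSafeInserter(insert, n) {
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError("n must be a positive integer");
  }
  let count = 0;
  return function (doc) {
    count += 1;
    const safe = count % n === 0; // every Nth write is acknowledged
    return insert(doc, { safe });
  };
}

// Usage with a stub in place of a real driver.
const calls = [];
const insertDoc = sampledSafeInserter(
  (doc, options) => calls.push(options.safe),
  3
);
for (let i = 0; i < 7; i++) insertDoc({ i });
console.log(calls.join(" ")); // false false true false false true false
```

Most writes stay fire-and-forget and cheap, while the sampled ones surface errors such as a failover within at most N inserts.
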
  75. Avoid bulk inserts (Tips)
  76. Avoid bulk inserts (Tips)
    Very dangerous if there’s a possibility of duplicate key errors
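
The danger with bulk inserts can be illustrated with a small in-memory model. It assumes that a bulk insert stops at the first duplicate key and skips the rest of the batch; the slide does not spell out the failure mode, so treat this as one plausible reading.

```javascript
// In-memory model of why bulk inserts are risky with duplicate keys.
// Assumption: the bulk insert stops at the first duplicate key and
// skips the rest of the batch.
function insertOne(store, doc) {
  if (store.has(doc._id)) return false; // duplicate key error
  store.set(doc._id, doc);
  return true;
}

function bulkInsert(store, docs) {
  let inserted = 0;
  for (const doc of docs) {
    if (!insertOne(store, doc)) break; // abort the remainder of the batch
    inserted += 1;
  }
  return inserted;
}

function insertIndividually(store, docs) {
  let inserted = 0;
  for (const doc of docs) {
    if (insertOne(store, doc)) inserted += 1; // a duplicate only affects itself
  }
  return inserted;
}

const batch = [{ _id: "a" }, { _id: "b" }, { _id: "a" }, { _id: "c" }, { _id: "d" }];

const bulkStore = new Map();
const singleStore = new Map();
console.log(bulkInsert(bulkStore, batch)); // 2
console.log(insertIndividually(singleStore, batch)); // 4
```

In this model a single duplicate silently costs the two valid documents that follow it, and without safe mode the client would not even see the error.
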
  77. EC2 (Tips)
  78. EC2 (Tips)
    You have three copies of your data, do you really need EBS?
  79. EC2 (Tips)
    You have three copies of your data, do you really need EBS?
    Instance store disks are included in the price and they have predictable performance.
  80. EC2 (Tips)
    You have three copies of your data, do you really need EBS?
    Instance store disks are included in the price and they have predictable performance.
    m1.xlarge comes with 1.7 TB of storage.