mongodb
December 12, 2011

MongoDB as a Message Queue - Luke Gotszling, AOL/About.me

After outgrowing an existing MQ solution we turned to Mongo. With just a few lines of code we were able to replace our old MQ cluster with a sharded Mongo deployment. In addition to the sharding ability, Mongo has options that allow us to tailor durability to the level we require. We are currently successfully processing millions of messages per day. This talk will feature side-by-side examples in both Python and the Mongo Shell.

Transcript

  1. MongoDB as a Message Queue
    Luke Gotszling, AOL / About.me
    MongoSV, Santa Clara, CA, December 9, 2011
  2. The Rabbit in the Room
    • 3-node cluster, no disk persistence
    • Hard to diagnose cause of failure at scale
    • Other AMQP solutions
  3. Benefits
    • Async operations
    • Per message (document) atomicity
    • Batch processes
    • Periodic processes
    • Durability / ability to shard
    • Operational familiarity
  4. AMQP?

                    Direct   Topic                Fanout
    AMQP            Push     Yes                  Yes
    Mongo Queue     Poll     Regular expression   No*

    * Options include passing a message along with an incrementing key, or multiple declarations
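
The "Regular expression" and "multiple declarations" entries on slide 4 can be read in more than one way. The sketch below shows one possible reading: a consumer selects every queue under a dotted name prefix with a `$regex` query, and a publisher emulates fanout by writing one copy of the message per queue. The dotted naming scheme and the helper names `topic_filter`, `fanout_documents`, and `matches_locally` are assumptions for illustration, not something the slides specify.

```python
import re


def topic_filter(prefix):
    """Build a MongoDB query document matching every queue name under `prefix`.

    With prefix "email" this matches "email.welcome" and "email.reset",
    but not "emails" or "sms.email".
    """
    return {"queue": {"$regex": "^" + re.escape(prefix) + r"\."}}


def fanout_documents(payload, queues):
    """One reading of 'multiple declarations': one document per target queue."""
    return [{"queue": name, "payload": payload} for name in queues]


def matches_locally(query, queue_name):
    """Apply the same pattern with Python's re module, for illustration only."""
    return re.search(query["queue"]["$regex"], queue_name) is not None
```

The dictionary returned by `topic_filter` could be passed as the query of the consume call, and the list from `fanout_documents` to a bulk insert. A pattern anchored with `^` is a prefix match, which lets MongoDB use an index on `queue` rather than scanning every document.
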
  5. To cap or not to cap
    • Capped collections[1]
      • Better performance but limited to single node[2]
      • FIFO
    • Uncapped collections -- rest of this presentation
      • Can shard, lower performance per-node
      • FIFO-ish[3], custom ordering available
    [1] http://blog.boxedice.com/2011/04/13/queueing-mongodb-using-mongodb/
        http://blog.boxedice.com/2011/09/28/replacing-rabbitmq-with-mongodb/
    [2] SERVER-211, SERVER-2654
    [3] Only down to 1 second granularity
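
Footnote [3] on slide 5 follows from the ObjectId layout: the first four bytes of a default `_id` are a big-endian Unix timestamp in whole seconds, so sorting by `_id` orders messages by creation time only to the nearest second. Within one second, the order across different clients depends on the remaining bytes. A minimal standard-library sketch of that effect follows; the helper name `objectid_timestamp` and the two hex strings are made up for illustration.

```python
from datetime import datetime, timezone


def objectid_timestamp(oid_hex):
    """Return the creation time encoded in the first 4 bytes of an ObjectId."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Two made-up ids written by different clients within the same second.
first_written = "4ee5a000ffffff0000000001"
second_written = "4ee5a000000000aaaa000001"

# Both ids carry the same timestamp prefix.
same_second = objectid_timestamp(first_written) == objectid_timestamp(second_written)

# Sorting by _id compares all 12 bytes, so the later write can sort first.
sorted_ids = sorted([first_written, second_written])
```

If stricter ordering matters, the "custom ordering" mentioned on the slide would mean sorting on a field the application controls instead of `_id`.
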
  6. Code (mongo)
    • Create:
        db.messages.insert({queue: "email", payload: serialized_data})
    • Consume:
        db.messages.findAndModify({
          query: {"queue": "email"},
          sort: {"_id": 1},
          remove: true
        })
    • Index:
        db.messages.ensureIndex({queue: 1, _id: 1})
  7. Code (Python)
    • Create:
        self.client.insert({"payload": serialize(message), "queue": queue})
    • Consume:
        self.client.database.command("findandmodify", "messages",
            query={"queue": queue}, sort={"_id": pymongo.ASCENDING}, remove=True)
    • Index:
        col.ensure_index([("queue", 1), ("_id", 1)])
    http://packages.python.org/kombu/
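
The calls on slides 6 and 7 use the API of 2011; `Collection.insert` and `Collection.ensure_index` have since been removed from PyMongo. As a rough sketch, the same three operations might look like this with the current collection methods `insert_one`, `find_one_and_delete`, and `create_index`. The function names and the JSON serialization are assumptions for illustration, and each function takes an already opened collection so the block does not need a server to be imported.

```python
import json

ASCENDING = 1  # same value as pymongo.ASCENDING


def ensure_queue_index(collection):
    """Index: (queue, _id) lets a consumer find the oldest message per queue."""
    return collection.create_index([("queue", ASCENDING), ("_id", ASCENDING)])


def publish(collection, queue, message):
    """Create: store one message as one document."""
    doc = {"queue": queue, "payload": json.dumps(message)}
    return collection.insert_one(doc)


def consume(collection, queue):
    """Consume: atomically remove and return the oldest message, or None."""
    doc = collection.find_one_and_delete(
        {"queue": queue}, sort=[("_id", ASCENDING)]
    )
    return None if doc is None else json.loads(doc["payload"])


# Usage (assumes a reachable mongod and the pymongo package):
#   from pymongo import MongoClient
#   messages = MongoClient()["test"]["messages"]
#   ensure_queue_index(messages)
#   publish(messages, "email", {"to": "user@example.com"})
#   consume(messages, "email")
```

Because `find_one_and_delete` finds and removes the document in one server-side operation, two consumers polling the same queue should not receive the same message, which is the per-document atomicity listed under Benefits.
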
  8. Benchmarks (Single-Node)
    celery 2.4.5 / kombu 1.5.1 / pymongo 2.1 / amqplib 1.0.2

    Server                Messages created   Performance gain
    RabbitMQ v2.7.0       2116 / second      -
    MongoDB v2.0.2-rc1    2524 / second      +19%
  9. Benchmarks (Single-Node)
    [Chart: messages consumed per second (log scale, 1 to 1000) against concurrency (0 to 30) for RabbitMQ v2.7.0, MongoDB (2.0.2-rc1) --nojournal, and MongoDB (2.0.2-rc1) --journal]
  10. Benchmarks (Single-Node)
    [Chart: messages consumed per second (linear scale, 0 to 800) against concurrency (0 to 30) for RabbitMQ v2.7.0, MongoDB (2.0.2-rc1) --nojournal, and MongoDB (2.0.2-rc1) --journal]
  11. Pros and Cons
    Pros
    • Familiar technology
    • Scale
    • Durability
    • Lower operational overhead
    • Advanced querying (map/reduce etc...)
    Cons
    • Not AMQP
    • Need to poll
    • Performance depends on polling frequency and concurrency
    • Message consumption is a locking operation
    • Fewer libraries available[1]
    [1] Python has kombu
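
Since a Mongo-backed queue has to be polled, the polling interval trades latency against load on the server. The sketch below shows one common way to handle that: poll quickly while messages are arriving and back off exponentially while the queue is empty. This is not taken from the talk; the function name `poll`, its parameters, and the default delays are assumptions for illustration.

```python
import time


def poll(fetch, handle, min_delay=0.05, max_delay=1.0,
         max_idle_polls=None, sleep=time.sleep):
    """Poll fetch() and pass each message to handle().

    fetch  -- returns the next message, or None when the queue is empty
    handle -- called once per message
    Stops after max_idle_polls consecutive empty polls (never, if None).
    Returns the number of messages handled.
    """
    delay = min_delay
    idle = 0
    handled = 0
    while max_idle_polls is None or idle < max_idle_polls:
        message = fetch()
        if message is None:
            idle += 1
            sleep(delay)
            delay = min(delay * 2, max_delay)  # back off while the queue is empty
        else:
            idle = 0
            delay = min_delay  # work arrived, so poll quickly again
            handle(message)
            handled += 1
    return handled
```

Here `fetch` would wrap the consume call from the code slides, for example a function that runs the find-and-remove query for one queue. Running several such loops in parallel corresponds to the concurrency axis in the benchmark charts.
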