Chapman: Building a Distributed Job Queue in MongoDB

rick446
June 21, 2013

When you're building a web application, you want to respond to every request as quickly as possible. The usual approach is to use an asynchronous job queue like Sidekiq, Resque, Celery, RQ, or a number of other frameworks to handle those tasks outside the request/response cycle in a separate 'worker' process. Unfortunately, many of these frameworks either require the deployment of Redis, RabbitMQ, or some other request broker, or they resort to polling a database for new work to do. Chapman is a distributed task queue built on MongoDB that avoids gratuitous polling, using capped collections and tailable cursors to provide notifications of incoming work. Inspired by Celery, Chapman also supports task graphs, where multiple tasks that depend on each other can be executed by the system asynchronously. Come learn how Synapp.io is using MongoDB and Chapman to handle its core data processing needs.

Transcript

  1. Roadmap

    • Define the problem
    • Schema design & operations
    • Types of tasks
    • Reducing polling

    Friday, June 21, 13
  2. Requirements

    [Diagram labels: Digital Ocean, Rackspace US, Rackspace UK, SMTP Server (×6), App Server]
  3. Basic Ideas

    [Diagram labels: Chapman, Insecure, msg (×6), Task (×2), Worker Process]
  8. Job Queue Schema: Message

        {
          _id: ObjectId(...),
          task_id: ObjectId(...),    // destination task
          slot: 'run',               // task method to be run
          s: {                       // scheduling / synchronization
            status: 'ready',
            ts: ISODateTime(...),
            q: 'chapman',
            pri: 10,
            w: '----------',
          },
          args: Binary(...),         // message arguments
          kwargs: Binary(...),
          send_args: Binary(...),
          send_kwargs: Binary(...)
        }
  13. Job Queue Schema: TaskState

        {
          _id: ObjectId(...),
          type: 'Group',               // Python class registered for task
          parent_id: ObjectId(...),    // parent task (if any)
          on_complete: ObjectId(...),  // message to be sent on completion
          mq: [ObjectId(...), ...],    // messages enqueued on task
          status: 'pending',
          options: {
            queue: 'chapman',
            priority: 10,
            immutable: false,
            ignore_result: true,
          },
          result: Binary(...),
          data: {...}
        }
  20. Message State: Naive Approach

    [Diagram: the cycle Reserve Message → Try to lock task → Un-reserve Message, repeated many times]
  21. Message States: Reserve Message

    • findAndModify (‘ready’)
        • s.state => ‘q1’
        • s.w => worker_id
    • $push _id onto task’s mq field
    • If msg is first in mq, s.state => ‘busy’
        • Start processing
    • Otherwise, s.state => ‘q2’

    [State diagram: ready, q1, busy, q2, next, pending]
  22. Message States: Reserve Message

    • findAndModify (‘next’)
        • s.state => ‘busy’
        • s.w => worker_id
    • Start processing

    [State diagram: ready, q1, busy, q2, next, pending]
  23. Message States: Retire Message

    • findAndModify TaskState
        • $pull message _id from ‘mq’
    • findAndModify new first message in ‘mq’ if its s.state is in [‘q1’, ‘q2’]
        • s.state => ‘next’

    [State diagram: ready, q1, busy, q2, next, pending]
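The reserve and retire transitions on slides 21 to 23 can be sketched in plain Python. This is a minimal in-memory model, not Chapman's code: the `Queue` class and its method names are hypothetical, dictionaries stand in for the Message and TaskState collections, the `status` key follows the schema slide (the bullets call it `s.state`), and each method body stands in for what would be an atomic findAndModify in MongoDB.

```python
class Queue:
    """In-memory stand-in for the Message and TaskState collections."""

    def __init__(self):
        self.messages = {}   # msg_id -> {'task_id', 'status', 'w'}
        self.task_mq = {}    # task_id -> list of msg_ids (the task's 'mq')
        self._next_id = 0

    def send(self, task_id):
        """Create a new message in the 'ready' state and return its id."""
        self._next_id += 1
        self.messages[self._next_id] = {
            'task_id': task_id, 'status': 'ready', 'w': None}
        self.task_mq.setdefault(task_id, [])
        return self._next_id

    def reserve(self, worker_id):
        """Reserve one message; return its id only if it can start now."""
        # A 'next' message is already first in its task's mq: go to 'busy'.
        for msg_id, msg in self.messages.items():
            if msg['status'] == 'next':
                msg.update(status='busy', w=worker_id)
                return msg_id
        for msg_id, msg in self.messages.items():
            if msg['status'] == 'ready':
                msg.update(status='q1', w=worker_id)
                mq = self.task_mq[msg['task_id']]
                mq.append(msg_id)            # $push onto the task's mq
                if mq[0] == msg_id:
                    msg['status'] = 'busy'   # first in line: start processing
                    return msg_id
                msg['status'] = 'q2'         # queued behind another message
                return None
        return None

    def retire(self, msg_id):
        """Finish a message and promote the next one queued on its task."""
        # Assumption: a retired message is simply dropped from this model.
        msg = self.messages.pop(msg_id)
        mq = self.task_mq[msg['task_id']]
        mq.remove(msg_id)                    # $pull from the task's mq
        if mq and self.messages[mq[0]]['status'] in ('q1', 'q2'):
            self.messages[mq[0]]['status'] = 'next'
```

With two messages sent to the same task, the first `reserve` call returns the first message as busy, the second call parks the other message in `q2`, and retiring the first message moves the second to `next` so that a later `reserve` picks it up without touching the task lock again.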
  24. Task States

    • States mainly advisory
    • success, failure transitions trigger on_complete message
    • ‘chained’ is a tail-call optimization

    [State diagram: pending, active, chained, failure, success]
  25. Handling Problems

    • Determine “live” worker ids
    • Find all messages in busy or q1 for those workers and make them “ready”
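A rough sketch of that recovery pass, under one loudly stated assumption: it reads "those workers" as workers that are no longer in the live set, since messages held by live workers are presumably still being processed. The function name `requeue_orphaned` is hypothetical, and a list of dicts shaped like the Message schema's `s` sub-document stands in for the collection.

```python
def requeue_orphaned(messages, live_worker_ids):
    """Reset messages held by workers that are not in the live set.

    `messages` is a list of dicts with an 's' sub-document holding
    'status' and 'w' (worker id). Returns how many messages were reset.
    """
    live = set(live_worker_ids)
    reset = 0
    for msg in messages:
        s = msg['s']
        # Assumption: only 'busy' and 'q1' messages need rescuing, and only
        # when their worker is no longer alive.
        if s['status'] in ('busy', 'q1') and s['w'] not in live:
            s['status'] = 'ready'
            reset += 1
    return reset
```

Against MongoDB this would more likely be a single multi-document update filtered on `s.status` and `s.w`; the loop above only illustrates the selection rule.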
  26. Basic Tasks: FunctionTask

    • Simplest task: run a function to completion, set the result to the return value
    • If a ‘Suspend’ exception is raised, move the task to ‘chained’ status
    • Other exceptions set task to ‘failure’, save traceback & exception

        @task(ignore_result=True, priority=50)
        def track(event, user_id=None, **kwargs):
            log.info('track(%s, %s...)', event, user_id)
            # ...
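The three outcomes listed for a FunctionTask can be illustrated with a small runner. This is a sketch, not Chapman's implementation: `run_function_task` and the local `Suspend` class are hypothetical stand-ins, and a plain dict plays the role of the TaskState document.

```python
import traceback


class Suspend(Exception):
    """Stand-in for the 'Suspend' exception named on the slide."""


def run_function_task(state, func, *args, **kwargs):
    """Run func and record the outcome on the `state` dict."""
    try:
        state['result'] = func(*args, **kwargs)
        state['status'] = 'success'
    except Suspend:
        # The task handed off to another task; it completes later.
        state['status'] = 'chained'
    except Exception as err:
        # Any other exception fails the task; keep the details for debugging.
        state['status'] = 'failure'
        state['result'] = {
            'exception': repr(err),
            'traceback': traceback.format_exc(),
        }
    return state
```

A function that returns normally leaves the state at `success` with its return value as the result, one that raises `Suspend` leaves it at `chained`, and any other exception leaves it at `failure` with the traceback saved.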
  27. Digression: Task Chaining

    • Task state set to ‘chained’
    • New “Chain” task is created that will
        • Call the “next” task
        • When the “next” task completes, also complete the “current” task

        @task(ignore_result=True, priority=50)
        def function_task(*args, **kwargs):
            # ...
            Chain.call(some_other_task)
  28. Composite Tasks

    • on_complete message for each subtask with slot=retire_subtask, specifying subtask position & the result of the subtask
    • Different composite tasks implement ‘run’ and ‘retire_subtask’ differently

        task_state.update(
            { '_id': subtask_id },
            { $set: {
                'parent_id': composite_id,
                'data.composite_position': position,
                'options.ignore_result': false }}
        )
  29. Composite Task: Pipeline

    • Run
        • Send a ‘run’ message to the subtask with position=0
    • Retire_subtask(position, result)
        • Send a ‘run’ message with the previous result to the subtask with position = (position+1), OR retire the Pipeline if no more tasks
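The Pipeline's run/retire_subtask handshake can be modelled in a few lines. In this sketch (a hypothetical `Pipeline` class, not Chapman's), subtasks are ordinary one-argument callables and "sending a run message" is replaced by calling the subtask inline, so completion immediately triggers `retire_subtask`. The initial argument passed to `run` is also an assumption; the slide does not say what the first subtask receives.

```python
class Pipeline:
    """Runs subtasks in order, feeding each result to the next subtask."""

    def __init__(self, subtasks):
        self.subtasks = list(subtasks)   # callables taking one argument
        self.result = None
        self.done = False

    def run(self, initial):
        if not self.subtasks:            # nothing to do: retire immediately
            self.result, self.done = initial, True
            return
        self._send_run(0, initial)       # 'run' goes to position 0

    def retire_subtask(self, position, result):
        nxt = position + 1
        if nxt < len(self.subtasks):
            self._send_run(nxt, result)  # pass the previous result along
        else:
            self.result = result         # no more subtasks: retire the pipeline
            self.done = True

    def _send_run(self, position, arg):
        # Stand-in for enqueueing a message: run the subtask inline, then
        # report its completion back to the pipeline.
        self.retire_subtask(position, self.subtasks[position](arg))
```

For example, `Pipeline([lambda x: x + 1, lambda x: x * 2])` run with `3` first computes `4`, hands it to the second subtask, and retires with the second subtask's result.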
  30. Composite Task: Group/Barrier

    • Run
        • Send a ‘run’ message to all subtasks
    • Retire_subtask(position, result)
        • Decrement the num_waiting counter
        • If num_waiting is 0, retire the group
    • Collect subtask results (Group), complete group, delete subtasks
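The num_waiting bookkeeping for a Group can be sketched the same way. The `Group` class below is hypothetical and in-memory: subtasks are zero-argument callables run inline, whereas in a real deployment they would finish in any order on different workers, which is why results are stored by position rather than appended.

```python
class Group:
    """Runs all subtasks and completes once every one has retired."""

    def __init__(self, subtasks):
        self.subtasks = list(subtasks)            # zero-argument callables
        self.num_waiting = len(self.subtasks)
        self.results = [None] * len(self.subtasks)
        self.done = self.num_waiting == 0

    def run(self):
        # 'run' goes to every subtask; here each one executes inline.
        for position, subtask in enumerate(list(self.subtasks)):
            self.retire_subtask(position, subtask())

    def retire_subtask(self, position, result):
        self.results[position] = result           # collect by position
        self.num_waiting -= 1                     # decrement the counter
        if self.num_waiting == 0:
            self.done = True                      # retire the group
            self.subtasks = []                    # subtasks are deleted
```

Calling `retire_subtask` out of order still yields results in subtask order, and the group only completes when the counter reaches zero.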
  31. Reducing Polling

    • Reserving messages is expensive
    • Use Pub/Sub system instead
        • Publish to the channel whenever a message is ready to be handled
        • Each worker subscribes to the channel
        • Workers only ‘poll’ when they have a chance of getting work
  32. Pub/Sub for MongoDB

    Capped Collection
    • Fixed size
    • Fast inserts
    • “Tailable” cursors

    [Diagram label: Tailable Cursor]
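To make the capped-collection idea concrete without a running MongoDB, here is a small in-memory analogue. Everything in it is an assumption made for illustration: `CappedChannel` is a hypothetical name, a `collections.deque` with `maxlen` plays the fixed-size collection, and `poll` mimics re-reading from a remembered position. A real tailable cursor differs in that the server holds the cursor open and returns new documents as they arrive.

```python
import itertools
import re
from collections import deque


class CappedChannel:
    """Fixed-size, insertion-ordered buffer; oldest entries are overwritten."""

    def __init__(self, max_docs):
        self._docs = deque(maxlen=max_docs)   # the 'capped collection'
        self._ids = itertools.count(1)        # integer auto-increment id

    def publish(self, key, data=None):
        """Append a document and return its id."""
        doc = {'id': next(self._ids), 'k': key, 'data': data}
        self._docs.append(doc)
        return doc['id']

    def poll(self, topic_re, last_id=0):
        """Return matching docs newer than last_id, in insertion order."""
        return [d for d in self._docs
                if d['id'] > last_id and topic_re.match(d['k'])]
```

A subscriber would call `poll(re.compile('^foo'), last_id)` and remember the highest `id` it has seen, so that a retry after a dropped connection only returns newer messages. Documents that have already been pushed out of the fixed-size buffer are simply gone, which is the trade-off a capped collection makes for fast inserts.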
  43. Getting a Tailable Cursor

        def get_cursor(collection, topic_re, await_data=True):
            options = { 'tailable': True }        # make cursor tailable
            if await_data:
                options['await_data'] = True      # holds open cursor for a while
            cur = collection.find(
                { 'k': topic_re },
                **options)
            cur = cur.hint([('$natural', 1)])     # ensure we don't use any indexes
            return cur

        import re, time
        while True:
            cur = get_cursor(
                db.capped_collection,
                re.compile('^foo'),
                await_data=True)
            for msg in cur:
                do_something(msg)
            # still some polling when no producer, so don't spin too fast
            time.sleep(0.1)
  45. Building in retry...

        def get_cursor(collection, topic_re, last_id=-1, await_data=True):
            options = { 'tailable': True }
            spec = {
                'id': { '$gt': last_id },   # only new messages (integer autoincrement "id")
                'k': topic_re }
            if await_data:
                options['await_data'] = True
            cur = collection.find(spec, **options)
            cur = cur.hint([('$natural', 1)])   # ensure we don't use any indexes
            return cur
  47. Building auto-increment

        class Sequence(object):
            ...
            def next(self, sname, inc=1):
                # atomically $inc the dedicated document
                doc = self._db[self._name].find_and_modify(
                    query={'_id': sname},
                    update={'$inc': { 'value': inc } },
                    upsert=True,
                    new=True)
                return doc['value']
  50. Ludicrous Speed

        from pymongo.cursor import _QUERY_OPTIONS

        def get_cursor(collection, topic_re, last_id=-1, await_data=True):
            options = { 'tailable': True }
            spec = {
                'ts': { '$gt': last_id },   # only new messages (id ==> ts)
                'k': topic_re }
            if await_data:
                options['await_data'] = True
            cur = collection.find(spec, **options)
            cur = cur.hint([('$natural', 1)])   # ensure we don't use any indexes
            if await_data:
                # co-opt the oplog_replay option
                cur = cur.add_option(_QUERY_OPTIONS['oplog_replay'])
            return cur