Introduction to Celery

dbrgn
December 04, 2014

A short presentation I gave at the Zurich Python User Group in December 2014. It introduces Celery and the different task workflows (canvas) that can be implemented with it.

Transcript

  1. Message Queues

    “Put some stuff into one end of a queue, get it out at the other end”
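
The quote above can be illustrated with Python's standard library alone. This is a minimal in-process sketch, not a message broker: the item names are made up for illustration, and a real queue such as RabbitMQ or Redis would sit between separate processes or machines.

```python
import queue

# A FIFO queue: items come out in the order they were put in.
q = queue.Queue()

# Producer side: put some stuff into one end.
for item in ["resize image", "send email", "build report"]:
    q.put(item)

# Consumer side: get it out at the other end.
received = []
while not q.empty():
    received.append(q.get())
    q.task_done()
```

After the loop, `received` holds the three items in the order they were enqueued.
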
  2. Celery

    • A task queueing system
    • Distribute tasks across systems
    • Written in Python, but the protocol can be implemented in any language
  3. When To Use Celery

    • Asynchronous task processing
    • Long-running background jobs
    • Offloading heavy web backend operations
    • Scheduling tasks
    • ...
  4. Celery: The Broker

    The broker is the part of the system that does message distribution. Supported transports:
    • RabbitMQ
    • Redis
    • MongoDB (experimental)
    • ZeroMQ (experimental)
    • CouchDB (experimental)
    • SQLAlchemy (experimental)
    • Django ORM (experimental)
    • Amazon SQS (experimental)
    • ...and more
  5. Celery: Result Stores

    A result store stores the result of a task. It is optional. Supported stores:
    • AMQP
    • Redis
    • memcached
    • MongoDB
    • SQLAlchemy
    • Django ORM
    • Apache Cassandra
  6. Celery: Serializers

    Serialization is necessary to turn Python data types into a format that can be stored in the queue. The serialized data can be compressed and signed cryptographically. Supported serializers:
    • Pickle
    • JSON
    • YAML
    • MessagePack
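
The serialize, compress and sign steps described above can be sketched with the standard library. This is an illustration under stated assumptions, not Celery's own wire format or signing scheme: the message layout, the `pack` and `unpack` helpers and the `SECRET_KEY` value are hypothetical, and an HMAC is used here as a simple stand-in for cryptographic signing.

```python
import hashlib
import hmac
import json
import zlib

SECRET_KEY = b"example-key"  # hypothetical key, for illustration only


def pack(message):
    """Serialize to JSON, compress, and compute an HMAC over the payload."""
    body = json.dumps(message, sort_keys=True).encode("utf-8")
    payload = zlib.compress(body)
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload, signature


def unpack(payload, signature):
    """Verify the HMAC, then decompress and deserialize."""
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("signature mismatch")
    return json.loads(zlib.decompress(payload).decode("utf-8"))


# Hypothetical message layout: task name plus its arguments.
message = {"task": "tasks.add", "args": [19, 23], "kwargs": {}}
payload, sig = pack(message)
restored = unpack(payload, sig)
```

One practical consequence of choosing JSON is that it has no tuple type, so a tuple of arguments comes back as a list after a round trip.
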
  7. Celery: Workers

    The workers take tasks out of the queues, process them and return the result.

    [Diagram labels: Transport, Server 1, Server 2, Worker 1, Worker 2, Worker 3]
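
The worker loop described above (take a task from the queue, process it, hand back the result) can be sketched in a single process. This is only an analogy using threads and an in-memory queue; the `results` dictionary stands in for a result store, and the `None` sentinel used for shutdown is a convention of this sketch, not something Celery does.

```python
import queue
import threading

tasks = queue.Queue()
results = {}                      # stand-in for a result store
results_lock = threading.Lock()


def add(x, y):
    return x + y


def worker():
    while True:
        item = tasks.get()
        if item is None:          # sentinel: no more work for this worker
            tasks.task_done()
            break
        task_id, func, args = item
        value = func(*args)
        with results_lock:
            results[task_id] = value
        tasks.task_done()


# Three workers share one queue.
workers = [threading.Thread(target=worker) for _ in range(3)]
for w in workers:
    w.start()

for task_id in range(5):
    tasks.put((task_id, add, (task_id, task_id)))
for _ in workers:
    tasks.put(None)               # one sentinel per worker

tasks.join()
for w in workers:
    w.join()
```

Which worker handles which task is not deterministic, but every task id ends up in `results` exactly once.
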
  8. Celery: Worker Concurrency

    Celery workers allow the following forms of concurrency:
    • Prefork (multiprocessing)
    • Eventlet, Gevent
    • Threading
  9. Celery: Other Features

    • Real-time monitoring
    • Workflows (grouping, chaining, chunking…)
    • Time & rate limits
    • Scheduling (goodbye cron!)
    • Autoreloading
    • Autoscaling
    • Resource leak protection
  10. Celery: Framework Integration

    • Django
    • Pyramid
    • Pylons
    • Flask
    • web2py
    • Tornado

    ...or use it directly!
  11. Celery: Defining a Task

        from celery import Celery

        app = Celery('hello', broker='redis://localhost/')

        @app.task
        def add(x, y):
            return x + y
  12. Celery: Calling a Task

    You can call a task synchronously like a regular function, without Celery:

        >>> add(19, 23)
        42

    Or you can call it asynchronously through Celery:

        >>> add.delay(19, 23)
        <AsyncResult: 257d9f84-cd4b-4248-b7ba-93ed24ce6804>
  13. Celery: Calling a Task

    The “delay” method is actually a shortcut for the “apply_async” method:

        >>> add.apply_async(args=(19, 23))
        <AsyncResult: df5abf5e-737a-4a36-acd2-c0ffe8bad305>

    You can also delay the execution of a task (and many other things):

        >>> add.apply_async((19, 23), countdown=10)
        <AsyncResult: def99720-ea55-4280-957b-23bbcbc7646c>
  14. Celery: Retrieving a Result

    Retrieving results requires a result store to be configured. To block until the result is ready, use “.get()”:

        >>> add.delay(19, 23).get()
        42

    Or you can check on the result without blocking:

        >>> res = add.delay(19, 23)
        >>> res.status, res.ready(), res.result
        ('SUCCESS', True, 42)
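
The difference between blocking on a result and checking it without blocking has a close analogue in the standard library's futures. The sketch below is only that analogy: a `Future` from `concurrent.futures` is not an `AsyncResult`, and a thread pool is not a Celery worker.

```python
from concurrent.futures import ThreadPoolExecutor


def add(x, y):
    return x + y


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(add, 19, 23)   # roughly analogous to add.delay(19, 23)
    blocking_value = future.result()    # blocks until done, like .get()
    is_done = future.done()             # non-blocking check, like .ready()
```

Because `result()` has already returned by the time `done()` is called, `is_done` is true here; calling `done()` first could return either value depending on timing.
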
  15. Designing Workflows

    • Celery allows you to create workflows
    • Works by passing around call signatures
    • Primitives: Chains, Groups, Chords, Maps, Starmaps, Chunks
  16. Call Signatures: Creation

    A signature is like a recipe for how to call a function. You can manually create a signature for a task:

        >>> from celery import signature
        >>> signature('tasks.add', args=(2, 2), countdown=10)
        tasks.add(2, 2)
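
The "recipe" idea can be sketched without Celery as a small object that stores a function and its arguments and does nothing until it is called. The `Sig` class below is hypothetical and far simpler than Celery's signature; it only mirrors two behaviours: creation does not execute anything, and arguments supplied at call time are placed before the stored ones.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass(frozen=True)
class Sig:
    """Hypothetical stand-in for a call signature: what to call, and with what."""
    func: Callable[..., Any]
    args: tuple = ()
    kwargs: dict = field(default_factory=dict)

    def __call__(self, *extra_args):
        # Arguments given at call time go in front of the stored ones.
        return self.func(*extra_args, *self.args, **self.kwargs)


calls = []


def add(x, y):
    calls.append((x, y))          # record each real invocation
    return x + y


recipe = Sig(add, (2, 2))         # nothing has run yet
calls_before = list(calls)
complete_result = recipe()        # runs add(2, 2)

partial_recipe = Sig(add, (10,))  # one argument still missing
partial_result = partial_recipe(5)  # runs add(5, 10)
```

Keeping the recipe separate from its execution is what makes it possible to pass signatures around and combine them into workflows.
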
  17. Call Signatures: Shortcuts

    Signatures are often nicknamed “subtasks”, because they are essentially tasks that execute tasks.

        >>> add.subtask((2, 2), countdown=10)
        tasks.add(2, 2)

    There is also a shortcut using star arguments. Keyword arguments passed to “.s()” go to the task itself, so execution options such as countdown are set with “.set()”:

        >>> add.s(2, 2).set(countdown=10)
        tasks.add(2, 2)
  18. Calling a Signature

    Signatures support the “Calling API” like regular tasks.

        >>> add.s(19, 23).delay().get()
        42
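  19. Groups

    Groups can be used to execute several tasks in parallel. A group is itself a signature, so it has to be called before a result is available:

        >>> from celery import group
        >>> res = group(add.s(2, 2), add.s(4, 4))()
        >>> res.get()
        [4, 8]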
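
The semantics of a group (run independent calls in parallel, collect the results in the original order) can be sketched with a thread pool. The `run_group` helper is hypothetical, and `functools.partial` stands in for a signature; a real group distributes its tasks across workers through the broker.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def add(x, y):
    return x + y


def run_group(calls):
    """Run all calls concurrently and return results in submission order."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(call) for call in calls]
        return [f.result() for f in futures]


group_result = run_group([partial(add, 2, 2), partial(add, 4, 4)])
```
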
  20. Chains

    Chains feed the result from one subtask into the next one:

        >>> from celery import chain
        >>> # (4 + 4) * 8 * 10
        >>> res = chain(add.s(4, 4), mul.s(8), mul.s(10))()
        >>> res.get()
        640
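
What a chain computes can be sketched as a fold over a list of steps, where each step receives the previous result as its first argument. The `run_chain` helper is hypothetical and runs everything in the calling process, whereas a real chain schedules each step as a separate task.

```python
from functools import partial, reduce


def add(x, y):
    return x + y


def mul(x, y):
    return x * y


def run_chain(first, *rest):
    """Call `first`, then feed each result into the next step."""
    return reduce(lambda value, step: step(value), rest, first())


# (4 + 4) * 8 * 10; the lambdas put the previous result in first position.
chain_result = run_chain(
    partial(add, 4, 4),
    lambda previous: mul(previous, 8),
    lambda previous: mul(previous, 10),
)
```
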
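  21. Chords

    Chords are groups with a callback. The header tasks are passed as a list, and the callback must itself be a task (here sum_list, a task that sums a list of numbers):

        >>> from celery import chord
        >>> # sum(2+2, 3+3, 4+4)
        >>> c = chord([add.s(2, 2), add.s(3, 3), add.s(4, 4)])
        >>> res = c(sum_list.s())
        >>> res.get()
        18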
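
The shape of a chord (a set of header calls whose collected results are handed to one callback) can be sketched in a few lines. The `run_chord` helper is hypothetical and evaluates the header sequentially for simplicity; in Celery the header tasks run as a group and the callback fires once all of them have finished.

```python
from functools import partial


def add(x, y):
    return x + y


def run_chord(header, callback):
    """Evaluate every header call, then pass the list of results to callback."""
    header_results = [call() for call in header]
    return callback(header_results)


# sum(2+2, 3+3, 4+4)
chord_result = run_chord(
    [partial(add, 2, 2), partial(add, 3, 3), partial(add, 4, 4)],
    sum,
)
```
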
  22. Chunks

    Chunking lets you divide an iterable of work into pieces:

        >>> sig = add.chunks(zip(range(100), range(100)), 10)
        >>> res = sig()
        >>> res.get()
        [[0, 2, 4, 6, 8, 10, 12, 14, 16, 18],
         [20, 22, 24, 26, 28, 30, 32, 34, 36, 38],
         ...
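
The result structure shown above can be reproduced locally to see how the 100 argument pairs are split. The `chunked` helper below is a hypothetical stand-in written with `itertools.islice`; it only illustrates the partitioning, not how Celery dispatches each chunk to a worker.

```python
from itertools import islice


def add(x, y):
    return x + y


def chunked(iterable, size):
    """Yield consecutive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


arg_chunks = list(chunked(zip(range(100), range(100)), 10))
# Each chunk is processed as one unit of work.
chunk_results = [[add(*args) for args in chunk] for chunk in arg_chunks]
```

The first two entries of `chunk_results` match the two rows shown on the slide.
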
  23. Example: Combinations

    This example uses the following three tasks:

        import itertools

        @app.task
        def add(x, y):
            return x + y

        @app.task
        def flatten(lists):
            # Return a list, not an iterator, so the result can be serialized.
            return list(itertools.chain.from_iterable(lists))

        @app.task
        def sum_list(numbers):
            return sum(numbers)
  24. Example: Combinations (1/4)

    The first signature splits 100 add operations into 10 chunks of 10.

        >>> chunks = add.chunks(zip(range(100), range(100)), 10)

    We can convert these chunks to a Celery group:

        >>> sig1 = chunks.group()
  25. Example: Combinations (2/4)

    The first signature will return a list of 10 lists containing 10 numbers each. We want to flatten this to a single list containing all 100 numbers.

        >>> sig2 = flatten.s()

    Finally we want to sum all numbers in that list:

        >>> sig3 = sum_list.s()
  26. Example: Combinations (3/4)

    We can now run these three signatures in a chain:

        >>> sig = chain(sig1, sig2, sig3)

    There’s also a shortcut syntax for this:

        >>> sig = sig1 | sig2 | sig3
  27. Example: Combinations (4/4)

    The sig object is still just a signature, a list of instructions on how to calculate the result. Nothing has been evaluated yet. Now let’s actually run those signatures and get the final result:

        >>> sig.delay().get()
        9900
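
The final value can be checked by running the same three steps as ordinary functions in one process. This sketch assumes `flatten` returns a list and leaves out Celery entirely, so it verifies the arithmetic of the workflow rather than its distributed execution.

```python
import itertools


def add(x, y):
    return x + y


def flatten(lists):
    return list(itertools.chain.from_iterable(lists))


def sum_list(numbers):
    return sum(numbers)


pairs = list(zip(range(100), range(100)))

# Step 1: 10 chunks of 10 add operations each.
chunks = [pairs[i:i + 10] for i in range(0, len(pairs), 10)]
step1 = [[add(x, y) for x, y in chunk] for chunk in chunks]

# Step 2: flatten the 10 lists into one list of 100 numbers.
step2 = flatten(step1)

# Step 3: sum all numbers.
total = sum_list(step2)
```

The total is twice the sum of 0 through 99, which agrees with the 9900 shown on the slide.
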
  28. Preview: Dynamic Tasks

    The next version of Celery will support dynamic tasks: modifying a task in a chord or group at runtime.
    • replace(sig): Replace the current task with a new task inheriting the same task id.
    • add_to_chord(sig): Add a signature to the chord the current task is a member of.