Slide 1

Introduction to Celery
Zürich Python User Group, 2014-12-04

Slide 2

Message Oriented Middleware

“Software or hardware infrastructure supporting sending and receiving messages between distributed systems”

Slide 3

Message Queues

“Put some stuff into one end of a queue, get it out at the other end”

Slide 4

Architecture: Pipe

Producer -> Queue -> Consumer
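The pipe layout can be tried out without any broker. The following is a minimal sketch using only Python's standard library (a `queue.Queue` and one thread), not Celery itself. The names `producer`, `consumer`, `run_pipe` and the `None` sentinel are assumptions made for illustration.

```python
import queue
import threading


def producer(q, items):
    """Put each item into one end of the queue, then a sentinel."""
    for item in items:
        q.put(item)
    q.put(None)  # sentinel: tells the consumer that nothing else follows


def consumer(q, out):
    """Take items out at the other end until the sentinel arrives."""
    while True:
        item = q.get()
        if item is None:
            break
        out.append(item * 2)  # stand-in for real work


def run_pipe(items):
    """Wire one producer to one consumer through a single queue."""
    q = queue.Queue()
    out = []
    worker = threading.Thread(target=consumer, args=(q, out))
    worker.start()
    producer(q, items)
    worker.join()
    return out
```

The producer never calls the consumer directly; the queue is the only thing the two sides share.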

Slide 5

Architecture: Multiple producers / consumers

Producer, Producer -> Queue -> Consumer, Consumer, Consumer
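With several producers and consumers the queue stays the single meeting point. A rough standard-library sketch, assuming two producer threads and three consumer threads (the function name `run_pool`, the squaring "work" and the sentinel scheme are all illustrative, not part of Celery):

```python
import queue
import threading


def run_pool(jobs, n_producers=2, n_consumers=3):
    """Push jobs from several producers, process them with several consumers."""
    jobs = list(jobs)
    q = queue.Queue()
    results = []
    lock = threading.Lock()

    def producer(share):
        for job in share:
            q.put(job)

    def consumer():
        while True:
            job = q.get()
            if job is None:  # one sentinel per consumer
                break
            with lock:
                results.append(job ** 2)  # stand-in for real work

    consumers = [threading.Thread(target=consumer) for _ in range(n_consumers)]
    producers = [
        threading.Thread(target=producer, args=(jobs[i::n_producers],))
        for i in range(n_producers)
    ]
    for t in consumers + producers:
        t.start()
    for t in producers:
        t.join()
    for _ in consumers:
        q.put(None)  # all jobs are queued, now tell each consumer to stop
    for t in consumers:
        t.join()
    return sorted(results)  # completion order is not deterministic, so sort
```

Which consumer handles which job is not fixed, which is why the sketch sorts the results before returning them.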

Slide 6

Architecture: Multiple queues / Routing

Producer -> Router -> Queue 1, Queue 2 -> Consumer, Consumer, Consumer
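Routing can be pictured as a lookup from a routing key to a queue name. The class below is a toy, in-memory sketch; `Router`, its methods and the routing keys in the usage note are hypothetical and do not mirror any broker's API.

```python
from collections import defaultdict, deque


class Router:
    """Assigns each message to a named queue based on its routing key."""

    def __init__(self, routes, default="default"):
        self.routes = routes  # routing key -> queue name
        self.default = default
        self.queues = defaultdict(deque)

    def publish(self, routing_key, body):
        """Append the message to the matching queue and return that queue's name."""
        name = self.routes.get(routing_key, self.default)
        self.queues[name].append(body)
        return name

    def consume(self, name):
        """Pop the oldest message from a queue, or return None if it is empty."""
        q = self.queues[name]
        return q.popleft() if q else None
```

For example, `Router({"image.resize": "queue1", "email.send": "queue2"})` would send resize jobs and mail jobs to different queues, so that different consumers can serve them.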

Slide 7

Celery

● A task queueing system
● Distributes tasks across systems
● Written in Python, but the protocol can be implemented in any language

Slide 8

When To Use Celery

● Asynchronous task processing
● Long running background jobs
● Offloading heavy web backend operations
● Scheduling tasks
● ...

Slide 9

Celery: The Broker

The broker is the part of the system that distributes messages.

Supported transports:

● RabbitMQ
● Redis
● MongoDB (experimental)
● ZeroMQ (experimental)
● CouchDB (experimental)
● SQLAlchemy (experimental)
● Django ORM (experimental)
● Amazon SQS (experimental)
● ...and more

Slide 10

Celery: Result Stores

A result store stores the result of a task. It is optional.

Supported stores:

● AMQP
● Redis
● memcached
● MongoDB
● SQLAlchemy
● Django ORM
● Apache Cassandra

Slide 11

Celery: Serializers

Serialization is necessary to turn Python data types into a format that can be stored in the queue. The serialized data can be compressed and cryptographically signed.

Supported serializers:

● Pickle
● JSON
● YAML
● MessagePack
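The serialize, compress, sign sequence can be illustrated with the standard library alone. This is a sketch under stated assumptions: JSON stands in for the serializer, `zlib` for compression, and an HMAC with a made-up shared key for signing. Celery's own signing mechanism works differently; HMAC is swapped in here only to show the idea. `pack`, `unpack` and `SECRET` are hypothetical names.

```python
import hashlib
import hmac
import json
import zlib

SECRET = b"not-a-real-key"  # hypothetical shared key, for illustration only


def pack(payload, key=SECRET):
    """Serialize to JSON, compress, and compute a signature over the body."""
    raw = json.dumps(payload, sort_keys=True).encode("utf-8")
    body = zlib.compress(raw)
    signature = hmac.new(key, body, hashlib.sha256).hexdigest()
    return body, signature


def unpack(body, signature, key=SECRET):
    """Verify the signature, then decompress and deserialize."""
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("signature mismatch")
    return json.loads(zlib.decompress(body).decode("utf-8"))
```

One visible consequence of the serializer choice: JSON has no tuple type, so a tuple such as `(19, 23)` comes back as the list `[19, 23]` after a round trip, whereas Pickle would preserve it.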

Slide 12

Celery: Workers

The workers take tasks out of the queues, process them and return the result.

Transport -> Server 1 (Worker 1, Worker 2), Server 2 (Worker 3)

Slide 13

Celery: Worker Concurrency

Celery workers allow the following forms of concurrency:

● Prefork (multiprocessing)
● Eventlet, Gevent
● Threading

Slide 14

Celery: Other Features

● Real time monitoring
● Workflows (Grouping, chaining, chunking…)
● Time & Rate Limits
● Scheduling (goodbye cron!)
● Autoreloading
● Autoscaling
● Resource Leak Protection

Slide 15

Celery: Framework Integration

● Django
● Pyramid
● Pylons
● Flask
● web2py
● Tornado

...or use it directly!

Slide 16

Celery: Installing

    $ pip install celery[redis,msgpack,threads]

Slide 17

Celery: Defining a Task

    from celery import Celery

    app = Celery('hello', broker='redis://localhost/')

    @app.task
    def add(x, y):
        return x + y

Slide 18

Celery: Calling a Task

You can call a task synchronously like a regular function, without Celery:

    >>> add(19, 23)
    42

Or you can call it asynchronously through Celery:

    >>> add.delay(19, 23)

Slide 19

Celery: Calling a Task

The “delay” method is actually a shortcut for the “apply_async” method:

    >>> add.apply_async(args=(19, 23))

With “apply_async” you can also delay the execution of a task (and set many other options):

    >>> add.apply_async((19, 23), countdown=10)
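The relationship between the two methods can be sketched with a toy class that records messages instead of sending them. `SketchTask` and the message dictionary layout are assumptions for illustration; only the idea that `delay` repacks its star arguments and forwards them to `apply_async` reflects how the shortcut works.

```python
class SketchTask:
    """Toy stand-in for a task; records messages instead of sending them."""

    def __init__(self, name):
        self.name = name
        self.sent = []

    def apply_async(self, args=(), kwargs=None, countdown=None):
        """Build a message from the task arguments plus execution options."""
        message = {
            "task": self.name,
            "args": tuple(args),
            "kwargs": dict(kwargs or {}),
            "countdown": countdown,
        }
        self.sent.append(message)
        return message

    def delay(self, *args, **kwargs):
        # Everything passed here is a task argument; there is no slot left
        # for execution options such as countdown.
        return self.apply_async(args, kwargs)
```

This is why options like `countdown` are only available through `apply_async`: in `delay`, every keyword would be interpreted as an argument to the task itself.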

Slide 20

Celery: Retrieving a Result

Retrieving results requires a configured result store.

To block until the result is ready, use “.get()”:

    >>> add.delay(19, 23).get()
    42

Or you can poll the result without blocking:

    >>> res = add.delay(19, 23)
    >>> res.status, res.ready(), res.result
    ('SUCCESS', True, 42)
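The difference between blocking and polling has a close analogue in the standard library's futures. The sketch below is only an analogy, not Celery code: `pool.submit` plays the role of `delay`, `future.result()` the role of `.get()`, and `future.done()` the role of `.ready()`.

```python
from concurrent.futures import ThreadPoolExecutor


def add(x, y):
    return x + y


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(add, 19, 23)  # roughly: add.delay(19, 23)
    blocking_value = future.result()   # roughly: .get(), waits for the result
    is_done = future.done()            # roughly: .ready(), never waits
```

Unlike a future, a Celery result lives in the result store, so it can be looked up from a different process than the one that sent the task.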

Slide 21

Designing Workflows

● Celery allows you to create workflows
● Works by passing around call signatures
● Primitives: Chains, Groups, Chords, Maps, Starmaps, Chunks

Slide 22

Call Signatures: Creation

Signatures are like a recipe for how to call a function. You can create a signature for a task manually:

    >>> from celery import signature
    >>> signature('tasks.add', args=(2, 2), countdown=10)
    tasks.add(2, 2)

Slide 23

Call Signatures: Shortcuts

Signatures are often nicknamed “subtasks”, because they are essentially tasks that execute tasks.

    >>> add.subtask((2, 2), countdown=10)
    tasks.add(2, 2)

There is also a shortcut using star arguments. Keyword arguments given to “s” are passed to the task itself, so execution options such as countdown are set separately:

    >>> add.s(2, 2).set(countdown=10)
    tasks.add(2, 2)
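The "recipe" idea can be sketched in plain Python: a signature is a small object holding what to call, with which arguments, and under which options, without calling anything yet. `Sig` below is a hypothetical stand-in, not Celery's signature class, and it ignores its `options` when called.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


def add(x, y):
    return x + y


@dataclass
class Sig:
    """A stored call: function, arguments, and execution options."""

    func: Callable[..., Any]
    args: tuple = ()
    options: dict = field(default_factory=dict)

    def __call__(self, *extra):
        # Arguments supplied at call time go in front of the stored ones,
        # which is what lets an incomplete signature be completed later.
        return self.func(*extra, *self.args)


complete = Sig(add, (2, 2), {"countdown": 10})  # nothing has run yet
partial_sig = Sig(add, (10,))                   # still needs one argument
```

Calling `complete()` evaluates the stored recipe, and `partial_sig(5)` completes the missing argument first. Because such an object is plain data until it is called, it can be passed around and combined into workflows.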

Slide 24

Calling a Signature

Signatures support the “Calling API” like regular tasks.

    >>> add.s(19, 23).delay().get()
    42

Slide 25

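Groups

Groups can be used to execute several tasks in parallel. A group is itself a signature, so it has to be called before a result is available:

    >>> from celery import group
    >>> res = group(add.s(2, 2), add.s(4, 4))()
    >>> res.get()
    [4, 8]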
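As a local analogy (threads instead of distributed workers), a group can be pictured as submitting every call at once and collecting the results in submission order. `run_group` and the `(function, args)` tuple format are assumptions made for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def add(x, y):
    return x + y


def run_group(calls):
    """Run all (function, args) pairs concurrently and return ordered results."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(func, *args) for func, args in calls]
        # Results are collected in submission order, not completion order.
        return [f.result() for f in futures]


group_result = run_group([(add, (2, 2)), (add, (4, 4))])
```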

Slide 26

Chains

Chains feed the result from one subtask into the next one (mul is assumed to be a task that multiplies its two arguments):

    >>> from celery import chain
    >>> # (4 + 4) * 8 * 10
    >>> res = chain(add.s(4, 4), mul.s(8), mul.s(10))()
    >>> res.get()
    640
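The data flow of a chain can be checked locally with a few lines of plain Python. This is a sketch, not Celery: `run_chain` and the `(function, args)` tuples are illustrative, and `mul` is assumed to multiply its two arguments.

```python
def add(x, y):
    return x + y


def mul(x, y):
    return x * y


def run_chain(first, *rest):
    """Evaluate the first step, then feed each result into the next step."""
    func, args = first
    result = func(*args)
    for func, args in rest:
        # The previous result becomes the first argument of the next step.
        result = func(result, *args)
    return result


chain_result = run_chain((add, (4, 4)), (mul, (8,)), (mul, (10,)))
```

Here the intermediate values are 8, then 64, then 640.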

Slide 27

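Chords

Chords are groups with a callback. The header is passed as a list, and the callback must itself be a task (here sum_list, a task that sums a list of numbers):

    >>> from celery import chord
    >>> # sum([2 + 2, 3 + 3, 4 + 4])
    >>> c = chord([add.s(2, 2), add.s(3, 3), add.s(4, 4)])
    >>> res = c(sum_list.s())
    >>> res.get()
    18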
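In plain Python terms, a chord collects the results of all header calls into a list and hands that list to the callback. The sketch below runs the header sequentially for simplicity, whereas Celery would run it in parallel; `run_chord` is a hypothetical helper.

```python
def add(x, y):
    return x + y


def run_chord(header, callback):
    """Evaluate every (function, args) pair, then pass the result list on."""
    results = [func(*args) for func, args in header]
    return callback(results)


chord_result = run_chord([(add, (2, 2)), (add, (3, 3)), (add, (4, 4))], sum)
```

The callback receives `[4, 6, 8]` as one argument, which is why it has to accept a list rather than separate numbers.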

Slide 28

Chunks

Chunking lets you divide an iterable of work into pieces:

    >>> sig = add.chunks(zip(range(100), range(100)), 10)
    >>> res = sig()
    >>> res.get()
    [[0, 2, 4, 6, 8, 10, 12, 14, 16, 18],
     [20, 22, 24, 26, 28, 30, 32, 34, 36, 38],
     ...
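The shape of that output can be reproduced locally. This sketch splits the argument pairs into pieces of 10 and applies `add` to each pair with `itertools.starmap`; `chunked` is an illustrative helper, not a Celery function.

```python
from itertools import islice, starmap


def add(x, y):
    return x + y


def chunked(iterable, n):
    """Yield successive lists of at most n items from the iterable."""
    it = iter(iterable)
    while True:
        piece = list(islice(it, n))
        if not piece:
            return
        yield piece


chunk_results = [
    list(starmap(add, piece))
    for piece in chunked(zip(range(100), range(100)), 10)
]
```

In Celery each piece becomes one task message, so 100 additions cost 10 messages instead of 100.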

Slide 29

Example: Combinations

The following example uses these three tasks:

    import itertools

    @app.task
    def add(x, y):
        return x + y

    @app.task
    def flatten(lists):
        # Return a list; a lazy iterator cannot be serialized as a result
        return list(itertools.chain.from_iterable(lists))

    @app.task
    def sum_list(numbers):
        return sum(numbers)

Slide 30

Example: Combinations (1/4)

The first signature returns 10 chunks of 10 add operations.

    >>> chunks = add.chunks(zip(range(100), range(100)), 10)

We can convert these chunks to a Celery group:

    >>> sig1 = chunks.group()

Slide 31

Example: Combinations (2/4)

The first signature will return a list of 10 lists containing 10 numbers each. We want to flatten this to a single list containing all 100 numbers.

    >>> sig2 = flatten.s()

Finally, we want to sum all numbers in that list:

    >>> sig3 = sum_list.s()

Slide 32

Example: Combinations (3/4)

We can now run these three signatures in a chain:

    >>> sig = chain(sig1, sig2, sig3)

There’s also a shortcut syntax for this:

    >>> sig = sig1 | sig2 | sig3

Slide 33

Example: Combinations (4/4)

The sig object is still just a signature: a list of instructions on how to calculate the result. Nothing has been evaluated yet.

Now let’s actually run those signatures and get the final result:

    >>> sig.delay().get()
    9900
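The final number can be verified without a broker by running the same three steps as ordinary functions. This is a local re-computation under the assumption that the tasks behave like the plain functions below; it does not exercise Celery.

```python
import itertools


def add(x, y):
    return x + y


def flatten(lists):
    return list(itertools.chain.from_iterable(lists))


def sum_list(numbers):
    return sum(numbers)


pairs = list(zip(range(100), range(100)))
# Step 1: 10 chunks of 10 additions each
chunks = [pairs[i:i + 10] for i in range(0, len(pairs), 10)]
step1 = [[add(x, y) for x, y in chunk] for chunk in chunks]
# Step 2: one flat list of 100 numbers
step2 = flatten(step1)
# Step 3: the sum of all numbers
step3 = sum_list(step2)
```

Each pair (i, i) adds up to 2i, so the total is twice the sum of 0 through 99, that is 2 × 4950 = 9900.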

Slide 34

Preview: Dynamic Tasks

The next version of Celery will support dynamic tasks: modifying a task in a chord or group at runtime.

● replace(sig): Replace the current task with a new task inheriting the same task id.
● add_to_chord(sig): Add a signature to the chord the current task is a member of.

Slide 35

Questions?