Asynchronous working with Python/Django

Shows the concept behind background processing using Pub/Sub, AMQP or similar, then introduces Celery.

Martin Alderete

November 10, 2015

Transcript

  1. Introduction to the issue: summary

     - Coupled to the Request-Response cycle
     - Cannot move it to another host
     - Prone to be a “bottleneck”
     - Error handling is not DRY
     - send_registration_email(user) is blocking
     - It does not scale!

     The “heavy” work MUST be outside of the Request-Response cycle.
     Heavy: “everything which could add an unnecessary delay or some
     overhead, or is not needed immediately”. A sketch of the blocking
     view follows.
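     A minimal sketch of the problem, assuming a hypothetical Django
     registration view and helper names:

        # views.py -- hypothetical example: the email is sent inline,
        # so the HTTP response waits for the whole SMTP round trip.
        from django.http import HttpResponse

        def register(request):
            user = create_user(request.POST)   # hypothetical helper
            send_registration_email(user)      # blocking: SMTP happens here
            return HttpResponse("Registered!")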
  2. Intro to message brokers (DS)

     Advantages:
     - Contain “FIFO-like” queues
     - Look like a dict (key-value)
     - Allow decoupling the system
     - Allow distributing the system
     - Allow communication between technologies
     - Allow scaling in a “more” natural way

     Disadvantages:
     - Now System == Distributed System
     - Adds more complexity to the stack
     - Needs more maintenance

     A sketch of the FIFO idea follows.
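     A minimal sketch of the “FIFO-like queue” idea, assuming Redis as
     the broker (the broker the talk sticks to later):

        # Producer pushes to one end of a Redis list, the consumer pops
        # from the other end: a FIFO queue.
        import redis

        r = redis.Redis()                        # assumes Redis on localhost:6379
        r.lpush("email_queue", "user:42")        # producer
        queue, message = r.brpop("email_queue")  # consumer, blocks until ready
        print(message)                           # b'user:42'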
  3. Hardcore way! Summary...

     - Simple and complex at the same time
     - Independent from the producer framework
     - Decouples the system
     - Moves the heavy work somewhere else
     - Absolute control for the devs
     - Everything must be done “from scratch”
     - Stuck to a single broker (Redis)
     - Limited scalability
     - Lots of code must be “re-written”
     - Monitoring? Administration?

     A sketch of this do-it-yourself approach follows.
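     A sketch of the “from scratch” approach: a producer and a worker
     loop hand-rolled on top of Redis. All names are hypothetical:

        import json
        import redis

        r = redis.Redis()

        def enqueue_email(user_id):
            # Producer: serialize a job and push it onto a Redis list.
            r.lpush("jobs", json.dumps({"task": "send_email",
                                        "user_id": user_id}))

        def worker():
            # Consumer: block until a job arrives, then handle it by hand.
            while True:
                _, raw = r.brpop("jobs")
                job = json.loads(raw)
                if job["task"] == "send_email":
                    print("sending email to user", job["user_id"])
                # retries, errors, monitoring: all still to be written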
  4. Proposed solution: Celery

     Celery is an asynchronous task queue/job queue based on distributed
     message passing. It is focused on real-time operation, but supports
     scheduling as well. The execution units, called tasks, are executed
     concurrently on one or more worker servers.
  5. Celery

     pip install celery

     Celery = Python + Broker + Batteries included!
     www.amqp.org
  6. Celery Application

     Entry point of everything related to Celery. Create a single
     instance (aka “app”), as sketched below.
     http://docs.celeryproject.org/en/latest/django/first-steps-with-django.html
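     A minimal sketch along the lines of the linked “first steps with
     Django” guide (Celery 3.x era); “proj” is a placeholder project
     name:

        # proj/celery.py
        import os
        from celery import Celery

        # Point Celery at the Django settings before anything else.
        os.environ.setdefault("DJANGO_SETTINGS_MODULE", "proj.settings")

        from django.conf import settings

        app = Celery("proj")
        # Read the CELERY_* options from the Django settings module.
        app.config_from_object("django.conf:settings")
        # Look for a tasks.py module in every installed Django app.
        app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)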
  7. Tasks

     The base of every Celery application. They have 2 responsibilities:
     - Define what happens when a task is called.
     - Define what to do when a worker receives a task.

     Every task has a name. Tasks are basically callable objects with
     “magic”. By convention they are placed in tasks.py and created
     using a decorator: @shared_task (from celery import shared_task).
     A sketch follows.
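     A minimal sketch of a shared task; the function body is a
     hypothetical example, not from the talk:

        # myapp/tasks.py
        from celery import shared_task
        from django.core.mail import send_mail

        @shared_task
        def send_registration_email(user_id):
            # Runs on a worker, outside the Request-Response cycle.
            send_mail(
                "Welcome!",
                "Thanks for registering.",
                "noreply@example.com",
                ["user-{}@example.com".format(user_id)],  # placeholder
            )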
  8. Tasks

     They have many attributes which allow defining how the task
     behaves, for example:
     - Task.name
     - Task.bind
     - Task.queue
     - Task.max_retries
     - Task.default_retry_delay
     - Task.rate_limit
     - Task.time_limit
     - Task.soft_time_limit
     - Task.ignore_result
     - several more…
     A sketch follows the list.
     http://celery.readthedocs.org/en/latest/reference/celery.app.task.html
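     A sketch of setting some of these options through the decorator;
     the retry logic is a hypothetical example:

        from celery import shared_task

        @shared_task(bind=True, max_retries=3, default_retry_delay=60,
                     rate_limit="10/m", ignore_result=True)
        def sync_profile(self, user_id):
            try:
                pass  # call some flaky external service (placeholder)
            except IOError as exc:
                # bind=True makes `self` the task instance, enabling retry.
                raise self.retry(exc=exc)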
  9. Routing

     Routing is the mechanism by which we decide which queue should
     receive the message for a new task on its way to a worker.
     RECOMMENDED instead of a hardcoded ‘queue’ on the TASK!!!
     A sketch follows.
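     A sketch of routing through configuration instead of a hardcoded
     queue; the task paths are hypothetical (Celery 3.x-era setting
     name):

        # settings.py
        CELERY_ROUTES = {
            "myapp.tasks.send_registration_email": {"queue": "emails"},
            "myapp.tasks.generate_report": {"queue": "heavy"},
        }

     A worker can then consume only one queue:
     celery -A projName worker -Q emails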
  10. Tasks: Calling

      2 ways:
      - Using a shortcut with the options defined at the moment the
        task is created (@shared_task, @task):
        Task.delay(arg1, kwarg1=value1)
      - Using the “long” way, which allows customizing a call by
        overriding the default options:
        Task.apply_async(args=l, kwargs=d, **options)
      Both are sketched below.
      http://docs.celeryproject.org/en/latest/reference/celery.app.task.html#celery.app.task.Task.apply_async
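      A sketch of both calling styles for the task defined earlier
      (module path is hypothetical):

         from myapp.tasks import send_registration_email

         # Shortcut: uses the options baked in when the task was defined.
         send_registration_email.delay(42)

         # Long form: override options for this one call.
         send_registration_email.apply_async(
             args=[42],
             queue="emails",  # route this call explicitly
             countdown=10,    # wait 10 seconds before it may execute
         )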
  11. Workers: Consuming tasks

      celery -A projName worker --loglevel=info
      celery -A projName worker --concurrency=10
      celery -A proj worker -P eventlet -c 1000
      celery -A projName worker --autoscale=10,3
      celery worker --help
      http://docs.celeryproject.org/en/latest/userguide/workers.html
  12. AsyncResult

      Celery provides us (when possible) an AsyncResult (a future) with
      the result of a task. To make this possible, Celery uses a backend
      where it stores the results of the tasks. The backend is
      configured via CELERY_RESULT_BACKEND. There are a few available
      backends: cache (memcached), mongodb, redis, amqp, etc. Each
      backend has its own configuration. A sketch follows.
      http://celery.readthedocs.org/en/latest/configuration.html#celery-result-backend
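      A sketch of working with an AsyncResult, assuming a configured
      backend such as CELERY_RESULT_BACKEND = "redis://localhost:6379/0":

         from myapp.tasks import send_registration_email  # hypothetical

         result = send_registration_email.delay(42)  # an AsyncResult
         print(result.id)        # task id; can be stored and looked up later
         print(result.ready())   # has the task finished yet?
         print(result.get(timeout=10))  # block until the result arrives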
  13. Celery Canvas

      Celery provides mechanisms to group and chain tasks, add
      callbacks, and process chunks. For this purpose Celery uses
      something called PRIMITIVES:
      - group: executes tasks in parallel.
      - chain: links tasks, adds a callback ( f(g(a)) ).
      - chord: a group plus a callback (a barrier).
      - map: similar to Python’s map().
      - chunks: splits a list of elements into small parts.
      A sketch follows the list.
      http://docs.celeryproject.org/en/latest/userguide/canvas.html
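      A sketch of the main primitives with two hypothetical tasks,
      add(x, y) and total(numbers):

         from celery import chain, chord, group
         from myapp.tasks import add, total  # hypothetical tasks

         # group: run the signatures in parallel.
         group(add.s(i, i) for i in range(10))()

         # chain: feed the result of add(2, 2) into add(?, 4) -> f(g(a)).
         chain(add.s(2, 2), add.s(4))()

         # chord: a group whose collected results go to a callback.
         chord(add.s(i, i) for i in range(10))(total.s())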
  14. Monitoring: Celery Flower

      Real-time monitoring:
      - Progress and history.
      - Details about the tasks.
      - Graphs and stats.

      Remote control:
      - Status and stats of the workers.
      - Shut down or restart workers.
      - Control autoscaling and pool size.
      - See task execution status.
      - Queue administration.
      - Etc…

      pip install flower
      celery -A projName flower --port=5555
  15. Thoughts and conclusions

      Message brokers:
      - Came to stay.
      - Allow systems with flexible architectures.
      - Allow communication between technologies.
      - AMQP is a good protocol (www.amqp.org).

      Distributed systems:
      - Are complex but scalable.
      - Add complexity to the stack.
      - Allow distributing workloads.
      - Require maintenance/monitoring.
      - Are harder to debug (even more so with multiple workers).
      - More services, but smaller ones (micro-services).
  16. Thoughts and conclusions

      Celery:
      - Is the framework for distributed systems.
      - Is the framework every Pythonista should try when playing with DS.
      - Is a mature project with good support.
      - Has good documentation.
      - Is simple to configure and run.
      - Is a WORLD to learn and understand in depth.
      - Can be extended “easily” (signals, management commands, remotes).
      - Has LOTS of settings and features.
      - Should be monitored like any normal service.
      - Something that I do not know… =)