Asynchronous working with Python/Django

Shows the concept behind background processing using Pub/Sub, AMQP, or similar, then introduces Celery.

Martin Alderete

November 10, 2015

Transcript

  1. “Asynchronous working with
    Python/Django”
    Martin Alderete
    @alderetemartin

  2. Introduction to the issue
    Remember: Quicker is BETTER!

  3. Introduction to the issue
    “Summary”
    Coupled to the Request-Response cycle
    Cannot be moved to another host
    Prone to becoming a “bottleneck”
    Error handling is not DRY
    send_registration_email(user) is blocking
    It does not scale!
    The “heavy” work MUST happen outside of the
    Request-Response cycle.
    Heavy: “Anything that adds an unnecessary delay or
    overhead, or whose result is not needed immediately”

  4. Intro to message Brokers (DS)
    Distributed System
    Application (producer) → messages → Broker (abstraction) → messages → Worker (consumer)
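    The producer/broker/worker flow can be sketched with the standard library alone. This is only an illustration: an in-process queue.Queue stands in for a real broker, and send_registration_email and register are hypothetical names modeled on the earlier slide.

```python
import queue
import threading

# The "broker": a thread-safe FIFO queue standing in for Redis or RabbitMQ.
broker = queue.Queue()
processed = []  # records the work the worker has done


def send_registration_email(user):
    """Hypothetical heavy job; a real one would talk to an SMTP server."""
    processed.append(f"email sent to {user}")


def worker():
    """Consumer: take messages off the queue until a None sentinel arrives."""
    while True:
        message = broker.get()
        if message is None:
            break
        send_registration_email(message["user"])


def register(user):
    """Producer: enqueue the heavy work and return immediately."""
    broker.put({"user": user})
    return "registered"


consumer = threading.Thread(target=worker)
consumer.start()
status = register("alice")
broker.put(None)  # tell the worker to stop
consumer.join()
```

    The producer never calls the heavy function itself, so the request can return before the email is sent.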

  5. Intro to message Brokers (DS)
    Redis, RabbitMQ, MongoDB and more...

  6. Intro to message Brokers (DS)
    Advantages
    Contains “FIFO-like” Queues
    Looks like a DICT (key-value)
    Lets you decouple the system
    Lets you distribute the system
    Lets different technologies communicate
    Lets you scale in a “more” natural way
    Disadvantages
    Now System == Distributed System
    Adds more complexity to the stack
    Needs more maintenance

  7. Proposed solution: Hardcore...
    Hardcore = Python + Broker + handmade worker

  8. Hardcore way! Producer
    pip install redis
    redis_conn = redis.Redis(host='localhost', port=6379)
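    The slide's code is not in the transcript, so the following is a minimal sketch of what a handmade producer might look like, not the author's implementation. It assumes a Redis list is used as the queue; the key name "tasks", the message layout, and the enqueue helper are all hypothetical. The connection is passed in, so any client with an rpush method (such as redis.Redis) works.

```python
import json

QUEUE_NAME = "tasks"  # hypothetical list key used as the queue


def enqueue(conn, task_name, **kwargs):
    """Serialize a job description and push it onto the tail of a Redis list."""
    message = json.dumps({"task": task_name, "kwargs": kwargs})
    conn.rpush(QUEUE_NAME, message)
    return message


# With redis-py and a running server:
#   import redis
#   redis_conn = redis.Redis(host='localhost', port=6379)
#   enqueue(redis_conn, "send_registration_email", user_id=42)
```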

  9. Hardcore way! Consumer
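    The consumer code is also missing from the transcript. A handmade worker could be sketched as below, assuming messages are JSON objects stored in a Redis list; the key "tasks", the HANDLERS table, and the handler itself are hypothetical. The connection only needs a blpop method, which redis.Redis provides.

```python
import json

QUEUE_NAME = "tasks"  # must match the key the producer pushes to


def send_registration_email(user_id):
    """Hypothetical handler; a real one would send the email."""
    return f"email sent to user {user_id}"


# Maps task names found in messages to the functions that run them.
HANDLERS = {"send_registration_email": send_registration_email}


def consume_one(conn, timeout=5):
    """Block until a message arrives (or the timeout expires) and run it."""
    item = conn.blpop(QUEUE_NAME, timeout=timeout)
    if item is None:  # timed out, nothing to do
        return None
    _key, raw = item  # blpop returns a (key, value) pair
    message = json.loads(raw)
    handler = HANDLERS[message["task"]]
    return handler(**message["kwargs"])


# With redis-py and a running server, the worker is just a loop:
#   import redis
#   conn = redis.Redis(host='localhost', port=6379)
#   while True:
#       consume_one(conn)
```

    Retries, error handling, and monitoring are all left out here, which is exactly the work the following summary says must be done from scratch.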

  10. Hardcore way!
    Summary...
    Simple, yet complex
    Independent of the producer's framework
    Decouples the system
    Moves the heavy work elsewhere
    Gives devs absolute control
    Everything must be built “from scratch”
    Tied to a single broker (Redis)
    Limited scalability
    Lots of code has to be “re-written”
    Monitoring?
    Administration?

  11. Proposed solutions: Celery
    Celery is an asynchronous task queue/job queue
    based on distributed message passing. It is focused
    on real-time operation, but supports scheduling as
    well.
    The execution units, called tasks, are executed
    concurrently on a single or more worker servers.

  12. Celery
    pip install celery
    Celery = Python + Broker + Batteries included!
    www.amqp.org

  13. Celery Application
    Entry point for everything related to Celery.
    Create a single instance (aka “app”).
    http://docs.celeryproject.org/en/latest/django/first-steps-with-django.html

  14. Tasks
    The basis of every Celery application.
    They have 2 responsibilities:
    Define what happens when a task is called.
    Define what to do when a worker receives a task.
    Every task has a name.
    Basically callable objects with “magic”.
    By convention they are placed in tasks.py
    Created using a decorator:
    @shared_task (from celery import shared_task)

  15. Tasks
    They have many attributes that define how
    the task behaves, for example:
    Task.name
    Task.bind
    Task.queue
    Task.max_retries
    Task.default_retry_delay
    Task.rate_limit
    Task.time_limit
    Task.soft_time_limit
    Task.ignore_result
    several more…
    http://celery.readthedocs.org/en/latest/reference/celery.app.task.html

  16. Real world task (tasks.py)

  17. Routing
    Routing is the mechanism that decides which queue
    receives the message for a new task on its way to
    a worker.
    RECOMMENDED instead of hardcoding ‘queue’ on the
    TASK!!!
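    A routing table is plain configuration. The fragment below is a sketch using the Celery 4+ setting name task_routes (Celery 3.x, current when this talk was given, calls it CELERY_ROUTES); the task paths and queue names are hypothetical.

```python
# Maps task names (or glob patterns) to the queue that should receive them.
task_routes = {
    "accounts.tasks.send_registration_email": {"queue": "emails"},
    "reports.tasks.*": {"queue": "reports"},
}
```

    A worker can then be dedicated to one queue, for example: celery -A projName worker -Q emails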

  18. Tasks: Calling
    2 ways:
    The shortcut, which uses the options defined when the
    task was created (@shared_task, @task).
    Task.delay(arg1, kwarg1=value1)
    The “long” way, which lets you customize a single call
    by overriding the default options.
    Task.apply_async(args=l, kwargs=d, **options)
    http://docs.celeryproject.org/en/latest/reference/celery.app.task.html#celery.app.task.Task.apply_async

  19. Workers: Consuming tasks
    celery -A projName worker --loglevel=info
    celery -A projName worker --concurrency=10
    celery -A proj worker -P eventlet -c 1000
    celery -A projName worker --autoscale=10,3
    celery worker --help
    http://docs.celeryproject.org/en/latest/userguide/workers.html

  20. AsyncResult
    Celery gives us (when possible) an AsyncResult (a future)
    holding the result of a task.
    To make this possible Celery uses a backend where it
    stores the results of the tasks.
    The backend is configured with
    CELERY_RESULT_BACKEND
    There are a few available backends:
    cache (memcached), mongodb, redis, amqp, etc.
    Each backend has its own configuration.
    http://celery.readthedocs.org/en/latest/configuration.html#celery-result-backend

  21. AsyncResult: API
    A few methods and attributes…
    http://celery.readthedocs.org/en/latest/reference/celery.result.html

  22. AsyncResult: API (from a task_id)
    http://celery.readthedocs.org/en/latest/reference/celery.result.html

  23. CeleryBeat
    A scheduler for periodic tasks (bye, cron?).
    celery -A projName beat

  24. Celery Canvas
    Celery provides mechanisms to group and chain tasks, add
    callbacks, and process work in chunks.
    For this purpose Celery uses building blocks called
    PRIMITIVES.
    group: Executes tasks in parallel.
    chain: Links tasks, adding callbacks ( f(g(a)) ).
    chord: A group plus a callback (barrier).
    map: Similar to Python's map().
    chunks: Splits a list of elements into smaller parts.
    http://docs.celeryproject.org/en/latest/userguide/canvas.html

  25. Monitoring: Celery Flower

  26. Monitoring: Celery Flower
    Real-time monitoring:
    Progress and history.
    Details about the tasks.
    Graphs and stats.
    Remote control:
    Status and stats of the workers.
    Shut down or restart workers.
    Control autoscaling and pool size.
    See task execution status.
    Queue administration.
    Etc.
    pip install flower
    celery -A projName flower --port=5555

  27. Thoughts and conclusions
    Message brokers:
    Are here to stay
    Allow systems with flexible architectures
    Allow communication between technologies
    AMQP is a good protocol (www.amqp.org).
    Distributed systems:
    Are complex but scalable
    Add complexity to the stack
    Allow distributing workloads
    Require maintenance/monitoring
    Are harder to debug (even more so with multiple workers)
    More, but smaller, services (micro-services)

  28. Thoughts and conclusions
    Celery:
    Is the framework for distributed systems
    Is the framework that every Pythonista should try
    when playing with DS.
    Is a mature project with good support.
    Has good documentation.
    Is simple to configure and run.
    Is a WORLD of its own to learn and understand in depth.
    Can be extended “easily”
    (signals, management commands, remotes).
    Has LOTS of settings and features.
    Should be monitored like any other service.
    Something that I do not know… =)

  29. Extra: Distributed Locking REDIS
    https://gist.github.com/malderete/449cf92a16983c0bb412

  30. Extra: Distributed Locking (Django’s cache)
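    The code for this slide is missing from the transcript. A common way to build such a lock relies on the cache's add() call, which only stores a key if it does not exist yet. The sketch below assumes that approach and is not the author's code; cache_lock and LOCK_EXPIRE are hypothetical names, and the cache object is passed in so that Django's cache (from django.core.cache import cache) or any object with add and delete can be used. Whether add() is truly atomic depends on the cache backend (memcached is the usual choice).

```python
from contextlib import contextmanager

LOCK_EXPIRE = 60 * 5  # seconds; the lock disappears on its own if a worker dies


@contextmanager
def cache_lock(cache, lock_id, owner):
    """Yield True if this caller acquired the lock, False otherwise."""
    # add() stores the key only when it is absent, so a single caller wins.
    acquired = cache.add(lock_id, owner, LOCK_EXPIRE)
    try:
        yield acquired
    finally:
        # Only the caller that took the lock may release it.
        if acquired:
            cache.delete(lock_id)


# Inside a task, with Django's cache:
#   from django.core.cache import cache
#   with cache_lock(cache, "import-feed-lock", self.app.oid) as acquired:
#       if acquired:
#           ...do the work that must not run twice at once...
```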

  31. Thank you very much
    Questions?
    Martin Alderete
    @alderetemartin

  32. Thank you very much
    Questions?
    while not manos.adormecida:
        aplaudir()
    Martin Alderete
    @alderetemartin

  33. Links
    http://www.celeryproject.org/
    http://redis.io/
    http://www.rabbitmq.com/
    http://www.amqp.org/
    http://zookeeper.apache.org/
    https://github.com/brolewis/celery_mutex
    https://gist.github.com/malderete
