Scheduling Async Tasks with Python Celery

Rain
March 14, 2021

- Introduction to Asynchronous (Async) Tasks
- How Celery Helps Us Manage Async Tasks
- Deep Dive into Async Task Management in a Distributed System

Transcript

  1. Agenda: Introduction to Asynchronous (Async) Tasks; How Celery Helps Us Manage Async Tasks; Deep Dive into Async Task Management in a Distributed System
  2. Introduction to Asynchronous Tasks: Synchronous vs. asynchronous; Why do we need asynchronous tasks?; Drawbacks of asynchronous tasks
  3. Asynchronous (diagram labels): Software Engineer; System design; Implement; Test; Review the PRs; Write the documents; Chat with your colleague; Save the draft; Ctrl + S; Manual design completed
  4. Asynchronous (diagram labels): Acknowledge; Study; System design; Implement; Test; Deliver; Save the draft; Commit the code; Build the artifact; Say ok to the PM; Store knowledge in mind
  5. YouTube-like Service (diagram labels): Receive video data; Respond to client; Segmentation; Encoding & Compression; Post-processing; Store playable objects
  6. YouTube-like Service (diagram labels): Receive video data; Respond to client; Segmentation; Encoding & Compression; Post-processing; Store playable objects
  7. YouTube-like Service (diagram labels): Receive video data; Respond to client; Segmentation; Encoding & Compression; Post-processing; Store playable objects; Store the raw data
  8. Asynchronous (diagram labels): Receive the request; Store the result; Reply with the response; Save the task context; Fetch the task and execute it; Retry if the task failed
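
The labels on the slide above can be read as a small workflow: the request handler saves the task context and replies right away, while a worker fetches the task later, executes it, stores the result, and retries on failure. A minimal sketch of that reading using only the standard library follows. The names `handle_request`, `worker`, `flaky_encode`, and the `MAX_RETRIES` limit are hypothetical and chosen for illustration; they are not taken from the slides or from Celery.

```python
import queue
import threading

task_queue = queue.Queue()   # stands in for the saved task context
results = {}                 # stands in for wherever results are stored
MAX_RETRIES = 3              # hypothetical limit on total attempts


def handle_request(task_id, func, *args):
    """Receive the request, save the task context, reply immediately."""
    task_queue.put({"id": task_id, "func": func, "args": args, "attempt": 0})
    return {"task_id": task_id, "status": "accepted"}


def worker():
    """Fetch tasks, execute them, store results, retry failed tasks."""
    while True:
        task = task_queue.get()
        if task is None:                 # sentinel: stop the worker
            task_queue.task_done()
            break
        try:
            results[task["id"]] = ("ok", task["func"](*task["args"]))
        except Exception as exc:
            task["attempt"] += 1
            if task["attempt"] < MAX_RETRIES:
                task_queue.put(task)     # schedule another attempt
            else:
                results[task["id"]] = ("failed", repr(exc))
        finally:
            task_queue.task_done()


calls = {"n": 0}


def flaky_encode(name):
    """Hypothetical task that fails twice before succeeding."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return f"{name}.encoded"


thread = threading.Thread(target=worker, daemon=True)
thread.start()
reply = handle_request("video-1", flaky_encode, "video")
task_queue.join()        # wait until every queued attempt has been handled
task_queue.put(None)
thread.join()
print(reply, results)
```

The caller gets its reply before any work has happened, which is the point of the asynchronous design; the cost is that failures surface later and have to be looked up by task id.
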
  9. Developers' Perspective (dialogue): "Client paid NTD$123 to our account." "Ok, I'll do it later." "How is it going now?" "It's failed lol~" "F**k" (timeline labels: Today, Yesterday)
  10. How Celery Helps Us Manage Async Tasks: Architecture Overview; Implement the Configuration and the Tasks; A Practical Demo
  11. Python Celery: Celery is a simple, flexible, and reliable distributed system to process vast amounts of messages, while providing operations with the tools required to maintain such a system. It's a task queue with focus on real-time processing, while also supporting task scheduling.
  12. (Diagram labels) Request; Response; Server; Message Broker; Workers; Result Backend; Save the task; Fetch the task; Save the result; Fetch the result
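
The three roles in the diagram above (server, message broker with workers, result backend) can be imitated in a few lines of plain Python. This is only an in-process sketch of how the pieces relate, not Celery's implementation: `send_task`, `worker_run_once`, `get_result`, and the `TASKS` registry are hypothetical names. In Celery itself the corresponding steps are roughly calling a task's `delay()` and later reading the returned `AsyncResult`.

```python
import uuid
from collections import deque

broker = deque()        # message broker: holds pending task messages
result_backend = {}     # result backend: task id -> result

# Hypothetical task registry shared by server and worker.
TASKS = {"add": lambda x, y: x + y}


def send_task(name, *args):
    """Server side: save the task in the broker and return its id."""
    task_id = str(uuid.uuid4())
    broker.append({"id": task_id, "name": name, "args": args})
    return task_id


def worker_run_once():
    """Worker side: fetch one task, run it, save the result."""
    if not broker:
        return False
    message = broker.popleft()
    result_backend[message["id"]] = TASKS[message["name"]](*message["args"])
    return True


def get_result(task_id):
    """Server side: fetch the result, or None if it is not ready yet."""
    return result_backend.get(task_id)


task_id = send_task("add", 2, 3)
pending = get_result(task_id)   # the worker has not run yet
worker_run_once()
done = get_result(task_id)
print(pending, done)
```
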
  13. Deep Dive into Async Task Management in a Distributed System: Load balancer vs. message broker; Duplicated messages; Mingle and gossip; Thundering herds
  14. Load Balancer: The load balancer dispatches workloads to the servers. It focuses on synchronous work and runs a scheduling algorithm. (Diagram labels: Load Balancer, Servers.)
  15. Message Broker: The workers retrieve workloads from the message broker. It focuses on asynchronous work and does not actively schedule tasks. (Diagram labels: Message Broker, Workers.)
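
The contrast between the two slides above is push versus pull. The sketch below illustrates it under simple assumptions: the load balancer uses round robin as its scheduling algorithm (the slides do not say which algorithm is meant), and the broker hands the head of the queue to whichever worker asks next. Both function names are hypothetical.

```python
from collections import deque
from itertools import cycle


def load_balancer_dispatch(requests, servers):
    """Push model: the balancer decides which server gets each request."""
    assignment = {server: [] for server in servers}
    for request, server in zip(requests, cycle(servers)):
        assignment[server].append(request)
    return assignment


def broker_pull(messages, ready_order):
    """Pull model: the broker only queues; workers take messages when ready."""
    pending = deque(messages)
    taken = {}
    for worker in ready_order:
        if not pending:
            break
        taken.setdefault(worker, []).append(pending.popleft())
    return taken


pushed = load_balancer_dispatch(["r1", "r2", "r3", "r4"], ["s1", "s2"])
# w1 is free more often than w2, so it ends up pulling more messages.
pulled = broker_pull(["m1", "m2", "m3", "m4"], ["w1", "w1", "w2", "w1"])
print(pushed)
print(pulled)
```

In the pull model the distribution of work follows from when workers become free, not from a decision made by the broker.
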
  16. Communication in Distributed System (exactly once / at most once / at least once delivery). Exactly once delivery: highest cost and worst performance, because it needs a complicated mechanism to ensure the delivery status (e.g. pessimistic locking).
  17. Communication in Distributed System (exactly once / at most once / at least once delivery). At most once delivery: highest performance, but messages may be lost, so it is not suitable for critical tasks (similar to UDP).
  18. Communication in Distributed System (exactly once / at most once / at least once delivery). At least once delivery: strikes a balance between reliability and performance, but messages may be duplicated.
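
The trade-off between at most once and at least once delivery can be shown with a toy model of an unreliable network. In this sketch each send attempt has one of three hypothetical outcomes: `"lost"` (the message never arrives), `"ack_lost"` (the message arrives but the acknowledgement does not), or `"ok"`. The outcome names and both functions are assumptions made for illustration.

```python
def at_most_once(message, outcomes, inbox):
    """Send once and never retry: the message may be lost."""
    if outcomes[0] != "lost":
        inbox.append(message)


def at_least_once(message, outcomes, inbox):
    """Resend until an acknowledgement arrives: the message may be duplicated."""
    for outcome in outcomes:
        if outcome != "lost":
            inbox.append(message)   # the receiver got a copy
        if outcome == "ok":
            return                  # the sender saw the ack and stops
    raise RuntimeError("gave up: no acknowledgement received")


# One scripted run of the network: first attempt lost, second delivered
# with a lost ack, third fully successful.
network = ["lost", "ack_lost", "ok"]

inbox_at_most, inbox_at_least = [], []
at_most_once("pay", network, inbox_at_most)
at_least_once("pay", network, inbox_at_least)
print(inbox_at_most, inbox_at_least)
```

With this script the at most once receiver ends up with nothing, while the at least once receiver sees the same message twice, which is why consumers in such systems are usually written to tolerate duplicates.
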
  19. Message Broker: (1) The worker polls the queue for new workload. (2) The broker dispatches the message to the worker. (3) The worker acknowledges to the broker. (4) The broker marks the message as "received". (Diagram labels: Message Broker, Worker.)
  20. Message Broker: (1) Worker 1 polls the queue for new workload. (2) The broker dispatches the message to the worker and marks it "dispatched". (3) Worker 1 acknowledges to the broker, but a network error occurs. (Diagram labels: Message Broker, Worker 1, Worker 2.)
  21. Message Broker: (1) The broker does not receive the acknowledgement before the timeout, so the message is revoked. (Diagram labels: Message Broker, Worker 1, Worker 2.)
  22. Message Broker: (1) Worker 2 polls the queue for new workload. (2) The broker dispatches the message to the worker and marks it "dispatched". (3) Worker 2 acknowledges to the broker. (4) The broker marks the message as "received". (Diagram labels: Message Broker, Worker 1, Worker 2.)
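
The four slides above describe how a lost acknowledgement leads to the same message being handed to a second worker. The sketch below replays that sequence with a tiny in-memory broker. The `Broker` class, its method names, the `"queued"` state, and the integer clock are hypothetical; the `"dispatched"` and `"received"` states come from the slides. It also assumes that Worker 1 had already run the task before its acknowledgement was lost, which the slides do not state explicitly.

```python
class Broker:
    """Toy broker that requeues messages whose ack does not arrive in time."""

    def __init__(self, ack_timeout):
        self.ack_timeout = ack_timeout
        self.messages = {}   # id -> {"body", "state", "deadline"}

    def publish(self, msg_id, body):
        self.messages[msg_id] = {"body": body, "state": "queued", "deadline": None}

    def fetch(self, now):
        """Hand out the first queued message and mark it 'dispatched'."""
        for msg_id, msg in self.messages.items():
            if msg["state"] == "queued":
                msg["state"] = "dispatched"
                msg["deadline"] = now + self.ack_timeout
                return msg_id, msg["body"]
        return None

    def ack(self, msg_id):
        self.messages[msg_id]["state"] = "received"

    def expire(self, now):
        """Put dispatched messages back on the queue once their deadline passes."""
        for msg in self.messages.values():
            if msg["state"] == "dispatched" and now >= msg["deadline"]:
                msg["state"] = "queued"


broker = Broker(ack_timeout=10)
broker.publish("m1", "charge NTD$123")
executed_by = []

# Worker 1 fetches and runs the task, but its ack never reaches the broker.
first = broker.fetch(now=0)
executed_by.append("worker-1")

# The timeout passes without an ack, so the broker takes the message back.
broker.expire(now=10)

# Worker 2 fetches the same message, runs it, and acknowledges successfully.
second = broker.fetch(now=11)
executed_by.append("worker-2")
broker.ack(second[0])

print(first == second, executed_by, broker.messages["m1"]["state"])
```
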
  23. Mingle & Gossip: Workers broadcast a large number of events for synchronization: revoked tasks (Mingle), heartbeat (Gossip), logical clock (both). (Diagram labels: Message Broker, Workers.)
  24. Mingle & Gossip: All of these events are transmitted through the message broker. Danger! (Diagram labels: Message Broker, Workers.)
  25. I suggest turning off gossip but keeping mingle enabled, since mingle can avoid duplicated messages by asking about revoked tasks
  26. And do not use a single service as both the broker and the result backend at the same time
  27. Thundering Herds (timeline): Tasks 1~10 start. Task 1 locks the DB. Task 1 finishes. Task 1 unlocks the DB. Tasks 2~10 fail to access the DB and are scheduled for retry after 30 sec. Time axis labels as extracted: 00:00, 00:04, 00:10, 00:06.
  28. Thundering Herds (timeline): Tasks 2~10 start. Task 2 locks the DB. Task 2 finishes. Task 2 unlocks the DB. Tasks 3~10 fail to access the DB and are scheduled for retry after 30 sec. Time axis labels as extracted: 00:34, 00:38, 00:44, 00:40.
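
The pattern on the last two slides, where every losing task retries after the same fixed delay and collides again, can be reproduced with a small simulation. This is a simplified sketch under stated assumptions: exactly one task wins the lock per round, all others fail and retry together after `retry_delay` seconds, and time is counted in whole rounds. It does not try to reproduce the exact timestamps shown on the slides, and the function name is hypothetical.

```python
def simulate_fixed_retry(num_tasks, retry_delay):
    """Simulate tasks competing for one DB lock with a fixed retry delay.

    Returns (wins, failed_attempts), where wins is a list of
    (task_number, start_time_of_the_successful_attempt).
    """
    waiting = list(range(1, num_tasks + 1))
    now = 0
    wins = []
    failed_attempts = 0
    while waiting:
        winner = waiting.pop(0)           # one task acquires the lock
        wins.append((winner, now))
        failed_attempts += len(waiting)   # everyone else fails this round
        now += retry_delay                # and they all come back together
    return wins, failed_attempts


wins, failed_attempts = simulate_fixed_retry(num_tasks=10, retry_delay=30)
print(wins[-1], failed_attempts)
```

Under these assumptions ten tasks need ten rounds, the last task only starts its successful attempt 270 seconds after the first, and 45 attempts are wasted along the way.
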