A Fair Task Execution Framework

Mourjo Sen
January 12, 2019

How do you ensure that, in a multi-tenant business running on shared infrastructure, one big customer does not bottleneck every maintenance task you have to perform on all customers’ data? With the help of Clojure’s concurrency primitives and core.async, the answer to this question turned out to be intuitive and effective.

Transcript

  1. 3.

    Running jobs on customer data
    • Maintenance is unavoidable
      ◦ Feature to release: Segment conversations into buckets
      ◦ Every conversation must be part of one bucket
      ◦ Need to add a “default bucket” to all conversations before release
  2. 4.

    Running jobs on customer data
    • Maintenance is unavoidable
      ◦ Feature to release: Segment conversations into buckets
      ◦ Every conversation must be part of one bucket
      ◦ Need to add a “default bucket” to all conversations before release
    • Uneven data distribution
  3. 5.

    Goals
    Fair
      1. Runtime independent of data distribution
      2. Only dependent on amount of data
    Easy
      To understand and use
  4. 6.

    Strategy 1: Spawn a thread for each customer
    Run each customer on a dedicated thread in a thread pool
  6. 8.

    Strategy 1: Report Card
    Goal   Result
    Fair   No
    Easy   Yes
    Easy to use, but runtime is affected by long-running jobs
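A minimal sketch of Strategy 1, to make the report card concrete. It assumes a hypothetical `process-customer!` job and made-up customer sizes (neither appears in the talk), and uses a fixed-size JVM thread pool through Java interop. Each customer is one indivisible task, so the run cannot finish before the largest customer does.

```clojure
(import '(java.util.concurrent Executors ExecutorService Future))

;; Hypothetical data: customer id -> number of records (illustrative only).
(def customer-sizes {"abc" 10, "xyz" 300, "pqr" 20})

(defn process-customer!
  "Stand-in for the real maintenance job: pretends each record costs 1 ms."
  [cust-id]
  (Thread/sleep (long (customer-sizes cust-id)))
  cust-id)

(defn run-per-customer
  "Strategy 1: submit one task per customer to a fixed-size pool and wait
  for all of them. Returns the customer ids in submission order."
  [cust-ids pool-size]
  (let [^ExecutorService pool (Executors/newFixedThreadPool pool-size)
        ;; Clojure functions implement Callable, so they can be submitted as-is.
        tasks (mapv (fn [id] (fn [] (process-customer! id))) cust-ids)]
    (try
      ;; invokeAll blocks until every customer's task has finished.
      (mapv (fn [^Future f] (.get f)) (.invokeAll pool tasks))
      (finally (.shutdown pool)))))
```

With these illustrative sizes, `(run-per-customer ["abc" "xyz" "pqr"] 3)` is held up by "xyz" alone, however many threads the pool has.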
  7. 9.

    Strategy 2: Chunk-and-work
    Decouple the size of customers from the unit of concurrency.
    Each task is a “small-enough” segment of a customer
  8. 10.

    Strategy 2: Chunk-and-work
    Decouple the size of customers from the unit of concurrency.
    Each task is a “small-enough” segment of a customer
    [{:cust-id "xyz" :month "Jan"}
     {:cust-id "xyz" :month "Feb"}
     ...
     {:cust-id "abc" :month "Jan"}
     {:cust-id "abc" :month "Feb"}
     ...]
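The task list on this slide could be generated along these lines. This is only a sketch: the function name `chunk-tasks` and the shortened month list are assumptions for illustration, not taken from the talk.

```clojure
;; Shortened month list, for illustration only.
(def months ["Jan" "Feb" "Mar"])

(defn chunk-tasks
  "Builds one task per (customer, month) pair, grouped by customer."
  [cust-ids]
  (vec (for [cust-id cust-ids
             month   months]
         {:cust-id cust-id :month month})))

(chunk-tasks ["xyz" "abc"])
;; => [{:cust-id "xyz", :month "Jan"} {:cust-id "xyz", :month "Feb"}
;;     {:cust-id "xyz", :month "Mar"} {:cust-id "abc", :month "Jan"}
;;     {:cust-id "abc", :month "Feb"} {:cust-id "abc", :month "Mar"}]
```

Each map is then an independent unit of work, so a large customer no longer occupies a single thread for its whole dataset.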
  9. 13.

    Strategy 2: Chunk-and-work
    Some timeranges contain more data than others
    Some timeranges contain too little or no data
  10. 15.

    Strategy 2: Report Card
    Goal   Result
    Fair   Mostly
    Easy   Yes
    Uneven data distribution across customers => Handled
    Uneven division of tasks => Unhandled
  11. 16.

    Strategy 2: Report Card
    Goal     Result
    Fair     Mostly
    Easy     Yes
    Robust   No
    Uneven data distribution across customers => Handled
    Uneven division of tasks => Unhandled
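The "uneven division of tasks" problem can be seen by counting how many records fall into each month-based task. The records below are invented for illustration and `task-sizes` is a hypothetical helper, not part of the talk.

```clojure
;; Hypothetical records; only :cust-id and :month matter here.
(def records
  (concat (repeat 5000 {:cust-id "xyz" :month "Jan"})
          (repeat 3    {:cust-id "xyz" :month "Feb"})
          (repeat 40   {:cust-id "abc" :month "Jan"})))

(defn task-sizes
  "Counts how many records land in each (customer, month) task."
  [recs]
  (frequencies (map (juxt :cust-id :month) recs)))

(task-sizes records)
;; => {["xyz" "Jan"] 5000, ["xyz" "Feb"] 3, ["abc" "Jan"] 40}
```

One task carries 5000 records while another carries 3, and a task for "abc" in "Feb" would carry none at all, so a fixed time-based split does not give uniformly sized units of work.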
  12. 21.

    Strategy 3: Iterator and Worker
    [{:cust-id "xyz" :month "Jan"}
     {:cust-id "xyz" :month "Feb"}
     ...
     {:cust-id "abc" :month "Jan"}
     {:cust-id "abc" :month "Feb"}
     ...]
  13. 22.

    Strategy 3: Iterator and Worker
    Dynamically chunk into uniformly sized tasks
    [{:cust-id "xyz" :month "Jan"}
     {:cust-id "xyz" :month "Feb"}
     ...
     {:cust-id "abc" :month "Jan"}
     {:cust-id "abc" :month "Feb"}
     ...]
    Give me the next 10K records where id > last-id in this timerange
  14. 23.

    Strategy 3: Iterator and Worker
    Dynamically chunk into uniformly sized tasks
    [{:cust-id "xyz" :month "Jan"}
     {:cust-id "xyz" :month "Feb"}
     ...
     {:cust-id "abc" :month "Jan"}
     {:cust-id "abc" :month "Feb"}
     ...]
    Give me the next 10K records where id > last-id in this timerange
    On-the-fly chunked tasks
    [data-id1, data-id2, ...]
  15. 24.

    Strategy 3: Iterator and Worker
    Dynamically chunk into uniformly sized tasks
    [{:cust-id "xyz" :month "Jan"}
     {:cust-id "xyz" :month "Feb"}
     ...
     {:cust-id "abc" :month "Jan"}
     {:cust-id "abc" :month "Feb"}
     ...]
    Give me the next 10K records where id > last-id in this timerange
    On-the-fly chunked tasks
    [data-id1, data-id2, ...]
    Perform the task
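The "next N records where id > last-id" step can be sketched with an in-memory vector standing in for the database. Everything here is an assumption made for illustration: the `table` data, this two-argument `query-next-batch`, the `uniform-chunks` helper, and ids that are positive and increasing.

```clojure
;; Hypothetical in-memory "table": 25 records with ids 1..25.
(def table (mapv (fn [i] {:id i}) (range 1 26)))

(defn query-next-batch
  "Stand-in for a database query: up to `limit` records with :id > last-id,
  in ascending :id order."
  [last-id limit]
  (->> table
       (filter #(> (:id %) last-id))
       (sort-by :id)
       (take limit)
       vec))

(defn uniform-chunks
  "Pages through the table, collecting chunks of at most `limit` records.
  Starts from last-id 0, which assumes all ids are positive."
  [limit]
  (loop [last-id 0
         chunks  []]
    (let [batch (query-next-batch last-id limit)]
      (if (empty? batch)
        chunks
        ;; The last id of this batch becomes the cursor for the next query.
        (recur (:id (peek batch)) (conj chunks batch))))))

(map count (uniform-chunks 10))
;; => (10 10 5)
```

Every chunk except possibly the last has the same size regardless of how the data is spread over time, which is what makes the resulting tasks uniform.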
  17. 28.

    Strategy 3: Iterator and Worker

    (require '[clojure.core.async :as async])

    ;; Launcher
    (fair-exec-job
     (let [ch (async/chan 1000)]
       {:iterator              (partial iterate-customer ch)
        :input-tasks           customer-xs-or-ch
        :chan                  ch
        :iterator-thread-count 10
        :worker                run-tasks
        :worker-thread-count   20}))
  18. 29.

    Strategy 3: Iterator and Worker

    (require '[clojure.core.async :as async])

    ;; Iterator: pages through one customer's data and emits chunks.
    ;; query-next-batch and query-limit are defined elsewhere (not shown on the slide).
    (defn iterate-customer
      [output-chan {:keys [customer query profiles-per-task] :as params}]
      (let [data    (query-next-batch query)
            num     (count data)
            last-id (:id (last data))]
        (doseq [chunk (partition-all 10000 data)]
          (async/>!! output-chan {:customer customer :chunk chunk}))
        ;; A full batch means there may be more data: fetch the next page.
        (when (= num query-limit)
          (recur output-chan (assoc params :query {:id {:$gt last-id}})))))

    ;; Launcher
    (fair-exec-job
     (let [ch (async/chan 1000)]
       {:iterator              (partial iterate-customer ch)
        :input-tasks           customer-xs-or-ch
        :chan                  ch
        :iterator-thread-count 10
        :worker                run-tasks
        :worker-thread-count   20}))
  19. 30.

    Strategy 3: Iterator and Worker

    (require '[clojure.core.async :as async])

    ;; Iterator: pages through one customer's data and emits chunks.
    ;; query-next-batch and query-limit are defined elsewhere (not shown on the slide).
    (defn iterate-customer
      [output-chan {:keys [customer query profiles-per-task] :as params}]
      (let [data    (query-next-batch query)
            num     (count data)
            last-id (:id (last data))]
        (doseq [chunk (partition-all 10000 data)]
          (async/>!! output-chan {:customer customer :chunk chunk}))
        ;; A full batch means there may be more data: fetch the next page.
        (when (= num query-limit)
          (recur output-chan (assoc params :query {:id {:$gt last-id}})))))

    ;; Worker: runs the maintenance job on one chunk.
    (defn run-tasks [{:keys [customer chunk] :as task}]
      (run-maintenance-on customer chunk))

    ;; Launcher
    (fair-exec-job
     (let [ch (async/chan 1000)]
       {:iterator              (partial iterate-customer ch)
        :input-tasks           customer-xs-or-ch
        :chan                  ch
        :iterator-thread-count 10
        :worker                run-tasks
        :worker-thread-count   20}))
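The slides show how `fair-exec-job` is called but not how it is built. The following is one possible sketch of such a launcher, under stated assumptions: `fair-exec-sketch` and `demo-run` are hypothetical names, `:input-tasks` is taken to be a plain collection, and error handling is left out. It uses only `chan`, `thread`, `>!!`, `<!!` and `close!` from core.async, and should not be read as the author's implementation.

```clojure
(require '[clojure.core.async :as async])

(defn fair-exec-sketch
  "Hypothetical stand-in for fair-exec-job. Iterator threads turn input tasks
  into chunks on `chan`; worker threads drain `chan`. Blocks until all work
  is done. Assumes :input-tasks is a collection and that the iterator
  function puts its chunks onto `chan` itself."
  [{:keys [iterator input-tasks chan iterator-thread-count
           worker worker-thread-count]}]
  (let [input-ch  (async/chan)
        ;; Feed the input tasks, then close so iterators know when to stop.
        feeder    (async/thread
                    (doseq [t input-tasks] (async/>!! input-ch t))
                    (async/close! input-ch))
        iterators (doall
                   (repeatedly iterator-thread-count
                               #(async/thread
                                  (loop []
                                    (when-some [t (async/<!! input-ch)]
                                      (iterator t)
                                      (recur))))))
        workers   (doall
                   (repeatedly worker-thread-count
                               #(async/thread
                                  (loop []
                                    (when-some [task (async/<!! chan)]
                                      (worker task)
                                      (recur))))))]
    ;; Wait for the iterators, close the chunk channel, then wait for workers.
    (async/<!! feeder)
    (doseq [it iterators] (async/<!! it))
    (async/close! chan)
    (doseq [w workers] (async/<!! w))
    :done))

(defn demo-run
  "Runs the sketch on two made-up customers and returns the chunk count."
  []
  (let [ch      (async/chan 1000)
        results (atom [])]
    (fair-exec-sketch
     {:iterator (fn [{:keys [customer ids]}]
                  (doseq [chunk (partition-all 2 ids)]
                    (async/>!! ch {:customer customer :chunk chunk})))
      :input-tasks [{:customer "xyz" :ids [1 2 3 4 5]}
                    {:customer "abc" :ids [6]}]
      :chan ch
      :iterator-thread-count 2
      :worker (fn [task] (swap! results conj task))
      :worker-thread-count 4})
    (count @results)))
```

In this sketch the two thread counts are independent knobs: the iterator count bounds how many customers are paged through at once, and the worker count bounds how many chunks are processed at once. The channel buffer applies back-pressure to the iterators when the workers fall behind.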
  20. 34.

    Conclusion
    • Concurrency => Parallelism
    • A good concurrent design => Tunable throughput
    • Clojure => Easy to harness the power of concurrency