Danian: tail latency reduction of networking application through an O(1) scheduler

Gustavo Pantuza
September 01, 2021

Paper presented at: ISCC 2021

Abstract:
Allocating cores to application threads is a problem of considerable complexity and computational cost inside Unix systems. The Caladan scheduler aims to reduce the cost of allocating threads and cores at microsecond scale. The Danian system uses memoization to optimize the thread-picking algorithm, which selects the best thread for a given core. Such improvements directly affect applications distributed across a data center network. The cost of the thread-picking operation dropped from O(n) to O(1), CPU time fell by 7%, and tail latency fell by 3% in the Caladan Synthetic experiment and by 5% in the Netperf experiment.

Keywords:
Real Time Communication Services,
Distributed Systems Architecture and Management,
Optimization and Management,
Network Reliability,
Network Design

Transcript

  1. Danian:
    Tail latency reduction of networking
    application through an O(1) scheduler
    Gustavo Pantuza, Lucas A. C. Bleme, Marcos Augusto M. Vieira, Luiz Filipe M. Vieira
    26th IEEE Symposium on Computers and Communications
    Athens, Greece, September 5-8, 2021
    IEEE ISCC 2021

  2. Agenda
    ■ Introduction
    ■ Thread scheduling
    ■ Caladan
    ■ Danian
    ■ Experiments
    ■ Results
    ■ Future work
    ■ Conclusion

  3. Introduction
    ■ Tail at scale (2013)
    ■ Shenango (2019)
    ■ Caladan (2020)
    ■ Danian (2021)
    Hypothetical example of latency percentiles:
    p50    p95    p99
    1 ms   5 ms   10 ms
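
To make the percentile notation above concrete, here is a minimal sketch of how p50 and p99 could be computed from a set of latency samples using the nearest-rank method. The function name `percentile_us`, the microsecond unit, and the choice of nearest-rank are assumptions for illustration; the slides do not say how the percentiles were computed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator for unsigned 64-bit latencies (ascending). */
static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/*
 * Nearest-rank percentile (hypothetical helper): sorts the samples in
 * place and returns the smallest value such that at least pct percent
 * of the samples are less than or equal to it.
 */
static uint64_t percentile_us(uint64_t *samples, size_t n, unsigned int pct)
{
    assert(n > 0 && pct >= 1 && pct <= 100);
    qsort(samples, n, sizeof *samples, cmp_u64);
    size_t rank = ((size_t)pct * n + 99) / 100; /* ceil(pct * n / 100) */
    return samples[rank - 1];
}
```

With only a handful of samples, p95 and p99 both collapse onto the slowest request, which is why tail-latency measurements need many samples to be meaningful.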

  4. Thread scheduling
    ■ Lottery scheduling (1994)
    ■ Scheduler activations (1991)
    ■ Caladan (2020)

  5. Caladan
    ■ Schedules threads onto CPUs
    ■ Runs on top of DPDK
    ■ Reads control signals every 5 μs
    ■ Implemented inside Shenango

  6. Caladan
    Figure: simplified view of the Caladan architecture, based on the architecture description in the original Caladan paper

  7. Danian
    “In the 5000 years between the events of the
    Arrakis Revolt and the time the Lost Ones
    returned from The Scattering, Caladan's
    name was shortened to Dan, and all things
    pertaining to Dan were known as Danian.”
    Source: https://dune.fandom.com/wiki/Caladan

  8. Danian
    Source: https://dune.fandom.com/wiki/Caladan
    ■ Works inside Caladan's ksched
    ■ Adds a memoization array
    ■ Intercepts thread join/leave events
    ■ Algorithm to assign CPU→thread
    ■ Thread picking: O(n) → O(1)
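
One way the join/leave interception listed above could keep a per-core memoization array consistent is sketched below. This is an illustration under stated assumptions, not Danian's actual code: the hook names `memo_on_join`, `memo_on_run`, and `memo_on_leave`, the `last_core` field, and the value of `NCPU` are all hypothetical. The idea is that each thread owns at most one slot, so a departing thread can be removed in O(1) without leaving a dangling pointer.

```c
#include <assert.h>
#include <stddef.h>

#define NCPU 8 /* hypothetical core count */

struct thread {
    int id;
    int last_core; /* slot this thread may own, or -1 (assumed field) */
};

struct proc {
    struct thread *last_run[NCPU]; /* memoization array, one slot per core */
};

/* Hypothetical hook: a newly joined thread owns no slot yet. */
static void memo_on_join(struct thread *th)
{
    th->last_core = -1;
}

/* Hypothetical hook: record that th ran on core, releasing its old slot. */
static void memo_on_run(struct proc *p, struct thread *th, int core)
{
    assert(core >= 0 && core < NCPU);
    int old = th->last_core;
    if (old >= 0 && p->last_run[old] == th)
        p->last_run[old] = NULL; /* a thread owns at most one slot */
    th->last_core = core;
    p->last_run[core] = th;
}

/* Hypothetical hook: clear the slot of a departing thread in O(1). */
static void memo_on_leave(struct proc *p, struct thread *th)
{
    int core = th->last_core;
    if (core >= 0 && p->last_run[core] == th)
        p->last_run[core] = NULL; /* only if th still owns the slot */
    th->last_core = -1;
}
```

Because every update touches a constant number of slots, the bookkeeping itself stays O(1) and does not reintroduce a scan over threads or cores.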

  9. Danian
    struct proc {
        pid_t pid;
        ...
        /* memoization array, indexed by core ID */
        struct thread *last_run[NCPU];
        ...
    };

  10. Danian
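    static struct thread *
    sched_pick_last_kthread(struct proc *p, unsigned int core)
    {
        /* O(1) lookup in the memoization array */
        struct thread *th = p->last_run[core];

        /* NULL check added here, assuming unused slots hold NULL */
        if (th && !th->active)
            return th;

        /* otherwise fall back to the tail of the idle-thread list */
        return list_tail(&p->idle_threads, struct thread, idle_link);
    }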
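
The difference between an O(n) pick and the O(1) memoized pick can be sketched with a self-contained example. The types below are simplified stand-ins, not Caladan's real structures: `pick_scan`, `pick_memo`, `note_run`, `MAX_THREADS`, the `last_core` field, and the flat `threads` array are assumptions made for illustration, and the linear scan is only one plausible form of the O(n) baseline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCPU 8         /* hypothetical core count */
#define MAX_THREADS 16 /* hypothetical per-process thread limit */

struct thread {
    bool active;            /* true while the thread occupies a core */
    unsigned int last_core; /* core this thread most recently ran on */
};

struct proc {
    struct thread threads[MAX_THREADS];
    size_t nr_threads;
    struct thread *last_run[NCPU]; /* memo: last thread seen on each core */
};

/* O(n) baseline: scan all threads for an idle one that last ran on core. */
static struct thread *pick_scan(struct proc *p, unsigned int core)
{
    for (size_t i = 0; i < p->nr_threads; i++) {
        struct thread *th = &p->threads[i];
        if (!th->active && th->last_core == core)
            return th;
    }
    return NULL;
}

/* O(1) memoized pick: a single array lookup, independent of thread count. */
static struct thread *pick_memo(struct proc *p, unsigned int core)
{
    assert(core < NCPU);
    struct thread *th = p->last_run[core];
    return (th != NULL && !th->active) ? th : NULL;
}

/* Record that th ran on core so later picks can reuse the answer. */
static void note_run(struct proc *p, struct thread *th, unsigned int core)
{
    assert(core < NCPU);
    th->last_core = core;
    p->last_run[core] = th;
}
```

In this sketch the memoized pick trades a small amount of bookkeeping on every run (`note_run`) for a lookup whose cost no longer grows with the number of threads in the process.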

  11. Experiments
    ■ CloudLab
    ■ Client/Server
    ■ Latency percentiles
    ■ Varying number of threads
    ■ CPU usage
    ■ Netperf

  12. Experiments

  13. Results

  14. Results

  15. Results

  16. Results

  17. Results

  18. Results

  19. Future work
    Source: https://dune.fandom.com/wiki/Caladan
    ■ NFVs with lthreads inside Caladan
    ■ SDN using Caladan as control plane

  20. Conclusion
    ■ Thread picking from O(n) to O(1)
    ■ Memoization using LRU policy
    ■ -5% on tail latency (p99)
    ■ -15% CPU usage
