
NFVnice: Dynamic Backpressure and Scheduling for NFV Service Chains


Group meeting presentation of CANLAB in NTHU


JackKuo

August 13, 2020

Transcript

  1. NFVnice: Dynamic Backpressure and Scheduling for NFV Service Chains Speaker

    : Chun-Fu Kuo Date : 2020/08/13 Sameer G Kulkarni, Wei Zhang, Jinho Hwang, Shriram Rajagopalan, K.K. Ramakrishnan, Timothy Wood, Mayutan Arumaithurai and Xiaoming Fu SIGCOMM 2017 1 Communications and Networking Lab, NTHU
  2. Outline • Introduction • System Model • Problem Formulation •

    Proposed Method • Evaluation • Conclusion • Pros and Cons Communications and Networking Lab, NTHU 2
  3. Introduction Communications and Networking Lab, NTHU 3 Nice (Unix Program)

    • Used to invoke a program with a particular CPU priority • The niceness value is shown in the `top` command • Range from -20 (highest priority) to 19 (lowest priority) • Priority (share of the CPU time) is proportional to 20 − niceness • Max priority ratio: (20 − (−20)) / (20 − 19) = 40
  4. Introduction Communications and Networking Lab, NTHU 4 Priority (Unix) •

    Range from -100 (highest priority) to 40 (lowest priority) • Calculation • Normal process: PR = 20 + NI (NI ranges from -20 to 19) • Real-time process: PR = -1 - real_time_priority (real_time_priority ranges from 1 to 99) • RT means PR = -100 (see the sketch below)
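
As a quick illustration of the niceness/priority relationship on these two slides, here is a minimal sketch (the nice value 10 is arbitrary, not from the slides) using the standard `setpriority`/`getpriority` calls:

```c
#include <stdio.h>
#include <sys/resource.h>

/* Minimal sketch: raise this process's nice value and read it back.
 * For a normal process, top displays PR = 20 + NI, so NI = 10 shows
 * up as PR = 30. */
int main(void) {
    if (setpriority(PRIO_PROCESS, 0, 10) != 0) {  /* 0 = calling process */
        perror("setpriority");
        return 1;
    }
    int ni = getpriority(PRIO_PROCESS, 0);        /* niceness: -20 .. 19 */
    printf("nice = %d, top would show PR = %d\n", ni, 20 + ni);
    return 0;
}
```
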
  5. Introduction Communications and Networking Lab, NTHU 5 Cgroups (Linux) •

    Abbreviated from “control groups” • A Linux kernel feature • Limits, accounts for, and isolates the resource usage (CPU, RAM, …)
  6. Introduction Communications and Networking Lab, NTHU 6 CPU Shares •

    A CPU subsystem of cgroups • A relative amount of CPU time for a task to run CPU Quota • A CPU subsystem of cgroups • Enforces a hard limit on the CPU time allocated to processes CPUSET • A CPU subsystem of cgroups • Limits the specific CPUs or cores a task can use (see the sketch below)
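
A minimal sketch of how these knobs are driven in practice, assuming the legacy cgroup v1 `cpu` controller mounted under `/sys/fs/cgroup` and a hypothetical group named `nf1` (paths and values are illustrative, not from NFVnice):

```c
#include <stdio.h>

/* Sketch: write one value into a cgroup v1 control file.
 * The mount point and the group name "nf1" are illustrative. */
static int cgroup_write(const char *path, long value) {
    FILE *f = fopen(path, "w");
    if (!f) { perror(path); return -1; }
    fprintf(f, "%ld\n", value);
    return fclose(f);
}

int main(void) {
    /* CPU shares: relative weight (default 1024), here 2x a default group. */
    cgroup_write("/sys/fs/cgroup/cpu/nf1/cpu.shares", 2048);
    /* CPU quota: hard cap of 50 ms of CPU per 100 ms period. */
    cgroup_write("/sys/fs/cgroup/cpu/nf1/cpu.cfs_period_us", 100000);
    cgroup_write("/sys/fs/cgroup/cpu/nf1/cpu.cfs_quota_us", 50000);
    /* CPUSET would be configured analogously via cpuset.cpus. */
    return 0;
}
```
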
  7. Introduction Communications and Networking Lab, NTHU 7 Common Schedulers •

    CFS • The default scheduler since Linux 2.6.23 • Implemented with a red-black tree • Based on each process's cumulative run time • Timeslice is not fixed • CFS batch • Similar to CFS • Longer time quantum (fewer context switches) • Round robin • Cycles through each process • Default 100 ms in this paper's evaluation (see the sketch below)
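
For reference, a small sketch of how a process can be placed under CFS batch (`SCHED_BATCH`) with the standard `sched_setscheduler` call; this is generic Linux usage, not NFVnice code:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

/* Sketch: move the calling process to SCHED_BATCH (CFS batch), which
 * gives longer timeslices and fewer preemptions.  SCHED_BATCH takes no
 * static priority, so sched_priority must be 0. */
int main(void) {
    struct sched_param param = { .sched_priority = 0 };
    if (sched_setscheduler(0, SCHED_BATCH, &param) != 0) {   /* 0 = self */
        perror("sched_setscheduler");
        return 1;
    }
    printf("now running under SCHED_BATCH\n");
    return 0;
}
```
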
  8. Introduction Communications and Networking Lab, NTHU 8 ECN, ECE, CWR

    • ECN (Explicit Congestion Notification) [diagram of ECN, ECE, and CWR signaling under congestion]
  9. System Model Communications and Networking Lab, NTHU 9

  10. Problem Formulation Communications and Networking Lab, NTHU 10 • Throughput

    should be equal for 2 VNF chains if their arrival rates are equal • Even if chain A has 2x the processing cost of chain B • The current Linux CFS cannot achieve this • It doesn't know the state of each NF • Unfair scheduling causes processing waste (packets are dropped too late) [figure with example rates: 30 Mbps, 20 Mbps, 40 Mbps, Drop: 10 Mbps, 30 Mbps, 30 Mbps, 20 Mbps]
  11. Problem Formulation Communications and Networking Lab, NTHU 11 Simple Test

    on 3 Schedulers • Environment • Only single core • DPDK-based NFV platform • Load • Even load • Each NF has 5 Mpps • Uneven load • NF1: 6 Mpps • NF2: 6 Mpps • NF3: 3 Mpps
  12. Problem Formulation Communications and Networking Lab, NTHU 12 Simple Test

    on 3 Schedulers: Result for Homogeneous • All 3 NFs have equal computation cost (roughly 250 CPU cycles)
  13. Problem Formulation Communications and Networking Lab, NTHU 13 Simple Test

    on 3 Schedulers: Result for Homogeneous • All 3 NFs have equal computation cost (roughly 250 CPU cycles)
  14. Problem Formulation Communications and Networking Lab, NTHU 14 Simple Test

    on 3 Schedulers: Result for Heterogeneous • NF1→ 500, NF2→250, NF3→50 CPU cycles
  15. Problem Formulation Communications and Networking Lab, NTHU 15 Simple Test

    on 3 Schedulers: Result for Heterogeneous • NF1→ 500, NF2→250, NF3→50 CPU cycles
  16. Proposed Method Communications and Networking Lab, NTHU 16 • NFVnice

    • User space’s NF scheduler • Service chain management framework • Scheduler-agnostic • Features • Base on arrival rate, processing cost: auto tuning CPU parameters • Service chain level backpressure (congestion control)
  17. Proposed Method Communications and Networking Lab, NTHU 17 • NFVnice

    can • Monitor the average per-packet computation time of each NF • Monitor each NF's queue size • Monitor I/O activities • Track NF state in the chain (overloaded, blocked on I/O) • Libnf • A library supporting this framework • Capabilities: • Efficiently read/write packets • Overlap processing with non-blocking async I/O • Schedule/deschedule NFs
  18. Proposed Method Communications and Networking Lab, NTHU 18 Scheduling NFs:

    Activating NFs • Why • NFs otherwise busy-wait (poll mode) • Wasting CPU • Previous work • ClickOS, netmap • But both are too simple: only on and off • In NFVnice • NFs sleep by blocking on a semaphore (shared with the NF Manager) • Based on the NF's own queue and its downstream queues • So there is no need to provide this information to the OS scheduler
  19. Proposed Method Communications and Networking Lab, NTHU 19 Scheduling NFs:

    Relinquishing the CPU • Whether to process the next batch of packets • The NF calls libnf for the decision • Libnf checks a flag set in shared memory by the NF Manager • If true: block on the semaphore until notified by the manager • This provides a flexible way to ask an NF to give up the CPU (see the sketch below)
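
A minimal sketch of this block/wake handshake, with hypothetical names (`nf_ctrl`, `maybe_yield`, `wake_nf`); it is not the libnf API, only an illustration of the shared-flag-plus-semaphore idea:

```c
#include <semaphore.h>
#include <stdatomic.h>

/* Hypothetical control block mapped into shared memory by both the NF
 * Manager and the NF.  The semaphore is created process-shared at setup,
 * e.g. sem_init(&ctrl->wakeup, 1, 0). */
struct nf_ctrl {
    atomic_int should_block;   /* set by the NF Manager                */
    sem_t      wakeup;         /* NF blocks here; the manager posts it */
};

/* Called by the NF (via a libnf-style helper) before fetching the next
 * batch: if the manager asked it to yield, sleep instead of busy-waiting. */
static void maybe_yield(struct nf_ctrl *ctrl) {
    while (atomic_load(&ctrl->should_block))
        sem_wait(&ctrl->wakeup);
}

/* Called from the NF Manager's wakeup thread once the NF may run again. */
static void wake_nf(struct nf_ctrl *ctrl) {
    atomic_store(&ctrl->should_block, 0);
    sem_post(&ctrl->wakeup);
}
```
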
  20. Proposed Method Communications and Networking Lab, NTHU 20 Scheduling NFs:

    CPU Scheduler • Multiple NF processes are likely to be runnable • The scheduler has to: • Determine which process to run, and for how long • But the cost of synchronizing this information into the kernel is high • NFVnice instead carefully: • Controls NFs in user space (including when they yield) • CFS batch is a good fit: • Longer running time • Less frequent preemption
  21. Proposed Method Communications and Networking Lab, NTHU 21 Scheduling NFs:

    Assigning CPU Weight • NFVnice estimates each NF's CPU requirement in real time • To avoid outliers from skewing these measurements: • A histogram of timings is maintained • For each NF i on a shared core m: • total_load(m) = Σ_i load(i) • load(i) = λ_i × s_i • λ_i: arrival rate • s_i: service time • share_i = prio_i × load(i) / total_load(m) (see the sketch below)
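
A small sketch of this per-core share computation (field names and priority values are illustrative; this is not the NFVnice source):

```c
#include <stdio.h>

/* Illustrative per-NF inputs; not the NFVnice data structures. */
struct nf {
    double arrival_rate;   /* lambda_i: packets per second */
    double service_time;   /* s_i: CPU seconds per packet  */
    double priority;       /* relative priority of this NF */
};

/* share_i = prio_i * load(i) / total_load, with load(i) = lambda_i * s_i */
static void assign_cpu_shares(const struct nf *nfs, int n, double *share) {
    double total_load = 0.0;
    for (int i = 0; i < n; i++)
        total_load += nfs[i].arrival_rate * nfs[i].service_time;
    for (int i = 0; i < n; i++) {
        double load = nfs[i].arrival_rate * nfs[i].service_time;
        share[i] = (total_load > 0.0) ? nfs[i].priority * load / total_load : 0.0;
    }
}

int main(void) {
    struct nf core[3] = {   /* three NFs sharing one core; made-up numbers */
        { 5e6, 200e-9, 1.0 },
        { 5e6, 100e-9, 1.0 },
        { 5e6,  20e-9, 1.0 },
    };
    double share[3];
    assign_cpu_shares(core, 3, share);
    for (int i = 0; i < 3; i++)
        printf("NF%d relative CPU share = %.2f\n", i + 1, share[i]);
    return 0;
}
```
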
  22. Proposed Method Communications and Networking Lab, NTHU 22 Backpressure: Cross-Chain

    Pressure • When the NF Manager's TX thread detects: 1. Receive queue length for an NF > HIGH_WATER_MARK 2. Queuing time > threshold • Then • Determine which flows have packets in that queue • Drop those flows at the upstream NFs • When • Queue length < LOW_WATER_MARK • Re-enable the flows at the upstream NFs (see the sketch below)
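
A minimal sketch of the watermark hysteresis, with hypothetical threshold values (the slides do not give the actual HIGH_WATER_MARK/LOW_WATER_MARK numbers):

```c
#include <stdbool.h>

/* Hypothetical thresholds; the slides do not give the actual values. */
#define HIGH_WATER_MARK 512   /* packets queued at an NF's receive queue */
#define LOW_WATER_MARK  128

/* Per-NF backpressure state with hysteresis: start throttling the flows
 * of this queue at the upstream NFs once it crosses the high watermark,
 * and only stop once it has drained below the low watermark. */
static bool update_backpressure(bool throttling, unsigned queue_len) {
    if (!throttling && queue_len > HIGH_WATER_MARK)
        return true;    /* drop these flows' packets at upstream NFs */
    if (throttling && queue_len < LOW_WATER_MARK)
        return false;   /* re-enable the flows at upstream NFs       */
    return throttling;
}
```
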
  23. Proposed Method Communications and Networking Lab, NTHU 23 Backpressure: Local

    Optimization and ECN • NFVnice provides simple local backpressure • When the output TX queue becomes full → block • Use cases: • Downstream NFs are slow • The NF Manager TX thread is overloaded • NF-driven, not via the Manager • Mark the ECN bits in TCP flows • Facilitates end-to-end congestion management (see the sketch below)
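
For the ECN part, a small sketch of marking Congestion Experienced in the IPv4 header (generic RFC 3168 behaviour, not NFVnice code; the checksum update is omitted):

```c
#include <stdint.h>

#define IP_ECN_MASK 0x03   /* low two bits of the IPv4 TOS byte  */
#define IP_ECN_CE   0x03   /* Congestion Experienced (binary 11) */

/* Sketch: mark CE on an ECN-capable packet so the TCP receiver echoes
 * ECE back to the sender.  Byte 1 of the IPv4 header is the DSCP/ECN
 * field; updating the IPv4 checksum is omitted for brevity. */
static void mark_ecn_ce(uint8_t *ipv4_hdr) {
    uint8_t tos = ipv4_hdr[1];
    if ((tos & IP_ECN_MASK) != 0)             /* only ECT(0)/ECT(1) packets */
        ipv4_hdr[1] = (uint8_t)(tos | IP_ECN_CE);
}
```
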
  24. Proposed Method Communications and Networking Lab, NTHU 24 Facilitating I/O:

    • Reasons an NF may block: • Its ring buffer is empty • It is waiting for I/O requests to complete • NFVnice makes use of async I/O
  25. Proposed Method Communications and Networking Lab, NTHU 25 Facilitating I/O:

    If the receive ring buffer is empty, libnf notifies the NF Manager to block this NF [diagram of asynchronous I/O handling]
  26. Proposed Method Communications and Networking Lab, NTHU 26 Optimizations: Separating

    Overload Detection and Control • Because the NFV platform processes millions of packets per second • Overload detection is separated out from the control mechanism • The NF Manager's TX thread enqueues a packet to an NF's RX queue • Only when the queue is < HIGH_WATER_MARK • The enqueue's return value reports the state of the queue (written to the NF's metadata) • The NF Manager's Wakeup thread • Scans all NFs and classifies them into 2 categories: 1. Backpressure should be applied 2. The NF needs to be woken up • This provides some hysteresis control (see the sketch below)
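
A minimal sketch of the Wakeup thread's classification pass, with hypothetical metadata fields and thresholds (not the NFVnice data structures):

```c
#include <stdbool.h>

/* Hypothetical per-NF metadata that the TX thread updates on enqueue. */
struct nf_meta {
    unsigned rx_queue_len;   /* state reported by the enqueue call      */
    bool     sleeping;       /* NF is currently blocked on its semaphore */
};

enum nf_action { ACT_NONE, ACT_BACKPRESSURE, ACT_WAKE };

/* Sketch of the Wakeup thread's pass: it only reads state recorded on
 * the data path and decides what control action to take, keeping the
 * heavier work off the packet-processing path.  Thresholds are made up. */
static enum nf_action classify(const struct nf_meta *m,
                               unsigned high_wm, unsigned low_wm) {
    if (m->rx_queue_len > high_wm)
        return ACT_BACKPRESSURE;   /* throttle upstream flows  */
    if (m->sleeping && m->rx_queue_len > low_wm)
        return ACT_WAKE;           /* post this NF's semaphore */
    return ACT_NONE;
}
```
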
  27. Proposed Method Communications and Networking Lab, NTHU 27 Optimizations: Separating

    Load Estimation and CPU Allocation • It is critical that modifying `sysfs` (cgroup) entries happens outside of the packet-processing data path • The data plane (libnf) samples the packet processing time every 1 ms (lightweight) • It reads the CPU cycle counter before and after the NF's packet handler function • And stores the result in a histogram in shared memory • The NF Manager allots CPU shares • Using the median over a 100 ms moving window • Updating the weight every 10 ms (see the sketch below)
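
A small sketch of the data-plane sampling step, assuming an x86 TSC read via `__rdtsc()`; bucket sizes and structure names are illustrative:

```c
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc() */

#define HIST_BUCKETS      64     /* illustrative histogram geometry */
#define CYCLES_PER_BUCKET 100

/* Hypothetical histogram of per-packet costs kept in shared memory. */
struct cost_hist {
    uint32_t bucket[HIST_BUCKETS];
};

/* Sketch of the data-plane sampling step: read the TSC around the NF's
 * packet handler and record the cost; the NF Manager later takes the
 * median over a moving window to set the cgroup weights. */
static void process_and_sample(struct cost_hist *h,
                               void (*handler)(void *pkt), void *pkt) {
    uint64_t start = __rdtsc();
    handler(pkt);
    uint64_t cycles = __rdtsc() - start;

    uint64_t b = cycles / CYCLES_PER_BUCKET;
    if (b >= HIST_BUCKETS)
        b = HIST_BUCKETS - 1;
    h->bucket[b]++;
}
```
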
  28. Evaluation Communications and Networking Lab, NTHU 28 Environment: • CPU:

    E5-2697 • RAM: 157 GB • OS: Ubuntu (Linux 3.19.0-39-lowlatency) • Back-to-back dual-port 10 Gbps DPDK-compatible NICs • Avoids switch overhead • Schedulers • Round Robin (RR) • SCHED_NORMAL (termed NORMAL henceforth) • SCHED_BATCH (termed BATCH) • Traffic generators • MoonGen • Pktgen • Iperf3
  29. Evaluation Communications and Networking Lab, NTHU 29 Performance: NF Service

    Chain on a Single Core Service chain with 3 NFs: • NF1: 120 CPU cycles • NF2: 270 CPU cycles • NF3: 550 CPU cycles
  30. Evaluation Communications and Networking Lab, NTHU 30 Performance: NF Service

    Chain on a Single Core Service chain with 3 NFs: • NF1: 120 CPU cycles • NF2: 270 CPU cycles • NF3: 550 CPU cycles
  31. Evaluation Communications and Networking Lab, NTHU 31 Performance: NF Service

    Chain on a Single Core Service chain with 3 NFs: • NF1: 120 CPU cycles • NF2: 270 CPU cycles • NF3: 550 CPU cycles [figure note: higher than default]
  32. Evaluation Communications and Networking Lab, NTHU 32 Performance: Multi-core Scalability

    (1) Service chain with 3 NFs: pinned to 3 cores separately • NF1: 550 CPU cycles • NF2: 2200 CPU cycles • NF3: 4500 CPU cycles
  33. Evaluation Communications and Networking Lab, NTHU 33 Performance: Multi-core Scalability

    (2) Chain 1: NF1 (270 cycles) → NF2 (120 cycles) → NF4 (300 cycles) • Chain 2: NF1 (270 cycles) → NF3 (4500 cycles) → NF4 (300 cycles) • 2 chains on 4 cores
  34. Evaluation Communications and Networking Lab, NTHU 34 Performance: Multi-core Scalability

    (2) Chain 1: NF1 (270 cycles) → NF2 (120 cycles) → NF4 (300 cycles) • Chain 2: NF1 (270 cycles) → NF3 (4500 cycles) → NF4 (300 cycles)
  35. Evaluation Communications and Networking Lab, NTHU 35 Salient Features: Variable

    NF Packet Processing Cost • A service chain with 3 NFs with varying (random) per-NF costs = 9 variants • Cost 1: 120 CPU cycles • Cost 2: 270 CPU cycles • Cost 3: 550 CPU cycles • Single core [figure annotations: Current, Previous]
  36. Evaluation Communications and Networking Lab, NTHU 36 Salient Features: Service

    Chain Heterogeneity • A service chain with 3 NFs with varying per-NF costs (fixed)
  37. Evaluation Communications and Networking Lab, NTHU 37 Salient Features: Workload

    Heterogeneity • A service chain with 3 NFs of the same cost • Fully connected, 3! = 6
  38. Evaluation Communications and Networking Lab, NTHU 38 Salient Features: Performance

    Isolation • When a TCP flow shares resources with UDP flows • TCP can suffer substantial performance degradation • Since UDP has no congestion control mechanism • The authors note this is exacerbated in a software-based environment • Experiment • A 4 Gbps TCP flow through NF1, NF2 • 10 UDP flows through NF1–NF3, starting at 15 s, stopping at 40 s
  39. Evaluation Communications and Networking Lab, NTHU 39 Salient Features: Performance

    Isolation • When a TCP flow shares resources with UDP flows • TCP can suffer substantial performance degradation • Since UDP has no congestion control mechanism • The authors note this is exacerbated in a software-based environment • Experiment • A 4 Gbps TCP flow through NF1, NF2 • 10 UDP flows through NF1–NF3, starting at 15 s, stopping at 40 s [diagram: NF1 (low), NF2 (medium), NF3 (high); the TCP flow and 10 UDP flows]
  40. Evaluation Communications and Networking Lab, NTHU 40 Salient Features: Performance

    Isolation
  41. Evaluation Communications and Networking Lab, NTHU 41 Salient Features: Efficient

    I/O Handling
  42. Evaluation Communications and Networking Lab, NTHU 42 Salient Features: Supporting

    Longer NF Chain SC: Single Core MC: Multi Core
  43. Conclusion • Goal • Throughput is proportional to arrival rate

    • Improve CPU utilization • Method • Schedule NFs according to their state • Backpressure • Results • NFVnice handles most of the scenarios well Communications and Networking Lab, NTHU 43
  44. Pros & Cons n Pros n Efficient way with simple

    & existed methods n Lots of experiments in many scenarios n Cons n Some explanation is weird or lack Communications and Networking Lab, NTHU 44