Hannes Frederic Sowa on "BBR: Congestion-Based Congestion Control"

TCP congestion control has a large impact on perceived network performance (especially in terms of bandwidth and latency) and thus on the Internet as a whole. Two major categories of congestion control algorithms have been explored: those using packet-loss feedback and those using packet-delay feedback. Due to historical developments (and the development of packet-switching hardware), packet-loss-based congestion control algorithms are the ones commonly used today. We will discuss a congestion control scheme published by Google in 2017.

Papers_We_Love

January 31, 2018

Transcript

  1. BBR: Congestion-Based Congestion Control
     Review of the paper "BBR: Congestion-Based Congestion Control"
     by Hannes Frederic Sowa <[email protected]>
     Papers We Love – NYC 2017
  2. Outline
     • Motivation
     • Context
       – Historical review of congestion control
       – Overview of in-use congestion control algorithms
     • The actual paper at hand
     • Some command-line commands to get a feel for the paper
     • Conclusion and outlook
  3. Motivation
     • For networking people, maybe one of the more exciting papers this decade
       – TCP is one of the most used protocols
       – Even very small improvements to the protocol pay off a lot
       – Especially if they are noticeable by a lot of people
     • Little advancement in the state of the art of congestion control
       – Thus no new approaches have been deployed in a while
  4. Congestive Collapse
     • Happened in 1986 in the NSFNet
       – Backbone speed dropped from 32 kbit/s to 40 bit/s
     • Caused by the poor retransmission behavior of early implementations
       – The network was stuffed with retransmits
     • Led to the first implementation of congestion control
       – Designed and implemented by Van Jacobson
       – Deployed by 1988
  5. Early congestion control
     • "Congestion Avoidance and Control" – Van Jacobson and Michael Karels, 1988
     • Congestion detection based on packet loss
       – The sender slows down its sending rate upon detecting packet loss
       – Loss detection is based on the retransmission timeout (RTO) or duplicate ACK packets
     • Based on AIMD: additive increase, multiplicative decrease
  6. Buffers vs. congestion control
     • Visible to users: buffers increase the latency of networking operations when they fill up
       – and they don't drop packets → invisible to loss-based congestion control
     • Huge effect on loss-based congestion control
       – If buffers are large, the available bandwidth becomes hard to discover for loss-based congestion control
         • It tends to keep the buffers filled up and thus creates bufferbloat
         • Dramatically increases latency
       – If buffers are small and packets get dropped early, data streams slow down
         • The drop could have been caused by just a short burst rather than long-lasting congestion
  7. Current congestion control algorithms
     • CUBIC – Linux, Mac OS and (soon) Windows default congestion control algorithm
       – Loss-based
       – Uses a cubic growth function
     • Compound TCP – current Windows congestion control algorithm
       – Hybrid (TCP Reno + delay-based CC)
     • LEDBAT – Apple and BitTorrent
       – Delay-based
  8. Latency-based congestion control
     • Alternative to loss-based congestion control
     • Observes the variation of the round-trip time and estimates congestion from it
     • Unfortunately: will always be defeated by loss-based congestion control
       – Thus hardly used
       – (again, Microsoft uses a hybrid loss/latency-based congestion control depending on RTT, as does Apple for software updates)
  9. BBR: Congestion-Based Congestion Control
     • The paper at hand
     • Bottleneck Bandwidth and Round-trip propagation time
     • Developed by Neal Cardwell, Yuchung Cheng, C. Stephen Gunn, Soheil Hassas Yeganeh and Van Jacobson
     • Developed and upstreamed into the Linux kernel, with additional help from Eric Dumazet and Nandita Dukkipati, by the end of 2016
       – With further enhancements along the way
  10. Analogy: a physical pipe
     • Intermediate devices are hidden to TCP – single-pipe analogy
     • The slowest link determines the overall throughput
       – RTprop: round-trip propagation time
       – BtlBw: bottleneck bandwidth, i.e. the minimal diameter of the pipe
     • inflight = BtlBw · RTprop
       – the bandwidth-delay product (bits/s · s = bits): the maximum amount of data the network can hold
     • A queue forms at the device with the slowest link
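     A quick worked example of the bandwidth-delay product in Python, with hypothetical link numbers (100 Mbit/s bottleneck, 40 ms RTprop) that are not from the paper:

        # Hypothetical link: 100 Mbit/s bottleneck bandwidth, 40 ms RTprop.
        btlbw_bits_per_s = 100e6
        rtprop_s = 0.040

        bdp_bits = btlbw_bits_per_s * rtprop_s   # inflight = BtlBw * RTprop
        print(f"BDP = {bdp_bits:,.0f} bits = {bdp_bits / 8 / 1000:.0f} kB")
        # -> BDP = 4,000,000 bits = 500 kB; anything beyond this in flight only sits in queues.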
  11. Filling up the pipe, just not overwhelming it
     • Optimal operating point:
       – bottleneck packet arrival rate == BtlBw (rate balance equation)
       – and total data in flight == BtlBw · RTprop (full pipe equation)
     • Thus: simply measure both and send data accordingly
     • But not so fast...
  12. Naive approach to measure RTprop and BtlBw
     • RTprop approximation: a windowed minimum over recent RTT samples,
       RTprop^ = RTprop + min(η_t) = min(RTT_t)   ∀ t ∈ [T − W_R, T]
       (where RTT_t = RTprop + η_t and η_t ≥ 0 is measurement noise)
     • BtlBw approximation: a windowed maximum over recent delivery-rate samples,
       BtlBw^ = max(deliveryRate_t)   ∀ t ∈ [T − W_B, T]
       with deliveryRate = Δdelivered / Δt
     • Δdelivered can be inferred on the sender side from arriving ACKs – they announce that data has left the pipe; for Δt the stack needs to keep track of timestamps
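     A minimal Python sketch of such windowed min/max filters – illustrative only, not the kernel implementation; the class name, variable names and window lengths are made up:

        import time
        from collections import deque

        class WindowedFilter:
            """Keep (timestamp, value) samples; report the min or max over a time window."""
            def __init__(self, window_s, keep_max):
                self.window_s = window_s          # W_R or W_B
                self.keep_max = keep_max
                self.samples = deque()

            def update(self, value, now=None):
                now = time.monotonic() if now is None else now
                self.samples.append((now, value))
                # Keep only samples from the window [T - W, T].
                while now - self.samples[0][0] > self.window_s:
                    self.samples.popleft()
                best = max if self.keep_max else min
                return best(v for _, v in self.samples)

        # RTprop: running min of RTT samples; BtlBw: running max of delivery-rate samples.
        rtprop_filter = WindowedFilter(window_s=10.0, keep_max=False)
        btlbw_filter = WindowedFilter(window_s=1.0, keep_max=True)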
  13. The problem with estimating RTprop and BtlBw
     • Sending at a steady state at the optimal point doesn't form queues, but also doesn't eliminate existing ones
     • Routing changes packet travel paths and thus also changes RTprop or BtlBw
     • Furthermore: if RTprop can be observed, then BtlBw cannot
       – and vice versa: measuring RTprop requires an empty queue, while measuring BtlBw requires filling the pipe
  14. When an ACK packet is received
     • An ACK arrival provides RTT and delivery-rate samples
     • Always update the RTprop (min RTT) estimate
     • Update the BtlBw (max deliveryRate) estimate if:
       – the deliveryRate sample is at least the current maximum, or
       – the packet was not app-limited
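     A per-ACK bookkeeping sketch in Python, loosely following the paper's pseudocode and reusing the WindowedFilter objects from the slide-12 sketch; all packet/state field names are illustrative:

        def on_ack(packet, state, now):
            """Feed the RTprop min filter and the BtlBw max filter on each ACK."""
            rtt = now - packet.sendtime
            state.rtprop = rtprop_filter.update(rtt, now)      # windowed min of RTT

            state.delivered += packet.size                     # total data delivered so far
            state.delivered_time = now
            # Delivery rate over the interval since this packet left the sender.
            delivery_rate = ((state.delivered - packet.delivered) /
                             (state.delivered_time - packet.delivered_time))
            # App-limited samples understate BtlBw; only accept them if they
            # still raise the current estimate.
            if delivery_rate >= state.btlbw or not packet.app_limited:
                state.btlbw = btlbw_filter.update(delivery_rate, now)  # windowed max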
  15. When data is sent
     • Update per-packet state
       – Timestamp
       – Mark packets sent while application-limited (not considered for BtlBw estimates)
     • Packet pacing – adapt the sending rate to BtlBw (smooths out bursts): pacing_rate
       – The preferred way is installing the fair queue (fq) scheduler on the interface
       – Since Linux v4.13, TCP-internal pacing is available
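     The matching send-side sketch (again illustrative, continuing the state and packet fields from the previous snippets):

        def on_send(packet, state, now):
            """Stamp outgoing packets and pace them at pacing_gain * BtlBw."""
            packet.sendtime = now
            # Snapshot the delivery bookkeeping so on_ack() can compute a rate later.
            packet.delivered = state.delivered
            packet.delivered_time = state.delivered_time
            # Packets sent while the app had nothing queued must not drag the
            # BtlBw max filter down.
            packet.app_limited = state.app_has_no_backlog
            # Pacing: spread packets out instead of bursting a whole window.
            pacing_rate = state.pacing_gain * state.btlbw      # bytes per second
            state.next_send_time = now + packet.size / pacing_rate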
  16. Steady-state behavior
     • BBR is a sender-side-only congestion control algorithm
     • Takes BtlBw and RTprop as input and controls the adaptation and estimation of the bottleneck constraints → a control loop
     • Probing phase:
       – Cycle pacing_gain to probe for bandwidth and RTT
         • Remember: pacing normally happens at the bottleneck rate
       – Apply new measurements as soon as possible
       – Decrease the gain (< 1) to drain queues that may have built up
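     A sketch of that gain cycling in Python; the eight-phase cycle (5/4, 3/4, then six phases of 1) is described in the paper, while the surrounding plumbing is made up:

        # One RTprop at 5/4 probes for more bandwidth, one at 3/4 drains the
        # queue that probing may have created, six at 1.0 cruise at the estimate.
        PROBE_BW_GAINS = [5/4, 3/4, 1, 1, 1, 1, 1, 1]

        def probe_bw_pacing_gain(state, now):
            if now - state.cycle_stamp >= state.rtprop:        # advance once per RTprop
                state.cycle_index = (state.cycle_index + 1) % len(PROBE_BW_GAINS)
                state.cycle_stamp = now
            return PROBE_BW_GAINS[state.cycle_index]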
  17. Ramping up the connection without leaving queues behind
     • How to reach steady-state behavior and bandwidth probing?
     • Startup state:
       – Binary search for BtlBw with a pacing_gain of 2/ln 2
       – Finds BtlBw in log2(BDP) RTTs
       – But leaves up to 2 · BDP in the queues
     • Thus, after the startup state, BBR enters the drain state:
       – Uses a gain of ln 2 / 2 (the inverse of the above)
       – Empties the queues until inflight drops to the BDP
     • BBR then enters steady state and begins probing
       – Hopefully without queues!
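     The concrete gain values, with a hypothetical BDP of 64 packets to illustrate the convergence speed:

        import math

        STARTUP_GAIN = 2 / math.log(2)   # ~2.885: more than doubles the rate per RTT
        DRAIN_GAIN = math.log(2) / 2     # ~0.347: the inverse, to empty the queue

        # Hypothetical: with a BDP of 64 packets, startup finds BtlBw in about
        # log2(64) = 6 round trips, leaving up to 2 * 64 packets queued for drain.
        print(round(STARTUP_GAIN, 3), round(DRAIN_GAIN, 3), math.log2(64))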
  18. Sharing a pipe
     • The ProbeRTT state is entered when the RTT filter expires
       – That is, when RTprop hasn't been updated with a lower RTT for many seconds
     • Reduces inflight to 4 · maximum segment size for at least one round trip
     • When streams go into the ProbeRTT state, they lower the RTT for all flows on the system
       – The times at which RTprop was last updated therefore tend to coincide across flows
       – So flows tend to enter the ProbeRTT state at the same time
       – This repeats over and over, bringing the RTT measurements closer to the physical value
     • BBR thus achieves synchronization between connections
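     A sketch of the ProbeRTT trigger; the 10-second filter length, the four-packet inflight floor and the 200 ms minimum duration follow the paper's description, everything else is illustrative:

        RTPROP_FILTER_LEN_S = 10.0   # no lower RTT seen for this long -> probe
        PROBE_RTT_MIN_TIME_S = 0.2   # hold the minimal inflight for >= 200 ms

        def maybe_enter_probe_rtt(state, now):
            """Drain our own queue contribution so all flows can see the real RTprop."""
            if now - state.rtprop_stamp > RTPROP_FILTER_LEN_S:
                state.mode = "PROBE_RTT"
                state.cwnd = 4 * state.mss              # inflight floor of 4 packets
                state.probe_rtt_done_stamp = now + PROBE_RTT_MIN_TIME_S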
  19. Status of deployment
     • Google widely deployed TCP BBR inside their datacenters
       – Huge speed-up, especially for intercontinental links (high BDP)
     • Google's outbound-facing servers started to use BBR
       – e.g. YouTube now paces packets very regularly; no bursts are visible anymore
       – Improves latency in a lot of networks, especially those with huge buffers
  20. “Imperfections”
     • BBR causes high packet loss with competing BBR streams and small buffers along the path
       – BBRv2 might actually react to discovered packet loss
     • The ramp-up phase sometimes gives some flows unfair advantages
     • Issues with middleboxes
       – Stretched / delayed ACKs
       – Policing systems can trick BBR
  21. Practical uses
     • Use a recent Linux kernel
     • Optional but recommended: add a pacer to the outgoing interface
       – # tc qdisc replace dev <interface> root fq
       – $ man tc-fq
     • Enable the usage of BBR by default
       – # modprobe tcp_bbr
       – # sysctl -w net.ipv4.tcp_congestion_control=bbr
  22. Conclusion and future
     • BBR is a new idea of how to do congestion control
       – Still in development (see the recent updates from IETF 100)
       – Tries to adapt the sending speed to the sweet spot
     • Major new protocols on the horizon
       – TLSv1.3
       – HTTP/2
     • QUIC as an alternative to TCP?
       – QUIC inherits the same congestion control problems from TCP
       – Implemented in user space
       – Allows for more complicated algorithms (e.g. use of the FPU, databases, A.I.)
  23. Thanks to
     • The authors of this paper
     • The netdev@ community and all the people trying to free the Internet of bufferbloat
     • backtrace.io for inviting me to stay in New York for a month
       – (shameless plug: we are hiring)