Slide 1

Slide 1 text

BBR: Congestion-Based Congestion Control – Review of the paper "BBR: Congestion-Based Congestion Control" by Hannes Frederic Sowa – Papers We Love – NYC 2017

Slide 2

Slide 2 text

Outline ● Motivation ● Context – Historical review of congestion control – Overview of in-use congestion control algorithms ● The actual paper at hand ● Some command line commands to feel the paper ● Conclusion and outlook

Slide 3

Slide 3 text

Motivation ● For networking people maybe one of the more exciting papers this decade – TCP is one of the most used protocols – Even very small improvements to the protocol pay off a lot – Especially if they are noticeable by a lot of people ● No advancements in the state of the art of congestion control ● Thus no new approaches have been deployed for a while

Slide 4

Slide 4 text

Congestive Collapse ● Happened in 1986 on the NSFNet – Backbone speed dropped from 32 kbit/s to 40 bit/s ● Poor retransmission behavior of early implementations – Network was stuffed with retransmits ● First implementation of congestion control – Designed and implemented by Van Jacobson – Rolled out by 1988

Slide 5

Slide 5 text

Early congestion control ● Congestion Avoidance and Control – Van Jacobson and Michael Karels 1988 ● Congestion detection based on packet loss – Sender slows down its sending rate when it detects packet loss – Loss detection is based on the Retransmit Timeout (RTO) or duplicate ACK packets ● Based on AIMD: additive increase, multiplicative decrease
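The AIMD rule on this slide can be sketched in a few lines. This is a toy model, not any real TCP stack: slow start is left out, the window is counted in segments, and the names `aimd_step`, `simulate` and the fixed `capacity` threshold that triggers a loss are assumptions made for illustration.

```python
from typing import List


def aimd_step(cwnd: float, loss: bool, add: float = 1.0, mult: float = 0.5) -> float:
    """Return the congestion window (in segments) for the next round trip."""
    if loss:
        # Multiplicative decrease: cut the window, but never below one segment.
        return max(1.0, cwnd * mult)
    # Additive increase: grow by a constant amount per round trip.
    return cwnd + add


def simulate(rounds: int, capacity: float, cwnd: float = 1.0) -> List[float]:
    """Record the window per round trip.

    Assumption: a loss is detected whenever the window exceeds a fixed,
    hypothetical path capacity (real losses depend on buffers and cross traffic).
    """
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        cwnd = aimd_step(cwnd, loss=cwnd > capacity)
    return history


if __name__ == "__main__":
    print(simulate(12, capacity=8.0, cwnd=6.0))
```

The printed history shows the familiar sawtooth: the window climbs by one segment per round trip until it overshoots the capacity, then is halved.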

Slide 6

Slide 6 text

Buffers vs. congestion control ● Visible to users: buffers increase latency of networking operations if filled – and don’t drop packets → invisible to loss based congestion control ● Huge effect on loss based congestion control – If buffers are large, available bandwidth becomes hard to discover for loss based congestion control ● Tends to keep the buffers filled up and thus creates bufferbloat ● Dramatically increases latency – If buffers are small and packets get dropped early, data streams slow down ● Even though it could have been just a short burst rather than long-lasting congestion

Slide 7

Slide 7 text

Current congestion control algorithms ● CUBIC – Linux, macOS and (soon) Windows default congestion control algorithm ● Loss based ● Cubic function ● Compound-TCP – Current Windows congestion control algorithm ● Hybrid (TCP-Reno + delay based CC) ● LEDBAT – Apple and BitTorrent – Delay / ticked based

Slide 8

Slide 8 text

Latency based congestion control ● Alternative to loss based congestion control ● Observes variation of round trip time and estimates congestion from it ● Unfortunately: will always be defeated by loss based congestion control when both compete – Thus rarely used – (again, Microsoft uses a hybrid loss/latency based congestion control depending on RTT, as does Apple for updates)

Slide 9

Slide 9 text

BBR: Congestion-Based Congestion Control ● The paper at hand ● BBR: Bottleneck Bandwidth and Round-trip propagation time ● Developed by Neal Cardwell, Yuchung Cheng, C. Stephen Gunn, Soheil Hassas Yeganeh, Van Jacobson ● Developed and upstreamed into the Linux kernel with additional help of Eric Dumazet and Nandita Dukkipati by the end of 2016 – With further enhancements along the way

Slide 10

Slide 10 text

Analogy: physical pipe ● Intermediate devices are hidden from TCP – single pipe analogy ● Slowest link determines overall throughput – RTprop: round-trip propagation time – BtlBw: bottleneck bandwidth, essentially the minimal diameter of the pipe ● inflight = BtlBw · RTprop – Bandwidth-delay product (BDP): bits/s * s = bits – Maximum amount of data in flight before queues form ● Queue forms at the device with the slowest link
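A quick worked example of the bandwidth-delay product from this slide. The helper name `bdp_bytes` and the link parameters (100 Mbit/s, 40 ms, 1500-byte packets) are made-up values for illustration only.

```python
def bdp_bytes(btlbw_bits_per_s: float, rtprop_s: float) -> float:
    """Bandwidth-delay product: bits/s * s = bits, converted to bytes."""
    return btlbw_bits_per_s * rtprop_s / 8


if __name__ == "__main__":
    # Hypothetical path: 100 Mbit/s bottleneck, 40 ms round-trip propagation time.
    bdp = bdp_bytes(100e6, 0.040)
    print(f"BDP: {bdp:.0f} bytes, about {bdp / 1500:.0f} packets of 1500 bytes")
```

For this hypothetical path roughly 500 kB fit into the pipe; anything sent beyond that has to wait in a queue at the bottleneck.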

Slide 11

Slide 11 text

Source: https://cacm.acm.org/magazines/2017/2/212428-bbr-congestion-based-congestion-control/fulltext

Slide 12

Slide 12 text

Filling up the pipe, just not overwhelming it ● Optimal operating point: – bottleneck packet arrival rate == BtlBw (rate balance equation) – total data in flight == BtlBw · RTprop (full pipe equation) ● Thus simply measure both and send data accordingly ● But not so fast...

Slide 13

Slide 13 text

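Naive approach to measure RTprop and BtlBw
● RTprop approximation: $\widehat{RTprop} = RTprop + \min(\eta_t) = \min(RTT_t) \quad \forall t \in [T - W_R, T]$
– $\eta_t \ge 0$ is the noise (queueing, delayed ACKs) contained in each RTT sample, $W_R$ is the length of the measurement window
● BtlBw approximation: $\widehat{BtlBw} = \max(deliveryRate_t) \quad \forall t \in [T - W_B, T]$ with $deliveryRate = \Delta delivered / \Delta t$
– Δdelivered can be inferred on the sender side from incoming ACKs: they announce that data has left the pipe
– For Δt the stack needs to keep track of the timestamps itself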
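The two estimators above are a windowed minimum (for RTprop) and a windowed maximum (for BtlBw). A minimal sketch follows, assuming time-based windows; the class name `WindowedFilter` and the window lengths are illustrative assumptions, and real implementations use more compact filters than a full sample queue.

```python
from collections import deque


class WindowedFilter:
    """Report the best sample seen within the last `window` seconds.

    `better` is the built-in min (for RTprop) or max (for BtlBw).
    """

    def __init__(self, window: float, better):
        self.window = window
        self.better = better
        self.samples = deque()  # (timestamp, value), oldest first

    def update(self, now: float, value: float) -> float:
        self.samples.append((now, value))
        # Drop samples that have fallen out of the window [now - window, now].
        while self.samples[0][0] < now - self.window:
            self.samples.popleft()
        return self.get()

    def get(self):
        if not self.samples:
            return None
        return self.better(value for _, value in self.samples)


if __name__ == "__main__":
    rtprop = WindowedFilter(window=10.0, better=min)  # assumed 10 s window
    for t, rtt in [(0.0, 0.050), (1.0, 0.040), (5.0, 0.060), (12.0, 0.055)]:
        print(t, rtprop.update(t, rtt))
```

Note how the estimate rises again at t = 12.0: the low samples from t = 0.0 and t = 1.0 have aged out of the window, which is exactly how the estimators adapt to routing changes.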

Slide 14

Slide 14 text

The problem with estimating RTprop and BtlBw ● Sending at steady state at the optimal point doesn’t form queues but also doesn’t eliminate existing ones ● Routing changes alter packet travel paths and thus also RTprop or BtlBw ● Furthermore: if RTprop can be observed then BtlBw cannot, and vice versa – Measuring BtlBw needs a full pipe, measuring RTprop needs empty queues

Slide 15

Slide 15 text

When an ACK packet is received ● ACK arrival provides RTT and delivery rate estimates ● Always update the RTT estimate ● Update the BtlBw max estimate with the new deliveryRate if either condition holds: – deliveryRate > max_BtlBw, or – the packet was not app-limited ● Lower rates measured while app-limited are ignored, because they reflect the application, not the path
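The ACK handling can be sketched as follows. This is loosely modeled on the pseudocode in the paper, not on the Linux implementation; the classes `Packet` and `Connection`, the time-based `bw_window`, and the plain running minimum for RTprop are simplifying assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Packet:
    size: int               # bytes
    send_time: float
    delivered: int          # connection's delivered counter when this packet was sent
    delivered_time: float   # timestamp belonging to that counter
    app_limited: bool       # sender had nothing more to send at that moment


@dataclass
class Connection:
    delivered: int = 0
    delivered_time: float = 0.0
    rtprop: float = float("inf")   # running minimum (no expiry in this sketch)
    bw_window: float = 1.0         # hypothetical BtlBw filter window in seconds
    bw_samples: List[Tuple[float, float]] = field(default_factory=list)

    @property
    def btlbw(self) -> float:
        return max((rate for _, rate in self.bw_samples), default=0.0)


def on_ack(conn: Connection, pkt: Packet, now: float) -> None:
    # Always update the RTT estimate.
    conn.rtprop = min(conn.rtprop, now - pkt.send_time)

    # Delivery rate = data delivered since the packet was sent / elapsed time.
    conn.delivered += pkt.size
    conn.delivered_time = now
    elapsed = now - pkt.delivered_time
    if elapsed <= 0:
        return
    rate = (conn.delivered - pkt.delivered) / elapsed

    # Expire old samples, then apply the "higher, or not app-limited" rule.
    conn.bw_samples = [(t, r) for t, r in conn.bw_samples if t >= now - conn.bw_window]
    if rate > conn.btlbw or not pkt.app_limited:
        conn.bw_samples.append((now, rate))
```

The windowed filter is what makes the rule matter: a low sample from a non-app-limited packet enters the window and can become the estimate once older, higher samples expire, while a low app-limited sample is dropped outright.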

Slide 16

Slide 16 text

When data is sent ● Update packet state – Timestamp – Mark packets if application limited (not considered for BtlBw estimates) ● Packet pacing – adapt sending rate to BtlBw (smooths out bursts): pacing_rate ● Preferred way is installing the fq (fair queue) scheduler on the interface ● Since Linux v4.13, TCP-internal pacing is available
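Pacing boils down to spacing packets so that they leave at pacing_gain · BtlBw, and holding back when too much data is already in flight. A minimal sketch under stated assumptions: the function names are illustrative, and the default `cwnd_gain` of 2.0 is an assumed value, not taken from the slides.

```python
def may_send(inflight_bytes: float, btlbw: float, rtprop: float,
             cwnd_gain: float = 2.0) -> bool:
    """Allow sending only while inflight stays below cwnd_gain * BDP.

    btlbw is in bytes/s, rtprop in seconds.
    """
    bdp = btlbw * rtprop
    return inflight_bytes < cwnd_gain * bdp


def next_send_time(now: float, packet_size: int, btlbw: float,
                   pacing_gain: float = 1.0) -> float:
    """Earliest time the next packet may leave, given the current pacing rate."""
    return now + packet_size / (pacing_gain * btlbw)


if __name__ == "__main__":
    # Hypothetical path: 1.5 MB/s bottleneck, 1500-byte packets.
    print(next_send_time(0.0, 1500, 1_500_000))
```

With these made-up numbers one packet leaves every millisecond instead of a whole window being dumped onto the wire at once, which is the job the fq scheduler performs when it is installed on the interface.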

Slide 17

Slide 17 text

Steady-state behavior ● BBR is a sender-side only congestion control algorithm ● Takes BtlBw and RTprop as input and controls adaptation and estimation of bottleneck constraints → control loop ● Probing phase: – Cycle pacing_gain to probe for bandwidth and RTT ● Remember: pacing happens at the bottleneck rate – Apply new measurements as soon as possible – Decrease gain (< 1) to drain queues that may have built up
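The gain cycling can be illustrated with the eight-phase cycle described in the paper (5/4, 3/4, then six phases at 1); the slide itself does not list the values. The names `PROBE_BW_GAINS` and `pacing_rates` are illustrative, and the sketch assumes BtlBw stays constant while cycling.

```python
from itertools import cycle, islice
from typing import List

# Probe up, drain the queue the probe may have created, then cruise.
PROBE_BW_GAINS = (1.25, 0.75, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)


def pacing_rates(btlbw: float, phases: int) -> List[float]:
    """Pacing rate for each of the next `phases` phases (one phase is about one RTprop)."""
    return [gain * btlbw for gain in islice(cycle(PROBE_BW_GAINS), phases)]


if __name__ == "__main__":
    print(pacing_rates(100.0, 10))
```

If the path has more bandwidth available, the 1.25 phase produces a higher delivery rate sample and the BtlBw estimate rises; if not, the 0.75 phase removes the queue that the probe left behind.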

Slide 18

Slide 18 text

Source: https://cacm.acm.org/magazines/2017/2/212428-bbr-congestion-based-congestion-control/fulltext

Slide 19

Slide 19 text

Source: https://cacm.acm.org/magazines/2017/2/212428-bbr-congestion-based-congestion-control/fulltext

Slide 20

Slide 20 text

Ramping up the connection without leaving queues behind ● How to reach steady-state behavior and bandwidth probing? ● Startup state: – Binary search for BtlBw with a pacing_gain of 2/ln2 – Finds BtlBw in log2(BDP) RTTs – But leaves up to 2 * BDP in the queues ● Thus, after the startup state, BBR enters the drain state: – Uses a gain of ln2/2 (inverse of the above) – Empties queues until inflight drops to the BDP ● BBR enters steady state and begins probing ● Hopefully without queues!
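The startup numbers are easy to check. The sketch below computes both gains and counts how many round trips a doubling sender needs to reach a given BDP; the function name `rtts_to_fill` and the initial window of 10 packets are assumptions for illustration.

```python
import math

STARTUP_GAIN = 2 / math.log(2)   # roughly 2.89
DRAIN_GAIN = 1 / STARTUP_GAIN    # ln2/2, roughly 0.35


def rtts_to_fill(bdp_packets: int, initial_packets: int = 10) -> int:
    """Round trips until a sender that doubles per round trip reaches the BDP."""
    rounds = 0
    window = initial_packets
    while window < bdp_packets:
        window *= 2
        rounds += 1
    return rounds


if __name__ == "__main__":
    print(f"startup gain {STARTUP_GAIN:.3f}, drain gain {DRAIN_GAIN:.3f}")
    # Hypothetical path with a BDP of 333 packets.
    print(rtts_to_fill(333), "round trips")
```

The logarithmic growth is why startup is cheap even on high-BDP paths: doubling the BDP costs only one more round trip.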

Slide 21

Slide 21 text

Source: https://cacm.acm.org/magazines/2017/2/212428-bbr-congestion-based-congestion-control/fulltext

Slide 22

Slide 22 text

Sharing a pipe ● ProbeRTT state is entered when the RTT filter expires – That is, when RTprop hasn’t been updated with a lower RTT for many seconds ● Reduces inflight to 4 * maximum segment size for at least one round trip ● When streams go into ProbeRTT state, they lower the RTT seen by all flows sharing the bottleneck – Thus the timestamp of the last RTT update ends up the same for all of these flows – They tend to go into ProbeRTT state at the same time – This repeats and repeats, bringing RTT measurements closer to the physical value ● BBR achieves synchronization between connections
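The expiry logic behind ProbeRTT can be sketched as a small tracker. The slide only says "many seconds"; the 10-second interval used here, like the class and method names, is an assumption for illustration.

```python
PROBE_RTT_INTERVAL = 10.0    # seconds; assumed value for "many seconds"
PROBE_RTT_CWND_PACKETS = 4   # inflight cap while in ProbeRTT


class RtpropTracker:
    def __init__(self) -> None:
        self.rtprop = float("inf")
        self.stamp = 0.0   # time of the last RTprop update

    def on_rtt_sample(self, now: float, rtt: float) -> bool:
        """Feed one RTT sample; return True if the sender should enter ProbeRTT."""
        expired = now - self.stamp > PROBE_RTT_INTERVAL
        if rtt < self.rtprop or expired:
            # A lower RTT, or any RTT once the old estimate is stale, becomes the estimate.
            self.rtprop = rtt
            self.stamp = now
        return expired


if __name__ == "__main__":
    tracker = RtpropTracker()
    for t, rtt in [(0.0, 0.050), (5.0, 0.060), (10.5, 0.070), (11.0, 0.040)]:
        print(t, tracker.on_rtt_sample(t, rtt), tracker.rtprop)
```

The synchronization on the slide follows from the shared stamp: when one flow drains the queue, every flow on the bottleneck sees a lower RTT at about the same moment, so all their filters restart together and expire together.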

Slide 23

Slide 23 text

Source: https://cacm.acm.org/magazines/2017/2/212428-bbr-congestion-based-congestion-control/fulltext

Slide 24

Slide 24 text

Status of Deployment ● Google widely deployed TCP-BBR inside their datacenters – Huge speedup especially for intercontinental links (high BDP) ● Google’s outbound facing servers started to use BBR – e.g. YouTube now paces packets very regularly, no bursts visible anymore – Improves latency in a lot of networks, especially those with huge buffers

Slide 25

Slide 25 text

“Imperfections” ● BBR causes high packet loss with competing BBR streams and small buffers along the path – BBRv2 might actually react to discovered packet loss ● Ramp-Up phase sometimes gives some flows unfair advantages ● Issues with middle boxes – Stretched / delayed ACKs – Policing systems can trick BBR

Slide 26

Slide 26 text

Practical uses ● Use a recent Linux kernel ● Optional but recommended: add a pacer to the outgoing interface – # tc qdisc replace dev <interface> root fq – $ man tc-fq ● Enable the usage of bbr by default – # modprobe tcp_bbr – # sysctl -w net.ipv4.tcp_congestion_control=bbr
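Besides the system-wide sysctl, Linux also lets an application pick the congestion control per socket. A minimal sketch, assuming a Linux host with Python 3.6 or newer; the helper names are illustrative, and `socket_with_bbr` raises `OSError` if the tcp_bbr module is not available to the process.

```python
import socket
from pathlib import Path
from typing import List

SYSCTL_DIR = Path("/proc/sys/net/ipv4")


def parse_algorithms(text: str) -> List[str]:
    """Split the space-separated list the kernel exposes."""
    return text.split()


def bbr_available(sysctl_dir: Path = SYSCTL_DIR) -> bool:
    """Check whether bbr is among the currently available algorithms."""
    try:
        text = (sysctl_dir / "tcp_available_congestion_control").read_text()
    except OSError:
        return False  # not Linux, or /proc not readable
    return "bbr" in parse_algorithms(text)


def socket_with_bbr() -> socket.socket:
    """Create a TCP socket that uses bbr regardless of the system default."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_CONGESTION is Linux-specific.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, b"bbr")
    return sock


if __name__ == "__main__":
    print("bbr available:", bbr_available())
```

The per-socket option is handy for trying BBR on a single service before changing the default for the whole machine.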

Slide 27

Slide 27 text

Conclusion and future ● BBR is a new idea of how to do congestion control – Still in development (see recent updates from IETF100) – Tries to adapt the sending speed to the sweet spot ● Major new protocols on the horizon – TLSv1.3 – HTTP/2 ● QUIC as alternative to TCP? – QUIC inherits the same problems from TCP – Implemented in user space – Allows for more complicated algorithms (e.g. FPU, databases, A.I.)

Slide 28

Slide 28 text

Thanks to ● The authors of this paper ● The netdev@ community and all the people trying to free the Internet of bufferbloat ● backtrace.io for inviting me to stay in New York for a month – (shameless plug: we are hiring)