Slide 1

Slide 1 text

A Server-to-Server View of the Internet
Bala, Georgios, Arthur, Matthew, KC

Slide 2

Slide 2 text

Content moved closer to end users to reduce latency. Connections from end users are terminated at CDN servers close to the end users.
[Diagram: End Users, CDN Servers, Internet’s Core, Origin Servers]

Slide 3

Slide 3 text

Viewing the Internet’s core from the distributed measurement platform of a CDN.

Slide 4

Slide 4 text

Back-Office Web traffic accounts for a significant fraction of core Internet traffic (Pujol et al., IMC, Nov. 2014).
End-user experience is at the mercy of the unreliable Internet and its middle-mile bottlenecks (T. Leighton, CACM, Vol. 52, No. 2, Feb. 2009).
[Diagram: End Users, CDN Servers, Internet’s Core, Origin Servers]

Slide 5

Slide 5 text

[Figure: RTT (in ms) over IPv4 and IPv6, Jan through Jul]
A six-month timeline of RTTs between servers in Hong Kong, HK and Tokyo, JP.

Slide 6

Slide 6 text

[Figure: RTT (in ms) over IPv4 and IPv6, Jan through Jul]
A six-month timeline of RTTs between servers in Hong Kong, HK and Tokyo, JP.

Slide 7

Slide 7 text

[Figure: RTT (in ms) over IPv4 and IPv6, Jan through Jul]
A six-month timeline of RTTs between servers in Hong Kong, HK and Tokyo, JP.
Level shifts in RTTs over both IPv4 and IPv6.

Slide 8

Slide 8 text

[Figure: RTT (in ms) over IPv4 and IPv6, Jan through Jul]
To what extent do changes in the AS path affect round-trip times?

Slide 9

Slide 9 text

[Figure: RTT (in ms) over IPv4 and IPv6, 03/26 through 04/02, with night and day periods marked]
A portion of the timeline of RTTs between servers in Hong Kong, HK and Tokyo, JP.
Daily oscillations in RTT between the servers.

Slide 10

Slide 10 text

[Figure: RTT (in ms) over IPv4 and IPv6, 03/26 through 04/02, with night and day periods marked]
How common are periods of daily oscillation in RTT, and where do they occur?

Slide 11

Slide 11 text

[Figure: RTT (in ms) over IPv4 and IPv6, Jan through Jul]
What affects end-to-end RTTs more: routing or congestion?

Slide 12

Slide 12 text

[Figure: RTT (in ms) over IPv4 and IPv6, Jan through Jul]
How do IPv4 and IPv6 compare with respect to routing and performance?

Slide 13

Slide 13 text

1. To what extent do changes in the AS path affect round-trip times?
2. How common are periods of daily oscillation in RTT, and where do they occur?

Slide 14

Slide 14 text

Effect of routing changes on end-to-end RTTs

Slide 15

Slide 15 text

Data Set: Long Term
• ≈600 dual-stacked servers in 70 different countries.
‣ US, AU, DE, IN, JP, …

Slide 16

Slide 16 text

[Diagram: servers A and B along a time axis]
Traceroutes conducted between servers in both directions over both protocols.

Slide 17

Slide 17 text

[Diagram: A–B traceroutes repeated at 3-hour intervals along a time axis]
Every 3 hours, traceroutes are done over the full mesh. All traceroutes in a given 3-hour time frame have the same timestamp.

Slide 18

Slide 18 text

A–B Trace Timeline
[Diagram: A–B traceroutes repeated along a time axis]
Traceroutes over the full mesh every 3 hours for 16 months, from Jan. 2014 through Apr. 2015.
≈700M IPv4 and ≈600M IPv6 traceroutes.
The trace timeline Sa ➝ Sb is different from Sb ➝ Sa.

Slide 19

Slide 19 text

[Diagram: a traceroute from A to B; example AS path AS1-AS2-AS3-AS4 with an RTT of 22.3 ms]
• Extract two pieces of information from each traceroute
‣ AS path inferred from interfaces in the traceroute output
‣ end-to-end RTT between the two servers
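The extraction step above could be sketched roughly as follows. This is only an illustration: the prefix-to-AS table, the addresses, the per-hop RTTs, and the function names are all hypothetical, and the slides do not say how interfaces were actually mapped to ASes.

```python
import ipaddress

# Hypothetical prefix-to-AS table (documentation prefixes); a real study
# would rely on BGP-derived mappings instead.
PREFIX_TO_AS = {
    "192.0.2.0/25": "AS1",
    "192.0.2.128/25": "AS2",
    "198.51.100.0/24": "AS3",
    "203.0.113.0/24": "AS4",
}

def ip_to_as(ip):
    """Return the AS of the longest matching prefix, or None if unmapped."""
    addr = ipaddress.ip_address(ip)
    best = None
    for prefix, asn in PREFIX_TO_AS.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, asn)
    return best[1] if best else None

def summarize_traceroute(hops):
    """hops: list of (interface_ip, rtt_ms); the last hop is the destination.

    Returns (as_path, end_to_end_rtt), collapsing consecutive hops that
    fall in the same AS and skipping hops that cannot be mapped.
    """
    as_path = []
    for ip, _rtt in hops:
        asn = ip_to_as(ip)
        if asn is not None and (not as_path or as_path[-1] != asn):
            as_path.append(asn)
    return "-".join(as_path), hops[-1][1]

# Made-up traceroute whose hops fall in AS1, AS2, AS3, AS3, AS4.
hops = [("192.0.2.1", 0.4), ("192.0.2.200", 5.1), ("198.51.100.7", 12.8),
        ("198.51.100.9", 13.0), ("203.0.113.5", 22.3)]
result = summarize_traceroute(hops)
```

Each traceroute is thus reduced to one (AS path, end-to-end RTT) tuple, which is the unit the trace timelines are built from.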

Slide 20

Slide 20 text

A–B trace timeline: (AS-path, end-to-end RTT) tuples spanning the study period, in time order
(AS1-AS2-AS3-AS4, 22.3)
(AS1-AS2-AS3-AS4, 29.7)
(AS1-AS2-AS3-AS4, 23.1)
(AS1-AS5-AS9-AS4, 18.2)
(AS1-AS5-AS9-AS4, 17.9)

Slide 21

Slide 21 text

[Diagram: A–B trace timeline of (AS-path, end-to-end RTT) tuples]
Popular AS path observed in the A–B trace timeline: AS1-AS2-AS3-AS4, with prevalence 60%.
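As a minimal sketch, the prevalence of the popular path can be computed as the fraction of traceroutes in the timeline that observed it. This assumes prevalence is counted per traceroute; the function name is illustrative, and the data is the five-tuple example from the slide.

```python
from collections import Counter

# The example A–B trace timeline: (AS path, end-to-end RTT in ms).
timeline = [
    ("AS1-AS2-AS3-AS4", 22.3),
    ("AS1-AS2-AS3-AS4", 29.7),
    ("AS1-AS2-AS3-AS4", 23.1),
    ("AS1-AS5-AS9-AS4", 18.2),
    ("AS1-AS5-AS9-AS4", 17.9),
]

def popular_path(timeline):
    """Return the most frequently observed AS path and its prevalence."""
    counts = Counter(path for path, _rtt in timeline)
    path, n = counts.most_common(1)[0]
    return path, n / len(timeline)

path, prevalence = popular_path(timeline)
```

Here three of the five traceroutes observed AS1-AS2-AS3-AS4, which gives the 60% shown on the slide.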

Slide 22

Slide 22 text

[Figure: ECDF of the prevalence of popular AS paths, IPv4 and IPv6]
AS path prevalence (Vern Paxson, IEEE/ACM Transactions on Networking, 1997)

Slide 23

Slide 23 text

[Figure: ECDF of the prevalence of popular AS paths, IPv4 and IPv6]
Most trace timelines had one dominant AS path; in 80% of them, the dominant path was in use for at least half of the study period.

Slide 24

Slide 24 text

[Diagram: A–B trace timeline of (AS-path, end-to-end RTT) tuples]
Number of AS-path changes observed in the A–B trace timeline
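One straightforward way to count AS-path changes, assuming a change is any pair of consecutive traceroutes in a timeline that report different AS paths (the function name is illustrative):

```python
def count_path_changes(paths):
    """Count positions where the AS path differs from the previous one."""
    return sum(1 for prev, cur in zip(paths, paths[1:]) if prev != cur)

# AS paths of the example A–B trace timeline, in time order.
paths = [
    "AS1-AS2-AS3-AS4",
    "AS1-AS2-AS3-AS4",
    "AS1-AS2-AS3-AS4",
    "AS1-AS5-AS9-AS4",
    "AS1-AS5-AS9-AS4",
]
changes = count_path_changes(paths)
```

The example timeline switches paths once, between the third and fourth traceroutes.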

Slide 25

Slide 25 text

[Figure: ECDF of the number of changes per trace timeline (log scale), IPv4 and IPv6]
80% of the trace timelines experienced 20 or fewer changes over the course of 16 months.
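A statement like the one above is a single point read off the ECDF. The sketch below shows that reading; the per-timeline change counts are invented purely for illustration and are not measurements from the study.

```python
def ecdf_at(values, x):
    """Empirical CDF at x: the fraction of values less than or equal to x."""
    return sum(1 for v in values if v <= x) / len(values)

# Hypothetical number of AS-path changes for ten trace timelines.
changes_per_timeline = [0, 1, 2, 3, 5, 8, 12, 20, 45, 300]
fraction = ecdf_at(changes_per_timeline, 20)
```

With these invented counts, eight of the ten timelines have 20 or fewer changes, so the ECDF evaluates to 0.8 at x = 20.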

Slide 26

Slide 26 text

How do the AS-path changes affect the baseline RTT of server-to-server paths?

Slide 27

Slide 27 text

[Diagram: A–B trace timeline of (AS-path, end-to-end RTT) tuples]
Group RTTs by AS path. Baseline: 10th percentile of the RTTs of each AS path (bucket).

Slide 28

Slide 28 text

[Diagram: A–B trace timeline of (AS-path, end-to-end RTT) tuples]
Optimal path: the path with the lowest baseline.
Optimal: AS1-AS5-AS9-AS4
Sub-optimal: AS1-AS2-AS3-AS4

Slide 29

Slide 29 text

[Diagram: A–B trace timeline of (AS-path, end-to-end RTT) tuples]
The sub-optimal path, with a prevalence of 60%, has a baseline about 4.5 ms higher in end-to-end RTT than the optimal path.
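The grouping, baseline, and optimal-path steps can be sketched as follows on the example timeline. The percentile here uses linear interpolation between order statistics, which is an assumption: the slides do not state how the 10th percentile was computed, and other conventions give slightly different values on such a small sample. All function names are illustrative.

```python
from collections import defaultdict

def percentile(values, q):
    """q-th percentile (0-100), interpolating linearly between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def baselines(timeline, q=10):
    """Group RTTs by AS path and return each path's q-th percentile RTT."""
    buckets = defaultdict(list)
    for path, rtt in timeline:
        buckets[path].append(rtt)
    return {path: percentile(rtts, q) for path, rtts in buckets.items()}

timeline = [
    ("AS1-AS2-AS3-AS4", 22.3),
    ("AS1-AS2-AS3-AS4", 29.7),
    ("AS1-AS2-AS3-AS4", 23.1),
    ("AS1-AS5-AS9-AS4", 18.2),
    ("AS1-AS5-AS9-AS4", 17.9),
]

base = baselines(timeline)
optimal = min(base, key=base.get)  # path with the lowest baseline
increase = {path: b - base[optimal] for path, b in base.items()}
```

Under this interpolation the baselines come out to 22.46 ms and 17.93 ms, so the sub-optimal path sits about 4.5 ms above the optimal one.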

Slide 30

Slide 30 text

[Figure: fraction of trace timelines vs. prevalence of sub-optimal AS paths, for RTT increases >= 20 ms, >= 50 ms, and >= 100 ms, over IPv4 and IPv6]
Typically a routing change causes only a small change in RTT.

Slide 31

Slide 31 text

[Figure: fraction of trace timelines vs. prevalence of sub-optimal AS paths, for RTT increases >= 20 ms, >= 50 ms, and >= 100 ms, over IPv4 and IPv6]
But for a minority of cases, the change can be significant.
For 10% of trace timelines over IPv4, the (sub-optimal) AS paths that led to at least a 20 ms increase in RTT had a prevalence of at least 30%.

Slide 32

Slide 32 text

Effect of periods of daily oscillation on end-to-end RTTs

Slide 33

Slide 33 text

Data Set: Short Term
• ≈3,500 server clusters in 1,000 locations in 100 different countries.

Slide 34

Slide 34 text

Ping measurements every 15 minutes for one week, from Feb. 22, 2015 through Feb. 28, 2015.
≈2.9M IPv4 and ≈1M IPv6 server pairs.
Based on Time Sequence Latency Probes by Luckie et al., IMC 2014.
• Ping measurements over the full mesh
• Use FFT to select congestion candidates
• Perform traceroute campaigns
• Infer location of congestion
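The frequency-domain selection step could look roughly like the sketch below, which measures how much of a week-long RTT series' spectral power sits at one cycle per day. Several things here are assumptions rather than the study's method: a naive DFT stands in for an FFT to keep the example dependency-free, the function name is illustrative, the series are synthetic, and any cutoff on the resulting fraction would be a hypothetical choice.

```python
import cmath
import math

SAMPLES_PER_DAY = 96  # one ping every 15 minutes

def diurnal_power_fraction(rtts):
    """Fraction of non-DC spectral power at the one-cycle-per-day frequency."""
    n = len(rtts)
    mean = sum(rtts) / n
    centered = [r - mean for r in rtts]  # remove the DC component

    def power(k):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        return abs(coeff) ** 2

    total = sum(power(k) for k in range(1, n // 2 + 1))
    if total == 0:
        return 0.0
    daily_bin = n // SAMPLES_PER_DAY  # number of whole days in the series
    return power(daily_bin) / total

# Synthetic one-week series (672 samples), not measured data.
n = 7 * SAMPLES_PER_DAY
diurnal = [50 + 15 * math.sin(2 * math.pi * t / SAMPLES_PER_DAY) for t in range(n)]
flat = [50.0] * n

diurnal_score = diurnal_power_fraction(diurnal)
flat_score = diurnal_power_fraction(flat)
```

A server pair whose score is close to 1 shows a strong daily oscillation and would be a natural congestion candidate for the follow-up traceroute campaign.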

Slide 35

Slide 35 text

[Diagram: end-to-end RTT between servers A and B]

Slide 36

Slide 36 text

[Diagram: path from A to B split into segments 1, 2, and 3, alongside the end-to-end RTT]
Identify the first segment with high correlation with the end-to-end RTT.
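One way to read this step is sketched below, under stated assumptions: each segment has an RTT time series sampled at the same instants as the end-to-end RTT, Pearson correlation is the similarity measure, and 0.8 is a hypothetical cutoff for "high". The series and function names are made up for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def first_correlated_segment(segment_rtts, end_to_end, threshold=0.8):
    """Return the 1-based index of the first segment whose RTT series
    correlates with the end-to-end RTT at or above threshold, else None."""
    for i, series in enumerate(segment_rtts, start=1):
        if pearson(series, end_to_end) >= threshold:
            return i
    return None

# Made-up RTT series (ms), sampled at the same six instants.
end_to_end = [40, 41, 60, 62, 41, 40]
segment_rtts = [
    [5, 5, 5, 5, 5, 5],        # segment 1: flat
    [12, 11, 12, 11, 12, 11],  # segment 2: jitter unrelated to the oscillation
    [30, 31, 50, 52, 31, 30],  # segment 3: follows the end-to-end RTT
]
segment = first_correlated_segment(segment_rtts, end_to_end)
```

In this toy example the RTT increase first becomes visible at segment 3, so that is where the congestion would be placed.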

Slide 37

Slide 37 text

Highlights
• 3155 links were congested in our study of IPv4 traceroutes. 1768 internal & 1121 interconnection links.
• Weighting links by the number of server-to-server paths that cross them … interconnection links are more popular!
• A large majority of the interconnection links with congestion were private interconnects.

Slide 38

Slide 38 text

[Figure: density vs. msec for all interconnection, all internal, US−US interconnection, and US−US internal links]
Typical overhead due to congestion is 20-30 ms.

Slide 39

Slide 39 text

[Figure: density vs. msec for all interconnection, all internal, US−US interconnection, and US−US internal links]
Values between 20 and 30 ms: in the US, they account for 90% of the density; in Europe & Asia, for 30% of the density.

Slide 40

Slide 40 text

[Figure: density vs. msec for all interconnection, all internal, US−US interconnection, and US−US internal links]
Transcontinental links in Europe & Asia.

Slide 41

Slide 41 text

Routing changes typically do not affect end-to-end RTTs. Congestion is not the norm.

Slide 42

Slide 42 text

What about non-typical cases?

Slide 43

Slide 43 text

Congestion
Only 2% of the server pairs over IPv4, and just 0.6% over IPv6, experience a strong diurnal pattern with an increase in RTT of at least 10 ms.
Routing
For 10% of server pairs, the (sub-optimal) AS paths that led to a 20 ms increase in RTTs were in use for at least 30% of the study period for IPv4 & 50% for IPv6.

Slide 44

Slide 44 text

Congestion
Only 2% of the server pairs over IPv4, and just 0.6% over IPv6, experience a strong diurnal pattern with an increase in RTT of at least 10 ms.
Routing
For 10% of trace timelines, the (sub-optimal) AS paths that led to at least a 20 ms increase in RTTs were in use for at least 30% of the study period for IPv4 & 50% for IPv6.

Slide 45

Slide 45 text

- Focus on bandwidth
- No packet loss measurements; platform limitations
- Explore IPv4 & IPv6 infrastructure sharing

Slide 46

Slide 46 text


Slide 47

Slide 47 text


Slide 48

Slide 48 text

Use measurements over paths between CDN servers to understand the state of the Internet core.
[Diagram: End Users, CDN Servers, Internet’s Core, Origin Servers]

Slide 49

Slide 49 text

[Diagram: A–B trace timeline of (AS-path, end-to-end RTT) tuples]
Number of unique AS paths observed in the A–B trace timeline: two (AS1-AS2-AS3-AS4 and AS1-AS5-AS9-AS4).
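A short sketch of this count on the example timeline, keeping the paths in first-seen order (the variable names are illustrative):

```python
# AS paths of the example A–B trace timeline, in time order.
paths = [
    "AS1-AS2-AS3-AS4",
    "AS1-AS2-AS3-AS4",
    "AS1-AS2-AS3-AS4",
    "AS1-AS5-AS9-AS4",
    "AS1-AS5-AS9-AS4",
]

# dict.fromkeys drops repeats while preserving insertion order.
unique_paths = list(dict.fromkeys(paths))
num_unique = len(unique_paths)
```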

Slide 50

Slide 50 text

[Figure: ECDF of the number of AS paths per trace timeline (log scale), IPv4 and IPv6]
80% of trace timelines have 5 or fewer AS paths in IPv4, and 6 or fewer in IPv6.

Slide 51

Slide 51 text

[Figure: ECDF of the number of AS paths per trace timeline (log scale), IPv4 and IPv6]
80% of trace timelines have 5 or fewer AS paths in IPv4, and 6 or fewer in IPv6.

Slide 52

Slide 52 text

[Figure: ECDF of the number of AS paths per trace timeline (log scale), IPv4 and IPv6]
80% of trace timelines have 5 or fewer AS paths in IPv4, and 6 or fewer in IPv6.

Slide 53

Slide 53 text

[Diagram: trace timeline of (AS-path, end-to-end RTT) tuples in the forward direction]
Combine AS paths observed in the forward direction with ...

Slide 54

Slide 54 text

[Diagram: forward and reverse trace timelines of (AS-path, end-to-end RTT) tuples along a shared time axis]
Combine AS paths observed in the forward direction with those in the reverse direction.
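Since all traceroutes in a 3-hour time frame share one timestamp, one plausible way to pair the two directions is to join them on that timestamp. The sketch below assumes exactly that; the timelines, timestamps, and function name are hypothetical.

```python
def path_pairs(forward, reverse):
    """forward, reverse: dicts mapping a timestamp to the AS path seen then.

    Returns the set of distinct (forward path, reverse path) pairs over the
    timestamps present in both directions.
    """
    shared = forward.keys() & reverse.keys()
    return {(forward[t], reverse[t]) for t in shared}

# Hypothetical timelines keyed by hours since the start of the study.
forward = {0: "AS1-AS2-AS3-AS4", 3: "AS1-AS2-AS3-AS4", 6: "AS1-AS5-AS9-AS4"}
reverse = {0: "AS4-AS3-AS8-AS1", 3: "AS4-AS3-AS8-AS1", 6: "AS4-AS3-AS8-AS1"}

pairs = path_pairs(forward, reverse)
num_pairs = len(pairs)
```

The number of distinct pairs per server pair is the quantity whose distribution the ECDF of AS-path pairs describes.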

Slide 55

Slide 55 text

[Figure: ECDF of the number of AS-path pairs per server pair (log scale), IPv4 and IPv6]
Pairing AS paths in the forward & reverse directions still reveals 80% of server pairs to have 8 or fewer path pairs in IPv4, and 9 or fewer in IPv6.

Slide 56

Slide 56 text

[Figure: ECDF of the number of AS-path pairs per server pair (log scale), IPv4 and IPv6]
Pairing AS paths in the forward & reverse directions still reveals 80% of server pairs to have 8 or fewer path pairs in IPv4, and 9 or fewer in IPv6.

Slide 57

Slide 57 text

[Figure: ECDF of the number of AS-path pairs per server pair (log scale), IPv4 and IPv6]
Pairing AS paths in the forward & reverse directions still reveals 80% of server pairs to have 8 or fewer path pairs in IPv4, and 9 or fewer in IPv6.

Slide 58

Slide 58 text

[Figure: ECDF of the difference in RTT (in ms), RTTv4 - RTTv6; series: All, Same AS-paths]

Slide 59

Slide 59 text

Comparing magnitudes of increase in (baseline) 10th percentile of RTTs of AS paths (each relative to the best AS path of the corresponding trace timeline) with the lifetime of AS paths …

Slide 60

Slide 60 text

X-axis: deciles of the distribution of AS-path lifetimes.
The half-open interval [0.0, 3.0h) has no data points: the 0th and 10th percentiles of the AS-path lifetime distribution have the same value.

Slide 61

Slide 61 text

Y-axis: deciles of the distribution of magnitudes of increase in 10th percentile of RTTs of AS paths (each relative to the best AS path of the corresponding trace timeline).

Slide 62

Slide 62 text

Baseline RTTs of AS paths with longer lifetimes are close in value to those of the best AS paths of the corresponding trace timelines.

Slide 63

Slide 63 text

Paths with poor performance are often those with relatively short lifetimes.

Slide 64

Slide 64 text

Similar observations from IPv6 traceroutes.

Slide 65

Slide 65 text
