Networking @Scale'19 - Getting a Taste of Your Network - Sergey Fedorov
Sergey Fedorov, Senior Software Engineer at Netflix, describes a client-side network measurement system called "Probnik", and how it can be used to improve performance, reliability and control of client-server network interactions.
Diagram. Events: OS release, DNS issue, last mile network issue, Internet congestion, route leak, AWS outage, AWS microservice release, ...
Diagram. Components: CDN, API Acceleration, Private Backbone, Video, DNS, Small assets, Client
DETECTING NETWORK ISSUES
Beyond HTTP Reachability: Auth DNS Availability
  name: DNS test
  targets:
    probe.dnsA.me/probe
    probe.dnsB.me/probe
    probe.dnsC.me/probe
Resolution: probe.dnsA.me -> 1.2.3.4, probe.dnsB.me -> 1.2.3.4, probe.dnsC.me -> 1.2.3.4
Results: Auth DNS A: OK / FAIL, Auth DNS B: OK / FAIL, Auth DNS C: OK / FAIL
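Because all three hostnames resolve to the same address (1.2.3.4), the HTTP leg of each probe is identical and only the authoritative DNS provider differs. The sketch below shows one way such results might be interpreted. The function name `classify_dns`, the result shape (hostname mapped to a success flag), and the "INCONCLUSIVE" label are assumptions made for illustration; they are not taken from the talk or from Probnik.

```python
from typing import Dict


def classify_dns(results: Dict[str, bool]) -> Dict[str, str]:
    """Interpret per-target outcomes of the DNS test (hypothetical helper).

    Every target resolves to the same IP, so when only some targets fail,
    the failures point at the authoritative DNS provider behind that
    hostname rather than at the server or the path to it.
    """
    if not results:
        raise ValueError("no probe results to classify")
    if not any(results.values()):
        # Assumption: if nothing succeeded, the shared server or network
        # path may be at fault, so DNS cannot be singled out.
        return {host: "INCONCLUSIVE" for host in results}
    return {host: "OK" if ok else "FAIL" for host, ok in results.items()}


if __name__ == "__main__":
    sample = {
        "probe.dnsA.me": True,
        "probe.dnsB.me": False,
        "probe.dnsC.me": True,
    }
    for host, verdict in classify_dns(sample).items():
        print(host, verdict)
```

With the sample input, only probe.dnsB.me is reported as FAIL, which would suggest an issue with that hostname's authoritative DNS rather than with the server at 1.2.3.4.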
Cloud Region Connectivity
  type: HTTP GET
  name: Cloud region test
  targets:
    us-east.test.me/probe
    us-west.test.me/probe
    eu-west.test.me/probe
Results (as labeled on the slide): Auth DNS A: OK / FAIL, Auth DNS B: OK / FAIL, Auth DNS C: OK / FAIL
REMEDIATION
4 Network Paths to Reach the Cloud
  type: HTTP GET
  name: steering test
  targets:
    cloud.test.me/probe
    ix-cloud.test.me/probe
    isp-cloud.test.me/probe
    isp-ix-cloud.test.me/probe
Diagram label: Private Backbone
REMEDIATION
Probe for Reachability
  type: HTTP GET
  name: steering test
  targets:
    cloud.test.me/probe
    ix-cloud.test.me/probe
    isp-cloud.test.me/probe
    isp-ix-cloud.test.me/probe
Results: cloud: OK / FAIL, ix-cloud: OK / FAIL, isp-cloud: OK / FAIL, isp-ix-cloud: OK / FAIL
Diagram label: Private Backbone
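The recipe on this slide (type, name, targets) and its per-target OK / FAIL outcome can be sketched as follows. This is an illustration only, not Probnik's API: the names `Recipe`, `run_recipe`, and `fetch` are hypothetical, the `https://` scheme is assumed, and treating any 2xx status as OK is an assumption. The HTTP call is injected so the sketch runs without network access.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Recipe:
    """Hypothetical in-memory form of a probe recipe from the slides."""
    probe_type: str
    name: str
    targets: List[str]


def run_recipe(recipe: Recipe, fetch: Callable[[str], int]) -> Dict[str, str]:
    """Probe every target once and record OK or FAIL per target.

    `fetch(url)` is assumed to return an HTTP status code, or to raise
    OSError when the target cannot be reached.
    """
    results: Dict[str, str] = {}
    for target in recipe.targets:
        try:
            status = fetch("https://" + target)  # scheme is an assumption
            results[target] = "OK" if 200 <= status < 300 else "FAIL"
        except OSError:
            results[target] = "FAIL"
    return results


if __name__ == "__main__":
    steering = Recipe(
        probe_type="HTTP GET",
        name="steering test",
        targets=[
            "cloud.test.me/probe",
            "ix-cloud.test.me/probe",
            "isp-cloud.test.me/probe",
            "isp-ix-cloud.test.me/probe",
        ],
    )

    def fake_fetch(url: str) -> int:
        # Stand-in for a real HTTP GET: one path is unreachable.
        if "//cloud." in url:
            raise ConnectionError("unreachable")
        return 200

    print(run_recipe(steering, fake_fetch))
```

Injecting `fetch` keeps the reachability logic separate from the transport, so the same loop could be driven by a real HTTP client or by a stub as above.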
REMEDIATION
Remediation for Broken Path
- ISP’s connection to AWS
Can we fix it?
- YES
- Move traffic via the IX CDN server
Diagram labels: ISP, IX, Cloud, CDN (x2), Private Backbone; paths: CLOUD, IX-CLOUD, ISP-CLOUD, ISP-IX-CLOUD
REMEDIATION
Remediation for Full Isolation
- ISP outage or client last mile
Can we fix it?
- NO (we don’t have a routable path)
Diagram labels: ISP, IX, Cloud, CDN (x2), Private Backbone; paths: CLOUD, IX-CLOUD, ISP-CLOUD, ISP-IX-CLOUD
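The two remediation cases (a broken path that can be routed around, and full isolation with no routable path) can be expressed as a small selection rule over the steering test results. This is a sketch under assumptions: the function name `pick_path` and the preference order of the four paths are hypothetical, and the talk does not state how a path is actually chosen.

```python
from typing import Dict, Optional

# Assumed preference order: try the default path first, then alternatives.
PATH_PREFERENCE = ["cloud", "ix-cloud", "isp-cloud", "isp-ix-cloud"]


def pick_path(reachable: Dict[str, bool]) -> Optional[str]:
    """Return the first reachable path, or None when the client is isolated.

    `reachable` maps a path name to the outcome of its probe
    (True for OK, False for FAIL). Missing paths count as unreachable.
    """
    for path in PATH_PREFERENCE:
        if reachable.get(path, False):
            return path
    return None  # full isolation: no routable path, nothing to steer to


if __name__ == "__main__":
    broken_default = {"cloud": False, "ix-cloud": True,
                      "isp-cloud": False, "isp-ix-cloud": True}
    isolated = {path: False for path in PATH_PREFERENCE}
    print(pick_path(broken_default))
    print(pick_path(isolated))
```

In the first case traffic could be moved to the path via the IX; in the second there is no working alternative, which matches the "NO" answer on the slide.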
the IPv6 deployment
  type: HTTP GET
  name: ipv6 test
  targets:
    ipv4.test.me/probe
    ipv6.test.me/probe
Compare IPv6 to IPv4 on probe traffic - find differences without PROD impact
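One way to compare the two targets on probe traffic is to summarize failure rate and median latency for each and look at the difference. The sketch below assumes a hypothetical sample format (latency in milliseconds per probe, `None` for a failed probe) and hypothetical function names; the talk does not specify which metrics are compared.

```python
from statistics import median
from typing import Dict, List, Optional

Samples = List[Optional[float]]  # latency in ms, None for a failed probe


def summarize(samples: Samples) -> Dict[str, float]:
    """Failure rate and median latency of successful probes."""
    if not samples:
        raise ValueError("no samples")
    ok = [s for s in samples if s is not None]
    return {
        "failure_rate": 1 - len(ok) / len(samples),
        "median_ms": median(ok) if ok else float("nan"),
    }


def compare(ipv4: Samples, ipv6: Samples) -> Dict[str, float]:
    """Difference of IPv6 relative to IPv4 (positive means IPv6 is worse)."""
    v4, v6 = summarize(ipv4), summarize(ipv6)
    return {
        "failure_rate_delta": v6["failure_rate"] - v4["failure_rate"],
        "median_ms_delta": v6["median_ms"] - v4["median_ms"],
    }


if __name__ == "__main__":
    ipv4_samples = [100.0, 120.0, 110.0, None]
    ipv6_samples = [90.0, 95.0, None, None]
    print(compare(ipv4_samples, ipv6_samples))
```

Since the samples come from probes rather than production requests, a regression on the IPv6 target shows up in these deltas without affecting users.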
Want to move client traffic from client -> cloud to client -> IX -> cloud
Prod Traffic:
- RPS
- Gbs In
- Gbs Out
Diagram labels: Site 1, Site 2, Site 3, Site N, AWS Cloud, Private Backbone
  type: HTTP GET
  name: IX Steering
  targets:
    policy.ixaws.me/probe
Prod Traffic:
- RPS
- Gbs In
- Gbs Out
Diagram labels: Site 1, Site 2, Site 3, Site N, AWS Cloud, Private Backbone, IX-Cloud
PREVENTION
Netflix Example: Provisioning the Backbone
  type: HTTP GET
  name: IX Steering
  targets:
    policy.ixaws.me/probe
siteN: % probes
Prod Traffic:
- RPS
- Gbs In
- Gbs Out
Diagram labels: Internet, IX, AWS, Site 1, Site 2, Site 3, Site N, AWS Cloud, Private Backbone, IX-Cloud
PREVENTION
Netflix Example: Provisioning the Backbone
  type: HTTP GET
  name: IX Steering
  targets:
    policy.ixaws.me/probe
siteN: % probes
Client-IX Steering Policy
AWS Region Steering Policy
Prod Traffic:
- RPS
- Gbs In
- Gbs Out
Diagram labels: Internet, IX, AWS, Site 1, Site 2, Site 3, Site N, AWS Cloud, Private Backbone, IX-Cloud
Topology
Prod: RPS, Gbs In, Gbs Out
Client to IX Site Steering Policy
IX to AWS Region Steering Policy
Input: Probes + Prod Traffic Variations
Objective(s):
- min latency
- min cost
- min risk
- ...
Backbone link -> <traffic>
  link1: <gbs>
  link2: <gbs>
  link3: <gbs>
  ...
  linkN: <gbs>
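The mapping from probe results and production traffic to per-link load can be illustrated with a simple proportional estimate. This is a deliberately simplified sketch, not the model described in the talk: it assumes one backbone link per IX site, assumes production traffic would split across sites in the same proportion as probes did, and ignores the latency, cost, and risk objectives. The function name `estimate_link_load` and its parameters are hypothetical.

```python
from typing import Dict


def estimate_link_load(probe_share: Dict[str, float],
                       prod_gbs: float) -> Dict[str, float]:
    """Estimate traffic per backbone link from the share of probes per site.

    `probe_share` maps an IX site to the percentage (or count) of probes
    that were steered to it; values are normalized, so any consistent unit
    works. `prod_gbs` is the total production traffic to distribute.
    """
    total = sum(probe_share.values())
    if total <= 0:
        raise ValueError("probe shares must sum to a positive value")
    return {site: prod_gbs * share / total
            for site, share in probe_share.items()}


if __name__ == "__main__":
    shares = {"site1": 50.0, "site2": 30.0, "site3": 20.0}
    for site, gbs in estimate_link_load(shares, prod_gbs=10.0).items():
        print(f"{site}: {gbs:.1f}")
```

A real planner would evaluate many candidate steering policies against the stated objectives; the proportional split only shows how probe percentages per site could translate into an expected load per link.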