How the Internet Works (with notes)

Noah Kantrowitz

June 25, 2015

Transcript

  1. Noah Kantrowitz, Open Source Bridge 2015

    How The Internet Works: The Life and Times of an HTTP Request
  2. "It's not a big truck. It's a series of tubes." (Ted Stevens)

    What is the internet? Senator Stevens was more right than he understood: at a low level the internet is a series of tiny tubes connected in a vast network.
  3. "We never, ever in the history of mankind have had access to so much information so quickly and so easily." (Vint Cerf)

    A nicer way to think about it.
  4. But let's talk about one bit of information in particular,

    we want to see a web page, https://www.google.com/. We're going to focus on the parts of this process that interact with the internet, so no keyboard interrupts or HTML rendering.
  5. https://www.google.com/

    We've opened a web browser and typed in https://www.google.com/.
  6. https://www.google.com/

    The first thing the browser needs to know is "Where is www.google.com?"
  7. DNS

    Enter stage right: the Domain Name System, or DNS for short.
  8. DNS

    • Map names to IP addresses.
    • gethostbyname()
    • RFC 1034 & 1035.
    • "What is the A for www.google.com?"

    DNS is the protocol your computer uses to find the IP address for a name. The most common way to interact with DNS is the gethostbyname() function, but some browsers are fancy enough to ship their own internal version of it. The process is the same either way.
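
As a rough illustration of the gethostbyname() route, here is a minimal Python sketch. The `lookup` helper is a name made up for this example; it simply forwards to the standard library's `socket.gethostbyname`, which hands the work to the system resolver. The address printed for www.google.com depends on your network, so no output is shown.

```python
import socket


def lookup(name: str) -> str:
    """Return one IPv4 address for name using the system resolver."""
    # gethostbyname() hides the whole DNS exchange described in this talk.
    return socket.gethostbyname(name)


if __name__ == "__main__":
    # Needs a working resolver and network access; the answer varies.
    print(lookup("www.google.com"))
```

Passing an IPv4 address instead of a name returns it unchanged, without any DNS traffic.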
  9. DNS Header

    [Diagram: header bit layout. ID (16 bits); QR (1), Opcode (4), AA, TC, RD, RA (1 each), Z (3), RCODE (4); QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT (16 bits each).]

    We need to make a DNS query, so we need to create a DNS message or packet. This will take the question "What is the A of www.google.com?" and put it in a format a DNS server can understand. The first section in any DNS packet is the header. Like other protocols you may be familiar with, the header includes some standard data that applies to all types of packets.
  10. DNS Header

    [Diagram: DNS header bit layout, as on slide 9.]

    The ID field is an opaque number that the DNS server will include with the reply so we know which query is being answered.
  11. DNS Header

    [Diagram: DNS header bit layout, as on slide 9.]

    The QR field says whether the packet is a query or a reply, and the Opcode field sets the kind of query.
  12. DNS Header

    [Diagram: DNS header bit layout, as on slide 9.]

    The other fields on that line set various mode flags for the packet.
  13. DNS Header

    [Diagram: DNS header bit layout, as on slide 9.]

    The four count fields give the number of entries in the question, answer, authority, and additional records sections. For now we're just sending one question, so the first will be one and the others will all be zero.
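
The header layout above can be sketched with Python's `struct` module. This is only an illustration: `pack_dns_header` is a hypothetical helper, and the values chosen (ID 1, the RD flag set, one question) are taken from the assembled message shown later in the talk rather than being required by the protocol.

```python
import struct


def pack_dns_header(ident, qr=0, opcode=0, aa=0, tc=0, rd=0, ra=0, z=0,
                    rcode=0, qdcount=0, ancount=0, nscount=0, arcount=0):
    """Pack the 12-byte DNS header: ID, one flags word, four counts."""
    flags = (
        (qr << 15) | (opcode << 11) | (aa << 10) | (tc << 9)
        | (rd << 8) | (ra << 7) | (z << 4) | rcode
    )
    # Six big-endian (network order) 16-bit fields.
    return struct.pack("!6H", ident, flags, qdcount, ancount, nscount, arcount)


header = pack_dns_header(ident=1, rd=1, qdcount=1)
print(header.hex())  # 000101000001000000000000
```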
  14. DNS Question

    [Diagram: QNAME, QTYPE, QCLASS.]

    The next section we need after the header is the question we are asking.
  15. DNS Question

    [Diagram: QNAME, QTYPE, QCLASS.]

    This is a bit simpler: we have the name we are asking for.
  16. DNS Question

    [Diagram: QNAME, QTYPE, QCLASS.]

    The type of record we are asking for, A.
  17. DNS Question

    [Diagram: QNAME, QTYPE, QCLASS.]

    And the class of value we are asking for, almost always set to 1 for IN (Internet).
  18. DNS Message

    [Diagram: the header and question written out as a grid of bits. QNAME = 03 777777 06 676F6F676C65 03 636F6D 00.]

    So let's put the header and question sections together and fill in all the fields.
  19. DNS Message

    [Diagram: the same bit grid as slide 18.]

    This looks a bit more intimidating, but it's the same header ...
  20. DNS Message

    [Diagram: the same bit grid as slide 18.]

    ... and data sections.
  21. DNS Message

    0001010000010000 0000000003777777 06676F6F676C6503 636F6D0000010001

    Turn all the binary into bytes and we're ready to hit the road!
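
To check the bytes on the slide, here is a small sketch that assembles the same message in Python. `encode_qname` and `build_query` are names invented for this example. The flags value 0x0100 (recursion desired) and the ID of 1 are read off the slide's hex.

```python
import struct


def encode_qname(name: str) -> bytes:
    """Encode a host name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, ident: int = 1) -> bytes:
    # Header: ID, flags (0x0100 = recursion desired), then the four counts.
    header = struct.pack("!6H", ident, 0x0100, 1, 0, 0, 0)
    # Question: QNAME, QTYPE 1 (A), QCLASS 1 (IN).
    question = encode_qname(name) + struct.pack("!2H", 1, 1)
    return header + question


message = build_query("www.google.com")
print(message.hex().upper())
```

The printed hex should match the four groups on the slide once the spaces are removed.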
  22. Now what?

    We have to send that query to a DNS server. Your operating system exposes a way to ask what the DNS servers are: GetNetworkParams for Windows, NetworkServices for Mac, and resolver for Linux. Usually there are multiple servers in case some aren't available, but let's just look at the first one, 208.201.224.11 (the main DNS server for my ISP).
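
On Linux the resolver configuration is conventionally the text file /etc/resolv.conf, with one `nameserver` line per server. The sketch below parses text in that style. `parse_nameservers` is a hypothetical helper, and the sample text is made up: the second address comes from a range reserved for documentation.

```python
def parse_nameservers(resolv_conf_text: str) -> list[str]:
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Drop comments, which start with '#' or ';'.
        line = line.split("#", 1)[0].split(";", 1)[0]
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


# Hypothetical file contents, not read from a real machine.
sample = """\
# example resolver configuration
nameserver 208.201.224.11
nameserver 192.0.2.53
"""
print(parse_nameservers(sample)[0])  # 208.201.224.11
```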
  23. IP and UDP

    Now that we know where to send our DNS query, we need to encode that data in a way the internet will understand. We do this using two separate but related protocols. The Internet Protocol is, as the name suggests, the core protocol that powers the internet. At the time it was made, networking protocols were custom developed by each company that sold computers, which made it difficult to connect networks from different manufacturers. IP allowed true inter-network communication, which was the nucleus of the internet we know today.
  24. IP and UDP

    • Address and port.
    • 208.201.224.11:53
    • Wrapped in order.
    • DNS inside UDP inside IP.
    • RFC 791 & 768.

    These protocols are applied in a nested fashion: the DNS query goes inside UDP, which goes inside IP. IP lets us specify the destination IP address, which identifies the target computer, and UDP is where we specify the port, which indicates which program or service on that computer we want to talk to. By mutual agreement of everyone who works with DNS, DNS server programs are expected to use port 53.
  25. UDP Header

    [Diagram: Source Port, Destination Port, Length, Checksum, Data.]

    The first layer on top of our DNS query will be the UDP data, so let's look at that format.
  26. UDP Header

    [Diagram: UDP header layout, as on slide 25.]

    Most of these we aren't really concerned with, but the destination port is where we put in 53. The source port will be randomly allocated by the operating system.
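
A minimal sketch of wrapping a payload in a UDP header, assuming Python and its `struct` module. `build_udp_datagram` is a made-up name, the source port 49152 is an arbitrary stand-in for whatever the operating system would pick, and the checksum is left at zero, which UDP over IPv4 permits to mean "not computed".

```python
import struct


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prepend the 8-byte UDP header to a payload."""
    length = 8 + len(payload)  # header plus data, in bytes
    checksum = 0               # 0 means "no checksum computed" for UDP over IPv4
    header = struct.pack("!4H", src_port, dst_port, length, checksum)
    return header + payload


# 32 placeholder bytes stand in for the DNS query built earlier.
datagram = build_udp_datagram(49152, 53, bytes(32))
print(datagram[:8].hex())  # c000003500280000
```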
  27. IP Header

    [Diagram: IPv4 header bit layout. Version (4 bits), IHL (4), Type of Service (8), Total Length (16); Identification (16), Flags (3), Fragment Offset (13); Time to Live (8), Protocol (8), Header Checksum (16); Source Address (32); Destination Address (32); Options (24), Padding (8).]

    Then on top of the UDP header we have the IP header. This has a lot more information in it.
  28. IP Header

    [Diagram: IPv4 header bit layout, as on slide 27.]

    First is the version of the IP protocol we are using. When people say IPv4, that's because the version is actually 4. Fun fact: there was an IPv5 at one point. It was an early voice streaming proposal from 1979.
  29. IP Header

    [Diagram: IPv4 header bit layout, as on slide 27.]

    Next there are a bunch of fields used to make sure the IP packet can be received correctly. Note that the IP header checksum covers only the header information, not the data. The UDP checksum, by contrast, covers the UDP header and its data along with a pseudo-header of IP fields.
  30. IP Header

    [Diagram: IPv4 header bit layout, as on slide 27.]

    The Time to Live, or TTL, gives packets a maximum lifespan. It isn't actually measured in time but in hops between systems, and it is there to avoid packets being sent around in a loop forever. Eventually the TTL will hit zero and the packet will be dropped. The protocol field tells the receiver what kind of data is inside the IP packet; for us it will be 17, meaning UDP.
  31. IP Header

    [Diagram: IPv4 header bit layout, as on slide 27.]

    The two addresses indicate where this packet is going and where it came from. These are what we will use to steer the packet to its final destination on the internet. For our case the source address will be our local IP address and the destination address will be 208.201.224.11, our DNS server.
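
The IP header fields can be sketched the same way. This is an illustrative Python example, not how an operating system actually builds packets. `checksum16` and `build_ipv4_header` are hypothetical helpers, the source address 192.168.1.100 and the TTL of 64 are assumptions, and the header carries no options.

```python
import socket
import struct


def checksum16(data: bytes) -> int:
    """Ones' complement sum of 16-bit words, as used by the IP header checksum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_header(src: str, dst: str, payload_len: int, ttl: int = 64) -> bytes:
    version_ihl = (4 << 4) | 5          # version 4, header length 5 words (no options)
    total_length = 20 + payload_len
    # version/IHL, ToS, total length, identification, flags+fragment offset,
    # TTL, protocol (17 = UDP), checksum placeholder, source, destination
    fields = [version_ihl, 0, total_length, 0, 0, ttl, 17, 0,
              socket.inet_aton(src), socket.inet_aton(dst)]
    fmt = "!BBHHHBBH4s4s"
    header = struct.pack(fmt, *fields)
    fields[7] = checksum16(header)      # checksum is computed with the field zeroed
    return struct.pack(fmt, *fields)


# payload_len=40 is the 8-byte UDP header plus the 32-byte DNS query.
header = build_ipv4_header("192.168.1.100", "208.201.224.11", payload_len=40)
print(header.hex())
```

A receiver can validate the header by running the same sum over all 20 bytes: with a correct checksum in place, `checksum16(header)` comes out as zero.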
  32. IP Packet

    [Diagram: IP Header, UDP Header, DNS Packet.]

    So here we have our fully assembled packet: an IP header, a UDP header, and then a DNS message made of a DNS header and a DNS question. We can convert this to bytes just like before; there is nothing special about these bytes except in how all devices agree to parse and use them.
  33. Local Network

    We have an IP packet ready to send, but where are we sending it and how?
  34. Local Network

    • Computer
    • Switch
    • Router
    • Modem

    We're going to look at a simple local network. Our computer is connected by a wired network to a switch, the switch is also connected to a router, and the router is connected to a cable or DSL modem. In most modern devices the switch, router, and modem are all in one physical box, but the functions are still discrete. We'll mostly ignore the modem itself, as it acts like a bridge to the modem at your ISP and generally doesn't interact with Internet-level things.
  35. Most of us have seen a network configuration like this

    before: an IP address, subnet mask, and router address handed out by a router acting as a DHCP server. We won't go into the details of DHCP, but let's talk a bit about what those three values are used for. The IP address determines who we are on an IP network, while the other two are used to build a route table.
  36. A route table is how your computer knows where to

    send packets. For a normal laptop or desktop, the route table will generally be very small, in this case just three entries.
  37. 208.201.224.11

    Destination      Gateway      Interface
    127.0.0.0/8      *            lo
    192.168.1.0/24   *            eth0
    default          192.168.1.1  eth0

    So let's use the route table to figure out where we are going to send this packet. We have the destination IP address at the top, and on the left we have a set of address prefixes to match against.
  38. [Route table as on slide 37.]

    First we have a route for addresses that start with 127. 127.0.0.1 is the usual address for localhost, but technically everything under 127.0.0.0/8 is sent to the loop-back adapter (lo). Our target IP address doesn't start with 127, so this row doesn't match.
  39. [Route table as on slide 37.]

    Next we have the route for the local network. If there were another computer on the same local network, this route would let us talk to it directly instead of going through the router. The subnet mask we saw before is combined with our computer's IP address to make the destination check for this route. However, it doesn't match this IP address, so we move on.
  40. [Route table as on slide 37.]

    Finally we reach the default route. This matches anything that isn't handled by one of the other routes, which is why some operating systems call this a default gateway. So we will use this route: we know we are going to send the packet on the eth0 interface and to a gateway at 192.168.1.1. Next up, how do we send our IP packet on eth0?
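
The lookup walked through above can be sketched with Python's `ipaddress` module. `ROUTES` and `lookup_route` are names invented for this example. One assumption to flag: the sketch picks the most specific (longest-prefix) match rather than walking the rows top to bottom, which gives the same answers for this table.

```python
import ipaddress

# The route table from the slide; None means "directly connected, no gateway".
ROUTES = [
    ("127.0.0.0/8", None, "lo"),
    ("192.168.1.0/24", None, "eth0"),
    ("0.0.0.0/0", "192.168.1.1", "eth0"),  # the "default" row
]


def lookup_route(destination: str):
    """Return (gateway, interface) for the most specific matching route."""
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), gateway, interface)
        for prefix, gateway, interface in ROUTES
        if addr in ipaddress.ip_network(prefix)
    ]
    # Longest prefix wins, so the default route only applies as a last resort.
    _, gateway, interface = max(matches, key=lambda m: m[0].prefixlen)
    return gateway, interface


print(lookup_route("208.201.224.11"))  # ('192.168.1.1', 'eth0')
```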
  41. Ethernet

    • Wire framing.
    • MAC address.
    • Segments.
    • IEEE 802.3

    Ethernet is the next layer we have to pass through on our journey to the internet. Ethernet is the standard that defines how wired network devices talk to each other. When you see a MAC address, that is an Ethernet address. WiFi is a similar standard, also using MAC addresses and variants on the Ethernet data formats, but for simplicity let's just look at a wired network. Ethernet networks are divided into segments in much the same way that IP networks are divided into subnets.
  42. Segments

    • Electrical broadcast.
    • Thicknet.
    • 10BASE-T and hubs.
    • Switches.

    A segment in Ethernet is defined as a set of network cards that are on a shared electrical connection. The original 802.3 standard required every computer in the network to be connected to the same huge coaxial cable, which they all used to communicate. That was hard to scale as networks got bigger, so we introduced repeaters and hubs. Later we switched to twisted-pair copper cables from each computer to the hub. We now had multiple segments but a single collision domain, as a hub simply repeats data from one port to all other ports. This meant that anyone on a hub trying to send data could interfere with anyone else. As network collisions became more of a problem with faster network speeds, we moved to switches, which send data only to the required segment.
  43. Ethernet Frame

    [Diagram: frame layout in bytes. Preamble (7), SFD (1), Destination Address (6), Source Address (6), Type (2), Data and padding (variable), FCS (4).]

    So far we have had a DNS message and an IP packet; now we have another layer, an Ethernet frame. A frame is one block of data sent on an Ethernet network to another device.
  44. Ethernet Frame

    [Diagram: Ethernet frame layout, as on slide 43.]

    Most of the fields are control data for the network cards on either end. The Type field indicates what kind of data is inside the frame; in our case that will be 0x0800 for IPv4. You will often see descriptions of an Ethernet frame omit the first two fields (the preamble and SFD), as they are always the same and not usually of interest.
  45. Ethernet Frame

    [Diagram: Ethernet frame layout, as on slide 43.]

    The destination address determines who on the collision domain or segment should be receiving this frame.
  46. Ethernet Frame

    [Diagram: Ethernet frame layout, as on slide 43.]

    And finally the data field contains the IP packet we built before.
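
As a sketch of the frame layout, the Python below builds everything after the preamble and SFD, which the network hardware normally adds itself. `build_ethernet_frame` is a hypothetical helper and both MAC addresses are made up. The FCS is computed with `zlib.crc32` and packed little-endian, which is how it usually appears in byte dumps; treat that byte-order detail as an assumption of this example.

```python
import struct
import zlib


def build_ethernet_frame(dst_mac: bytes, src_mac: bytes, payload: bytes,
                         ethertype: int = 0x0800) -> bytes:
    """Build destination, source, type, padded data, and FCS (no preamble/SFD)."""
    if len(payload) < 46:
        payload = payload + bytes(46 - len(payload))  # pad to the minimum data size
    body = dst_mac + src_mac + struct.pack("!H", ethertype) + payload
    # CRC-32 over destination address through data (assumed byte order, see above).
    fcs = struct.pack("<I", zlib.crc32(body))
    return body + fcs


# Both MAC addresses are invented for illustration.
dst = bytes.fromhex("020000000001")
src = bytes.fromhex("020000000002")
# 60 placeholder bytes stand in for the IP packet (20 + 8 + 32 bytes).
frame = build_ethernet_frame(dst, src, bytes(60))
print(len(frame))  # 78
```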
  47. 1000BASE-T

    • Electrical signaling.
    • IEEE 802.3ab

    While Ethernet defines the overall network, other standards define the actual electrical signals to put on the wire or light pulses to put on the fiber optics. Most wired networks these days will be 1000BASE-T, more generally known as Gigabit Ethernet.
  48. Ethernet Frame

    [Diagram: Ethernet frame layout, as on slide 43.]

    So we are ready to build our Ethernet frame to send our DNS query to the local router, but we have a problem. The route table told us the IP address of the gateway, but to send an Ethernet frame we need its Ethernet address (aka MAC address).
  49. ARP We have to pause sending our frame to look

    up the Ethernet address for 192.168.1.1. We do this using a protocol called ARP, for Address Resolution Protocol.
  50. ARP • Bridge between IP and Ethernet. • NDP for

    IPv6. • "Who has IP address 1.2.3.4?" • In an Ethernet frame. • Broadcast FF:FF:FF:FF:FF:FF. ARP sits between Ethernet and IP, helping to translate between hardware and network addresses. We are only going to talk about ARP for IPv4 and Ethernet, but there are similar protocols for other pairs of network types. For IPv6 it is called NDP. A basic ARP query is "Whoever has IP address X, please send me your Ethernet address." Each ARP request is sent in an Ethernet frame by necessity, but it uses the broadcast address so that everyone on the segment will receive and process the query. You will note there is no security here: anyone on the segment can respond to any query. Abuse of this is called ARP poisoning.
  51. ARP Packet HTYPE PTYPE HLEN PLEN OPER SHA SHA SHA

    SHA SPA SPA SPA THA THA THA THA TPA TPA TPA ARP packets are intentionally simple. Requests and replies use the same format, just with some fields set to 0 when they do not apply.
  52. ARP Packet HTYPE PTYPE HLEN PLEN OPER SHA SHA SHA

    SHA SPA SPA SPA THA THA THA THA TPA TPA TPA The important fields are the two pairs of addresses. For our query we will fill in the sender hardware address and sender protocol address using our MAC and IP addresses.
  53. ARP Packet HTYPE PTYPE HLEN PLEN OPER SHA SHA SHA

    SHA SPA SPA SPA THA THA THA THA TPA TPA TPA The target hardware address will be left empty since that is what we are requesting, and the target protocol address will be set to 192.168.1.1.
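    As a sketch of what filling in those fields looks like, the snippet below packs an ARP request for 192.168.1.1 into its 28 bytes. The sender MAC address is a placeholder, and the function name is mine rather than anything from a real library.

```python
import socket
import struct


def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Pack an ARP request for IPv4 over Ethernet."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # HTYPE: 1 = Ethernet
        0x0800,                       # PTYPE: IPv4
        6,                            # HLEN: MAC addresses are 6 bytes
        4,                            # PLEN: IPv4 addresses are 4 bytes
        1,                            # OPER: 1 = request, 2 = reply
        sender_mac,                   # SHA: our MAC address
        socket.inet_aton(sender_ip),  # SPA: our IP address
        b"\x00" * 6,                  # THA: left empty, this is what we are asking for
        socket.inet_aton(target_ip),  # TPA: the address we want resolved
    )


# Placeholder MAC for our own network card.
packet = build_arp_request(bytes.fromhex("001122334455"), "192.168.1.2", "192.168.1.1")
print(len(packet), packet.hex())
```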
  54. ⌚ Then we wait. ARP can take a few milliseconds

    to respond as it has to get to the other computer, be decoded, and the reply has to make it back. Because of this, ARP replies are aggressively cached by almost every network device. In our case we don't have the answer cached, so there is nothing to do but wait.
  55. Sending • Ethernet – 40:4A:3:ED:D2:1C • IP – 208.201.224.11 •

    UDP – 53 • DNS – www.google.com A But at long last, in computer terms, we get our ARP reply. We finally have all the data we need to send our DNS query! To review, we have an Ethernet frame going to the hardware address of our router, inside that is an IP packet with a destination of our DNS server, inside that is a UDP datagram for port 53, and inside that is a DNS message asking for the A record for www.google.com. This is a lot of work and we haven't even made it to the local router yet!
  56. Local Router So the packet is sent down the wire,

    through the switch. The switch reads the destination MAC address on the Ethernet frame to know which switch port to send the packet to, or if it doesn't recognize the address it falls back to sending it on all ports and hoping it gets to the destination somehow. Either way it eventually makes it to the router.
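    The forward-or-flood decision can be sketched in a few lines. This toy model assumes the switch fills its address table by remembering the source address of every frame it sees, which is how typical switches behave but is not something covered above. The class and port numbers are invented for illustration.

```python
class LearningSwitch:
    """Toy model of a switch deciding which ports a frame goes out on."""

    def __init__(self, port_count: int):
        self.ports = list(range(port_count))
        self.mac_table = {}  # MAC address -> port it was last seen on

    def handle_frame(self, in_port: int, src_mac: str, dst_mac: str) -> list:
        # Remember where the sender lives so replies can be sent directly.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is not None:
            return [out_port]
        # Unknown destination: send on every port except the one it arrived on.
        return [p for p in self.ports if p != in_port]


switch = LearningSwitch(4)
print(switch.handle_frame(0, "00:11:22:33:44:55", "40:4A:3:ED:D2:1C"))  # flooded
print(switch.handle_frame(1, "40:4A:3:ED:D2:1C", "00:11:22:33:44:55"))  # direct
```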
  57. Local Router • Static routing (again). • NAT. The local

    router in our hypothetical home network serves two main purposes, routing between the home network and the ISP, and network address translation. Unlike the switch, it needs to decode most of the packet to do these but let's look at them in turn.
  58. Route Table Destination Gateway Interface 127.0.0.0/8 * lo 192.168.1.0/24 *

    eth0 173.228.34.0/24 * eth1 default 173.228.34.1 eth1 The first thing the router will do is decode the Ethernet frame. It has to check that it matches the destination MAC address on the frame, meaning it is the intended recipient of the frame. Then it decodes the IP packet headers and checks if it matches the destination IP address. In this case it does not match, so the router knows it needs to send it somewhere else. This works just like it did on our computer, but as the router is connected to both our home network and the ISP's network the route table has an extra entry.
  59. 208.201.224.11 Destination Gateway Interface 127.0.0.0/8 * lo 192.168.1.0/24 * eth0

    173.228.34.0/24 * eth1 default 173.228.34.1 eth1 Just like last time, we check the route table entries in order looking for a match. The address doesn't start with 127.
  60. 208.201.224.11 Destination Gateway Interface 127.0.0.0/8 * lo 192.168.1.0/24 * eth0

    173.228.34.0/24 * eth1 default 173.228.34.1 eth1 Or 192.168.1
  61. 208.201.224.11 Destination Gateway Interface 127.0.0.0/8 * lo 192.168.1.0/24 * eth0

    173.228.34.0/24 * eth1 default 173.228.34.1 eth1 Or 173.228.34
  62. 208.201.224.11 Destination Gateway Interface 127.0.0.0/8 * lo 192.168.1.0/24 * eth0

    173.228.34.0/24 * eth1 default 173.228.34.1 eth1 So just like last time we fall back to the default gateway. Here we can see it is 173.228.34.1 and we are going to use the eth1 interface.
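    The walk through the route table above can be sketched with the standard ipaddress module. This follows the simplified "check entries in order" description from these slides; real operating systems generally pick the most specific matching prefix instead. The function name and tuple layout are mine.

```python
import ipaddress

# (destination, gateway, interface) rows from the router's table; None means directly connected.
ROUTES = [
    ("127.0.0.0/8", None, "lo"),
    ("192.168.1.0/24", None, "eth0"),
    ("173.228.34.0/24", None, "eth1"),
    ("default", "173.228.34.1", "eth1"),
]


def lookup(destination: str):
    """Return (gateway, interface) for the first route that matches."""
    address = ipaddress.ip_address(destination)
    for prefix, gateway, interface in ROUTES:
        if prefix == "default" or address in ipaddress.ip_network(prefix):
            return gateway, interface
    raise LookupError(f"no route to {destination}")


print(lookup("208.201.224.11"))  # falls through to the default gateway
print(lookup("192.168.1.2"))     # directly connected on eth0
```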
  63. NAT • Rewrite IP and UDP header. • Source address

    and port. • Share one public address. Once the router figures it wants to use the eth1 interface, it checks with the operating system for any special policies on that network interface. On most home routers there will be a network address translation policy. This is because the local network uses private IP addresses that are reserved for internal use. If we sent our DNS query with a source IP address of 192.168.1.2, the DNS server would have no way to send the reply packet back to us. Network address translation fixes this by rewriting the source to its public IP address, and keeping a mapping of any translated packets so when a reply comes in, it can send it off to the correct internal IP address.
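    A minimal sketch of that mapping table might look like the following. The public address, the starting port, and the class itself are all assumptions for illustration; real NAT implementations also track the protocol, the remote address, and timeouts.

```python
class Nat:
    """Toy source NAT: rewrites (private ip, port) to (public ip, port) and back."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # (private ip, private port) -> public port
        self.inbound = {}   # public port -> (private ip, private port)

    def translate_outbound(self, src_ip: str, src_port: int):
        key = (src_ip, src_port)
        if key not in self.outbound:
            # First packet of this flow: allocate a public port and remember it.
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound[key]

    def translate_inbound(self, dst_port: int):
        # Returns None when no outgoing packet created a mapping for this port.
        return self.inbound.get(dst_port)


nat = Nat("173.228.34.50")  # hypothetical public address on eth1
print(nat.translate_outbound("192.168.1.2", 51000))
print(nat.translate_inbound(40000))
```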
  64. Sending, Round Two • ARP lookup (or cached). • Ethernet

    frame. • Send to ISP border router. After NAT is finished, the sending process is just like before. We need the MAC address of the next router we are sending to. Once we have that we make a new Ethernet frame and send it on its way. We're going to skip over the cable or DSL modem as they are mostly transparent in this process but the quick version is they read the Ethernet frame and translate it to a similar frame in DOCSIS or PPPoE and then it gets translated back at the ISP's office. This first router we get to in our ISP's network is generally called a border router as it lives at the edge of the network.
  65. The Internet Our plucky little DNS query has finally made

    it to the internet! Not very far mind you, the local border router for your home connection is probably within a few kilometers of your house, but still this is progress.
  66. A Mesh of Trees • Tree-like at the edges. •

    Partial mesh in middle. • ~Full mesh in the core. So far both hops we have taken have used relatively small route tables with default gateways as a fallback for unknown destinations. This works fine for small scale networks that are arranged in a tree or hub-and-spoke pattern, but as we get further into the ISP's network it will start looking more like a mesh instead of a purely tree-like arrangement.
  67. Next Hops • Regional routers. • ISP backbone. The same

    process we saw with our own router will usually repeat a few more times through the tree of regional routers in the local ISP office. Each of these will use its own route table to check that we aren't trying to send a packet elsewhere on the regional network, and if not it sends the packet up to the next router closer to the local backbone connection. The ISP backbone is a set of high-speed links between local or regional offices.
  68. Our DNS query has made it through the local ISP

    office and up to a backbone router. This router is plugged in to many high-speed optical links to other backbone routers run by the ISP in their other offices and data centers. Now we have a problem though, while so far everything has been tree-like, suddenly there is no single default gateway. The ISP's backbone isn't quite a fully connected mesh, but there are multiple paths available. How do we know where to send the packet?
  69. BGP The Border Gateway Protocol is the most widely used

    routing protocol on the internet today.
  70. Routing Protocol • Distribute routes. • Update over time. •

    Find optimal paths. What is a routing protocol? The simplest version is a way for routers to share and synchronize rows in their route tables. This is important as the structure of that mesh we saw before is constantly changing as hardware is swapped out and new links come online. Not to mention the natural enemy of the fiber link, backhoes. The internet is built to automatically detect these changes and re-route packets around them.
  71. BGP • Gossip based. • Prefix based. • Share best

    routes. • RFC 4271. BGP does this using a gossip based approach: each router shares all the routes it knows about with its immediate neighbors (called peers). These peering arrangements are generally hand-coded into each router participating in BGP, so this isn't entirely self-organizing. As before in our simple route tables, BGP operates on IP address prefixes. Somewhere on the global BGP mesh there is a router that announces it controls 208.201.224.0/19, which is the prefix containing our DNS server.
  72. BGP • Gossip based. • Prefix based. • Share best

    routes. • RFC 4271. Each BGP peer announces which prefixes it wants to be the destination for, and each neighbor tracks which peer it saw every prefix from. It compares each new announcement with all its existing routes and if the new route is better than an existing one it gets added to the route table and sent along to all of that router's peers. In this way each router gossips to the next about what it thinks the best routes are. This also means that every BGP router has to keep the route table for the entire world in memory. Currently this is about 600 thousand routes.
  73. Autonomous Systems • 1-2^32 (née 2^16) • ~51000 so far.

    • AS7065 Before we talk more about BGP we need to talk about Autonomous System numbers. We have seen a lot of addresses so far: UDP ports, IP addresses, MAC addresses. AS numbers are how BGP handles addressing. An AS represents a network operator that participates in the global BGP network. AS numbers were historically limited to 2 bytes but there have been recent moves to upgrade to 4-byte numbers as roughly 80% of them have been allocated. Some large companies have multiple AS numbers for geographically or logically distinct networks, but you can think of each as roughly equivalent to a company. In this case our DNS server is located in AS 7065. Each prefix announced on the BGP network is associated with a peer and an AS number.
  74. IANA • ICANN department • Internet Assigned Numbers Authority •

    5 Regional Internet Registries • AfriNIC, ARIN, APNIC • LACNIC, RIPE NCC I should also briefly mention IANA. It is a department of the Internet Corporation for Assigned Names and Numbers, which more or less administers the internet. It handles allocating IP addresses and AS numbers through five regional organizations: AfriNIC for Africa, ARIN for North America, APNIC for Asia and Australia, LACNIC for Latin America and the Caribbean, and RIPE for Europe, Russia, and the Middle East. These regional registries handle IP and AS allocations for their member countries.
  75. 1 Weight 5 MED 2 Local Pref 6 Metric 3

    AS Path 7 First 4 Origin 8 Tie Breaker BGP Algorithm What determines a best route? I'm skipping a few of the weirder steps but overall it follows these rules when comparing two routes for the same prefix.
  76. 1 Weight 5 MED 2 Local Pref 6 Metric 3

    AS Path 7 First 4 Origin 8 Tie Breaker BGP Algorithm Weight is an optional, local value configured for each route. It allows network operators to prefer cheaper or more reliable links or other manual overrides to the routing mesh.
  77. 1 Weight 5 MED 2 Local Pref 6 Metric 3

    AS Path 7 First 4 Origin 8 Tie Breaker BGP Algorithm Local preference allows similar manual overrides to weight, but is shared between routers within the same AS while weight is local to one router.
  78. 1 Weight 5 MED 2 Local Pref 6 Metric 3

    AS Path 7 First 4 Origin 8 Tie Breaker BGP Algorithm AS path is where things get interesting. As routes are gossiped through the network, each router keeps track of which ASes the route advertisement passed through to get there. Having fewer ASes in that path means a more direct connection, which is preferred. This doesn't count the overall number of routers in the path, just distinct ASes.
  79. 1 Weight 5 MED 2 Local Pref 6 Metric 3

    AS Path 7 First 4 Origin 8 Tie Breaker BGP Algorithm Origin refers to how the route was introduced to the BGP network, preferring routes from within the same AS instead of external routes.
  80. 1 Weight 5 MED 2 Local Pref 6 Metric 3

    AS Path 7 First 4 Origin 8 Tie Breaker BGP Algorithm The multi-exit discriminator, or MED, allows an AS with multiple connections to another AS to request preferential routing on one of those links over the other. Both routes will still be considered in case one link goes down, but generally this is used to prefer routes along faster or newer fiber connections over old backup links.
  81. 1 Weight 5 MED 2 Local Pref 6 Metric 3

    AS Path 7 First 4 Origin 8 Tie Breaker BGP Algorithm Metric prefers routes with fewer hops across an internal network. This is similar to AS Path but is comparing router hops within the same AS instead of distinct ASes.
  82. 1 Weight 5 MED 2 Local Pref 6 Metric 3

    AS Path 7 First 4 Origin 8 Tie Breaker BGP Algorithm If we've gotten this far, we fall back to the oldest of the two routes. This helps reduce the effect of instability due to newer but equivalent routes being broadcast through the network.
  83. 1 Weight 5 MED 2 Local Pref 6 Metric 3

    AS Path 7 First 4 Origin 8 Tie Breaker BGP Algorithm After this there are a few tie breakers just to ensure there is always a stable solution that all routers can agree on.
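    One way to picture the whole comparison is as a sort key: each rule becomes one element of a tuple and the route with the smallest tuple wins. This is only a sketch of the simplified eight-step list from these slides, not any vendor's implementation. The field names, the defaults, the direction of each comparison (higher weight and local preference win, lower values win elsewhere), the router ID tie breaker, and AS 64500 are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Route:
    prefix: str
    as_path: tuple            # AS numbers the announcement passed through
    weight: int = 0           # local to one router, higher preferred (assumed)
    local_pref: int = 100     # shared within an AS, higher preferred (assumed)
    origin: int = 0           # lower preferred
    med: int = 0              # lower preferred
    igp_metric: int = 0       # internal hops to the exit, lower preferred
    age_seconds: int = 0      # older routes preferred
    router_id: str = ""       # hypothetical final tie breaker


def preference_key(route: Route) -> tuple:
    """Smaller tuples sort first, so 'higher is better' fields are negated."""
    return (
        -route.weight,
        -route.local_pref,
        len(route.as_path),
        route.origin,
        route.med,
        route.igp_metric,
        -route.age_seconds,
        route.router_id,
    )


def best_route(routes):
    return min(routes, key=preference_key)


direct = Route("208.201.224.0/19", (46375, 7065))
detour = Route("208.201.224.0/19", (46375, 64500, 7065))
print(best_route([detour, direct]).as_path)  # the shorter AS path wins
```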
  84. Security • BGP Hijacking • Ingress Filtering • Rarely used

    signatures A quick aside because this tends to be a surprise to most people. Nowhere in this discussion of BGP have I mentioned authentication or authorization because for the most part there is none. Any AS participating in the BGP network can advertise any routes they want. Abuse of this is called BGP hijacking and we do see public incidents every few years, like in 2008 when the Pakistani national ISP accidentally sent out routes for the entire YouTube address space.
  85. Backbone Router • AS46375 to AS7065 BGP is always running

    in the background so by the time we get to the ISP's backbone router it should already have a route (in its table of 600 thousand routes) to send along our DNS query packet. While building this route table is much more complex, using it works the same way as the other hops we have seen. The DNS server is run by a different department of our ISP that happens to be in a different AS so we have to pass through a few hops along the backbone to reach the data center containing the DNS server.
  86. DNS Server Eventually, also known as about 60 milliseconds, we

    have reached the ISP's DNS server!
  87. Receiving • Decode and match IP. • Decode UDP port.

    • Deliver to process. Just like all the routers, the DNS server will decode the incoming Ethernet frame, read the IP headers, and check the destination IP address. This time it matches! The packet has reached its destination so the operating system decodes the UDP headers too and sees it is being sent to port 53. It checks some internal tables and sees that there is a program listening on that port. The operating system delivers the packet's data to the DNS server process and now the real work can begin.
  88. DNS Server • Decode question. • Check local cache. •

    Recursion? The DNS server unpacks our question data and sees we are asking for the A record for www.google.com. On a busy, public DNS server normally this record would already be cached locally but for our example we are going to say it isn't in the cache.
  89. DNS Recursion Our ISP's DNS server doesn't directly know the

    answer to our question. It could just reply back saying that it doesn't know, but that wouldn't be very useful. DNS has a mode called a "recursive query" that means we are asking the server to go find out the answer and then send it back to us. Not all servers will allow recursive queries, but ISP-level servers will for their customers at least.
  90. Root Servers • 13 DNS servers. • Fixed IP addresses.

    • Maps TLDs to DNS servers. • [a-m].root-servers.net So if we are going to find the value of this DNS query from scratch we need to start from something. The base of the DNS hierarchy is the 13 root servers. These live at 13 fixed IP addresses, though each address maps to hundreds or thousands of servers all over the world to handle the volume of requests. These root servers only map top-level domains like com, net, and org to TLD-level master nameservers. All root servers are interchangeable, so queries will generally just pick one at random.
  91. Recursion Round 1 • DNS query to 198.41.0.4 • com.

    IN NS a.gtld-servers.net. • com. IN NS b.gtld-servers.net. • a.gtld-servers.net. IN A 192.5.6.30 • b.gtld-servers.net. IN A 192.33.14.30 Our ISP's DNS server has the hardwired list of the 13 root servers, called the root hint file, managed by IANA. It picks one of those, let's say a.root-servers.net, and sends it a new request for the A record for www.google.com. Just like our DNS query, this new query gets encoded in a DNS message, wrapped in UDP, IP, and Ethernet headers, sent on to the local network, and carried across several more backbone links from our ISP to other ISPs. Eventually it reaches the root server, which builds a response packet and sends it back across the internet. The response looks like this. gtld-servers.net hosts the TLD-level DNS servers for the .com domain. Instead of being run by different organizations like the root servers, these are all run by Verisign as part of their ownership of the .com TLD.
  92. Recursion Round 2 • DNS query to 192.5.6.30 • google.com.

    IN NS ns1.google.com. • google.com. IN NS ns2.google.com. • ns1.google.com. IN A 216.239.32.10 • ns2.google.com. IN A 216.239.34.10 Again we pick a random server from the response and ask it again for the A record for www.google.com. The TLD servers are populated based on those "what is your nameserver?" boxes you fill out when you register a domain name. It doesn't know the answer to our question, but it looks up what name servers are attached to google.com and gives us those.
  93. Recursion Round 3 • DNS query to 216.239.32.10 • www.google.com.

    IN A 216.58.192.36 Now we pick one of Google's DNS servers and ask it for the A record for www.google.com. Finally we have found a server that can answer our question! Remember that for these three queries during the recursion process it has to do all the same work as our DNS packet so far 6 times, three requests and three replies. But at last our ISP's DNS server has an answer to our question, it stores it in the local cache for next time and builds a DNS reply packet to send back to us.
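    The three rounds can be sketched as a loop that follows referrals until some server has an answer. To keep it runnable without a network, the responses are canned from the slides above. The table layout and function name are mine, and a real resolver would pick among the nameservers at random, retry on failure, and cache every step.

```python
ROOT_SERVER = "198.41.0.4"

# Canned referrals from the slides: server IP -> (zone it delegates, nameserver IPs).
REFERRALS = {
    "198.41.0.4": ("com.", ["192.5.6.30", "192.33.14.30"]),
    "192.5.6.30": ("google.com.", ["216.239.32.10", "216.239.34.10"]),
}

# Canned final answers: (server IP, name) -> A record.
ANSWERS = {
    ("216.239.32.10", "www.google.com."): "216.58.192.36",
}


def resolve(name: str, server: str = ROOT_SERVER, max_rounds: int = 10):
    """Follow referrals from the root until a server answers; return (address, servers asked)."""
    asked = []
    for _ in range(max_rounds):
        asked.append(server)
        if (server, name) in ANSWERS:
            return ANSWERS[(server, name)], asked
        referral = REFERRALS.get(server)
        if referral is None or not name.endswith(referral[0]):
            raise LookupError(f"{server} cannot help with {name}")
        # This sketch always takes the first nameserver in the referral.
        server = referral[1][0]
    raise LookupError("too many referrals")


address, servers_asked = resolve("www.google.com.")
print(address, servers_asked)
```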
  94. DNS Reply • DNS message. • Headers & answer section.

    • Sent back over the wires. • Un-NAT. The DNS message we sent to the ISP's DNS server had two sections, headers and one question. The DNS server will now build a reply with headers and one answer section. After a journey over the ISP backbone again, the packet arrives at our local router. The router checks its NAT tables and sees this packet was translated on the way out. It reverses the translation and sends the packet up to our computer. It makes its way through our operating system until finally the reply data gets sent to our web browser.
  95. TCP We are ready to open a connection to the web

    server at www.google.com. While DNS used UDP inside IP, web traffic generally uses TCP.
  96. TCP • Reliable ACKs. • Three-way handshake. • Congestion control.

    • RFC 675, 793, ... The internet is a dangerous place for packets, and while most get to their destination there is always some percentage that get dropped along the way. This is most often due to network congestion or data corruption on one of the intervening routers. With UDP, there is no real recourse for this. When our computer sent that DNS query it started a timer in the background, and if the reply took too long it would assume the packet had been lost and send a new copy. TCP offers more reliable stream-oriented behaviors on top of the unreliable internet. Packets will be re-sent if missing and will never be delivered out of order.
  97. TCP Headers • Source Port • Destination Port • Sequence Number • Acknowledgement Number • Data Offset • Reserved • URG • ACK • PSH • RST • SYN • FIN • Window • Checksum • Urgent Pointer • Options • Padding

    Like UDP there is a packet header added to all TCP packets, though it is a bit more complex than the UDP headers.
  98. TCP Headers • Source Port • Destination Port • Sequence Number • Acknowledgement Number • Data Offset • Reserved • URG • ACK • PSH • RST • SYN • FIN • Window • Checksum • Urgent Pointer • Options • Padding

    Again we have the source and destination ports.
  99. TCP Headers • Source Port • Destination Port • Sequence Number • Acknowledgement Number • Data Offset • Reserved • URG • ACK • PSH • RST • SYN • FIN • Window • Checksum • Urgent Pointer • Options • Padding

    The sequence and acknowledgement numbers are used during a TCP connection to track the order of the data being sent and which data has been received correctly. The window size is the amount of data, in bytes, that can be sent before waiting for an ACK.
  100. TCP Headers • Source Port • Destination Port • Sequence Number • Acknowledgement Number • Data Offset • Reserved • URG • ACK • PSH • RST • SYN • FIN • Window • Checksum • Urgent Pointer • Options • Padding

    The checksum field is used to detect corruption of the TCP header and data during transmission. The data offset field tells the receiver where the header ends and the data begins, the urgent pointer is only meaningful when the URG flag is set, and options carry extensions negotiated for the connection.
  101. TCP Headers • Source Port • Destination Port • Sequence Number • Acknowledgement Number • Data Offset • Reserved • URG • ACK • PSH • RST • SYN • FIN • Window • Checksum • Urgent Pointer • Options • Padding

    And finally a set of six one-bit control fields. These are used to set the mode of the packet.
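    To tie the header fields together, here is a sketch that unpacks the fixed 20-byte part of a TCP header and names whichever of the six control bits are set. The example header values (ports, sequence number, window) are invented, and options are ignored.

```python
import struct

# Control bits from least significant to most significant.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]


def parse_tcp_header(data: bytes) -> dict:
    """Decode the fixed part of a TCP header (options are not parsed)."""
    (src_port, dst_port, seq, ack, offset_and_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", data[:20])
    data_offset = offset_and_flags >> 12  # header length in 32-bit words
    flags = [name for bit, name in enumerate(FLAG_NAMES) if offset_and_flags & (1 << bit)]
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_length": data_offset * 4,
        "flags": flags,
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }


# A made-up SYN from port 51000 to port 443 with no options.
raw = struct.pack("!HHIIHHHH", 51000, 443, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
print(parse_tcp_header(raw))
```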
  102. Three Way Handshake • SYN • SYN-ACK • ACK To

    establish a TCP connection we need to complete a three-packet exchange usually called a three way handshake. First we, the client, send a packet with the SYN, synchronize, flag and a random sequence number to the server. The server responds with a packet with the SYN and ACK, acknowledge, flags and its own random sequence number, as well as the client's sequence number plus one in the acknowledgement number field. We then send a third and final packet to the server with the ACK flag and the server's sequence number plus one in our acknowledgement number field. This ensures both sides know the other's starting sequence number and data packets can begin.
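    The number bookkeeping in the handshake is easier to see written out. This sketch only models the three packets as dictionaries; nothing is sent anywhere. In TCP each acknowledgement number is the other side's sequence number plus one, because the SYN itself counts as one unit of sequence space. The function name and dictionary layout are mine.

```python
import random

SEQUENCE_SPACE = 2 ** 32  # sequence numbers are 32 bits and wrap around


def three_way_handshake(rng=random):
    """Return the SYN, SYN-ACK, and ACK packets of a simulated handshake."""
    client_isn = rng.randrange(SEQUENCE_SPACE)
    server_isn = rng.randrange(SEQUENCE_SPACE)

    syn = {"flags": {"SYN"}, "seq": client_isn}
    syn_ack = {
        "flags": {"SYN", "ACK"},
        "seq": server_isn,
        "ack": (syn["seq"] + 1) % SEQUENCE_SPACE,      # acknowledges the client's SYN
    }
    ack = {
        "flags": {"ACK"},
        "seq": syn_ack["ack"],
        "ack": (syn_ack["seq"] + 1) % SEQUENCE_SPACE,  # acknowledges the server's SYN
    }
    return [syn, syn_ack, ack]


for packet in three_way_handshake():
    print(packet)
```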
  103. Acknowledgements • Send 1 ... 10. • ACK 10. •

    Send 11 ... 20. • ACK 15. • Send 15 ... 24. TCP achieves reliable, in-order delivery by using sequence numbers and acknowledgements. The window value dictates how much data one side can send the other before waiting for an acknowledgement. If a packet is lost, the sender can resend any required data to make up for it.
  104. Extras • Slow-start • Avoidance • Fast resend • Karn's

    • Nagle's • SACK • Scaling • CUBIC TCP has been retrofitted and upgraded with additional options and supplemental standards more than just about anything else in the history of the internet. Most of these are transparent improvements but if you want to learn about TCP as it exists in practice, you have a lot of reading to do.
  105. TLS With a TCP connection open, the browser can start

    to send data to www.google.com. If this was plain HTTP we could jump directly to requesting the web page, but most people these days will access websites over HTTPS. This means we need to create a TLS connection before moving on.
  106. TLS (aka SSL) • Stream encryption. • Mutual authentication. •

    RFC 5246. TLS means the same thing as SSL. SSL is just a name for an early version of TLS and was last seriously used in 1999. Just like TCP creates an abstraction for a reliable data stream, TLS creates an encrypted data stream. The data inside the stream doesn't have to know the details of how or why it is being encrypted. TLS also allows either side to identify themselves with a cryptographic proof, though most of the time only the server does this in the form of an HTTPS certificate.
  107. Crypto I'm not going to talk about the specifics of

    the cryptography involved in TLS. There are a ton of great talks about that and we would be here all day. I will just cover the packets and data involved in setting up a TLS connection and using it for HTTPS.
  108. TLS Handshake • ClientHello • ServerHello • ChangeCipherSpec • Finished

    Just like TCP has a handshake to set up a connection, so does TLS. We already have a TCP connection established at this point so TLS messages may not correspond directly to packets anymore, as TCP automatically slices up the data stream into packets as needed to maximize efficiency.
  109. Client Round 1 • ClientHello The first message in any

    TLS connection is a ClientHello. This message includes a per-connection random value to be used later, and a list of ciphers and TLS versions the client supports.
  110. Server Round 1 • ServerHello • Certificate • ServerKeyExchange •

    ServerHelloDone The server then responds to the ClientHello with a ServerHello. This again includes the server's random number, and picks which TLS version and cipher the connection will use. After this the server sends a Certificate message with a proof of its identity. Then it sends a ServerKeyExchange with configuration information for the connection. Finally it sends ServerHelloDone meaning it is finished with the first round of the handshake.
  111. Client Round 2 • ClientKeyExchange • ChangeCipherSpec • Finished The

    client then uses the server's key data from the ServerKeyExchange to send its own ClientKeyExchange message, negotiating any required information for the cipher. Then it sends a ChangeCipherSpec message to switch from the initial unencrypted mode to encrypted communications. It then sends a Finished message to signal it is done with the handshake. This also doubles as a failsafe check for the encryption negotiation: if something has gone wrong, the server will be unable to decrypt the Finished message.
  112. Server Round 2 • ChangeCipherSpec • Finished The server does

    the same thing as the client now, indicating all further messages will be encrypted and finishing the handshake.
  113. Application Data • Data wrapper. • Transparent. After this, the

    TLS connection goes effectively transparent, wrapping the HTTP data in TLS messages and encrypting them. On the other side the TLS system unwraps the data stream and presents it to the HTTP server.
  114. HTTP So a quick review, we have done a DNS

    query to find the IP address of www.google.com, we established a TCP connection, and then a TLS connection. We are now ready to send an HTTP request to get our webpage!
  115. HTTP • Request and response. • Verbs, paths, codes. •

    RFC 2616. HTTP is probably the piece of this journey you are all most familiar with. It is a text-based protocol with a request/response model, much like the DNS query and reply we saw before.
  116. HTTP Request • GET / HTTP/1.1 • Host: www.google.com An

    HTTP request has three parts. First is the request line. This has the verb (GET), the path (/), and the HTTP version (1.1). After that we have the headers section. The Host header is required for all HTTP 1.1 requests, but there are many other optional headers your browser might send. After that we have the request body, which for a GET request will be empty.
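    Since HTTP/1.1 is text, the three parts can be assembled with plain string handling. This is a sketch of the bytes that would be handed to the TLS layer, not a full client. The helper name is mine, and a real browser sends many more headers.

```python
def build_request(method: str, path: str, host: str, headers=None, body: bytes = b"") -> bytes:
    """Assemble the first line, headers, blank line, and body of an HTTP/1.1 request."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Each line ends with CRLF, and an empty line separates the headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


print(build_request("GET", "/", "www.google.com"))
```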
  117. Sending • TLS • TCP • IP • Ethernet So

    we have our connection open, and our HTTP request ready. Off it goes! Just as a reminder, our HTTP request data is going to be wrapped in TLS messages, which will be wrapped in TCP packets, inside IP packets, inside Ethernet frames. And then this whole pile of data will be shipped off to our local router where it will follow the same kind of journey as our other packets so far. This time rather than staying within our ISP's network, it will jump across multiple networks until it finally gets to Google's nearest servers.
  118. HTTP Response • HTTP/1.1 200 OK • Content-Length: 17914 •

    \r\n • <!doctype html><html ... Assuming everything so far has gone smoothly, we will eventually get an HTTP response back from Google's server. The first line is special again, including the response protocol version and the status code. After that we have a bunch of headers again, and then a single blank line followed by the body. HTTP is a very generic protocol, able to handle data like images and video files, but in this case the response will be HTML data.
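    Going the other direction, a response can be split back into its parts at the first blank line. This sketch assumes the whole response is already in memory and ignores things like chunked encoding. The sample response is shortened from the slide, and the function name is mine.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.1 response into version, status code, reason, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


sample = b"HTTP/1.1 200 OK\r\nContent-Length: 15\r\n\r\n<!doctype html>"
print(parse_response(sample))
```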
  119. After the first request, there will be a few others

    to load things like the Google logo but overall we've accomplished our goal! We managed to brave the internet, get our packets all the way from our computer to Google's servers, and in the end we are rewarded with the search box we all know and possibly love.
  120. ⌚ It is important to remember that this half hour

    discussion covers about 150 milliseconds of actual time. For all its failings the internet is remarkably fast, resilient, and useful. Even more so when you remember much of what we've seen was developed in the 70s and 80s.
  121. How does the Internet work? So now the final question,

    how does the internet work?
  122. Surprisingly well! Surprisingly well!

  123. Questions? @kantrn coderanger.net Thank you!