Lin u X u niL: 12/27/09

This is a perennial favorite in technical interviews: "so you type 'www.example.com' in your favorite web browser. In as much detail as you can, tell me what happens."
Let's assume that we do this on Linux (or another UNIX-like system). Here's what happens, in enough detail to make your eyes bleed.

Your web browser invokes the gethostbyname() function to turn the hostname you entered into an IP address. (We'll get back to how this happens, exactly, in a moment.)
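Your web browser determines which well-known port HTTP resides on. The mapping is listed in the /etc/services file (and returned by the getservbyname() function), where HTTP is port 80, though in practice most browsers simply default to 80 without looking it up.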
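The /etc/services lookup is easy to sketch. The following is only an illustration of the file format, not what any browser actually runs: the sample text and the lookup_port() helper are made up for this example, and a real program would more likely call socket.getservbyname("http", "tcp").

```python
# Sample lines in the /etc/services format: name, port/protocol, optional aliases.
# This text is illustrative; a real file is much longer.
SAMPLE_SERVICES = """\
# service-name  port/protocol  [aliases]
ftp             21/tcp
ssh             22/tcp
http            80/tcp          www     # WorldWideWeb HTTP
https           443/tcp
"""


def lookup_port(services_text, name, protocol="tcp"):
    """Return the port for a service name or alias, or None if it is not listed."""
    for line in services_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue  # skip malformed lines
        service, port_proto, *aliases = fields
        port, proto = port_proto.split("/", 1)
        if proto == protocol and name in (service, *aliases):
            return int(port)
    return None


print(lookup_port(SAMPLE_SERVICES, "http"))
```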
Two more pieces of information are determined by the kernel so that your browser can initiate a connection: your local IP address and an ephemeral port number. Combined with the destination (server) IP address and port number, these four pieces of information identify the connection, and are stored in what is called an Internet PCB, or protocol control block. The IANA defines the port range 49,152 through 65,535 for use as ephemeral ports, although Linux by default draws from its own range, which is set in /proc/sys/net/ipv4/ip_local_port_range. Exactly how a port number is chosen from the range depends upon the kernel version. The classic allocation algorithm is to simply remember the last-allocated number, and increment it by one each time a new PCB is requested. When 65,535 is reached, the algorithm wraps around to 49,152. (This has negative security implications, because it makes the next port number predictable. It is addressed in more detail in Port Randomization by Larsen and Gont, an Internet-Draft from 2007 that was later published as RFC 6056.) Also see TCP/IP Illustrated, Volume 2: The Implementation by Wright and Stevens, 1995.
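Here is a minimal sketch of the "remember the last port and add one" allocator described above, using the IANA range. The class and method names are mine, and a real kernel also has to check the full four-tuple rather than a simple set of busy ports.

```python
EPHEMERAL_LOW = 49152   # bottom of the IANA ephemeral range
EPHEMERAL_HIGH = 65535  # top of the IANA ephemeral range


class SequentialPortAllocator:
    """Hands out ports in increasing order, wrapping around at the top."""

    def __init__(self, low=EPHEMERAL_LOW, high=EPHEMERAL_HIGH):
        self.low = low
        self.high = high
        self.last = high      # so that the first allocation yields `low`
        self.in_use = set()

    def allocate(self):
        candidate = self.last
        for _ in range(self.high - self.low + 1):
            # Increment by one, looping back to the bottom of the range.
            candidate = self.low if candidate >= self.high else candidate + 1
            if candidate not in self.in_use:
                self.in_use.add(candidate)
                self.last = candidate
                return candidate
        raise RuntimeError("no free ephemeral ports")

    def release(self, port):
        self.in_use.discard(port)


allocator = SequentialPortAllocator()
print(allocator.allocate(), allocator.allocate())
```

The predictability is plain to see: anyone who observes one port number can guess the next, which is the weakness that port randomization is meant to remove.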
Your web browser sends an HTTP GET request to the remote server. Be careful here, as you must remember that your web browser does not speak TCP, nor does it speak IP. It only speaks HTTP. It doesn't care about the transport protocol that gets its HTTP GET request to the server, nor how the server gets its answer back to it.
The HTTP request passes down the four-layer model that TCP/IP uses, from the fourth layer, the application layer where your browser resides, to the third layer, the transport layer. The application layer is connectionless, with addressing based upon URLs, or uniform resource locators.
The transport layer encapsulates the HTTP request inside TCP (transmission control protocol; transport layer for transmission control, makes sense, right?). The resulting TCP segment is then passed down to the second layer, the network layer. The transport layer is connection-based, or persistent, with addressing based upon port numbers. TCP itself addresses only ports: it binds a specific port on the client side to a specific port on the server side, and leaves IP addressing to the layer below.
The network layer uses IP (Internet protocol), and adds an IP header to the TCP segment. The packet is then passed down to the first layer, the link layer. The network layer is connectionless, or best-effort, with addressing based upon 32-bit IP addresses (for IPv4). Routing, but not switching, occurs at this layer.
The link layer uses the Ethernet protocol. This is a connectionless layer, with addressing based upon 48-bit Ethernet addresses. Switching occurs at this layer.
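To make the layering concrete, here is a rough sketch that wraps an HTTP request in simplified TCP, IPv4, and Ethernet headers. It is an illustration only: checksums, sequence numbers, and options are left at zero, the source MAC address and the 192.0.2.10 server address (from a documentation-only range) are made up, and a real kernel fills in far more than this.

```python
import socket
import struct


def mac_bytes(mac):
    """Convert a colon-separated MAC string such as 8:0:20:4:3f:2a to 6 bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def tcp_segment(src_port, dst_port, payload):
    # 20-byte TCP header: ports, seq, ack, offset/flags, window, checksum, urgent.
    offset_flags = (5 << 12) | 0x018  # header length of 5 words; PSH and ACK flags
    header = struct.pack("!HHIIHHHH", src_port, dst_port, 0, 0,
                         offset_flags, 65535, 0, 0)
    return header + payload


def ip_packet(src_ip, dst_ip, segment):
    # 20-byte IPv4 header; the checksum is left at 0 in this sketch.
    header = struct.pack("!BBHHHBBH4s4s", (4 << 4) | 5, 0, 20 + len(segment),
                         0, 0, 64, socket.IPPROTO_TCP, 0,
                         socket.inet_aton(src_ip), socket.inet_aton(dst_ip))
    return header + segment


def ethernet_frame(src_mac, dst_mac, packet):
    # 14-byte Ethernet header: destination, source, EtherType 0x0800 (IPv4).
    return mac_bytes(dst_mac) + mac_bytes(src_mac) + struct.pack("!H", 0x0800) + packet


request = b"GET / HTTP/1.1\r\nHost: www.example.com\r\n\r\n"
segment = tcp_segment(49152, 80, request)                      # transport layer
packet = ip_packet("1.2.3.4", "192.0.2.10", segment)           # network layer
frame = ethernet_frame("8:0:20:1:2:3", "8:0:20:4:3f:2a", packet)  # link layer
print(len(request), len(segment), len(packet), len(frame))
```

Each layer only prepends its own header and treats everything handed down from above as opaque payload.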
The kernel must determine which interface to send the packet out over, and which next hop to hand it to. This happens by taking the destination IP address and consulting the routing table (seen by running netstat -rn). First, the kernel attempts to match the destination by host address. (For example, if you have a specific route to just the one host you're trying to reach in your browser.) If this fails, then network address matching is tried. (For example, if you have a specific route to the network in which the host you're trying to reach resides.) Lastly, the kernel falls back to the default route entry. This is the most common case.
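The host, then network, then default order is the same thing as picking the most specific matching prefix. A minimal sketch, with a made-up routing table (only the 1.2.3.5 default gateway comes from this article; the other gateways and the 192.0.2.0/24 network are hypothetical):

```python
import ipaddress

# (destination prefix, next-hop gateway); entries other than the default are hypothetical.
ROUTES = [
    ("192.0.2.10/32", "1.2.3.6"),  # host route
    ("192.0.2.0/24", "1.2.3.7"),   # network route
    ("0.0.0.0/0", "1.2.3.5"),      # default route
]


def next_hop(destination, routes=ROUTES):
    """Return the gateway of the most specific route matching the destination."""
    dest = ipaddress.ip_address(destination)
    best = None
    for prefix, gateway in routes:
        network = ipaddress.ip_network(prefix)
        # A longer prefix is a more specific match: /32 beats /24 beats /0.
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, gateway)
    return best[1] if best else None


print(next_hop("192.0.2.10"), next_hop("192.0.2.99"), next_hop("198.51.100.1"))
```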
Now that the kernel knows the next hop, that is, the node that the packet should be handed off to, the kernel must deliver the packet to it over the physical link. Routing depends upon each node in the chain having a direct link to the next node; it doesn't matter how many nodes (or hops) the packet must pass through so long as each and every one can "see" its neighbor. This is handled on the link layer, which if you'll recall uses a different addressing scheme than IP addresses. This is where ARP, or the address resolution protocol, comes into play. Let's say your machine is 1.2.3.4, and the default gateway is 1.2.3.5. The kernel will send an ARP broadcast which says, "Who has 1.2.3.5? Tell 1.2.3.4." The default gateway machine will see the ARP request and reply, saying "Hey 1.2.3.4, 1.2.3.5 is at 8:0:20:4:3f:2a." The kernel places the answer in the ARP cache, which can be viewed by running arp -a. Now that this information is known, the kernel adds an Ethernet header to the packet, and places it on the wire.
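For the curious, this is roughly what the "Who has 1.2.3.5? Tell 1.2.3.4" broadcast looks like as bytes. It is a sketch that only builds the frame and does not send it; the sender's MAC address is made up for the example.

```python
import socket
import struct


def mac_to_bytes(mac):
    """Convert a colon-separated MAC string to its 6-byte form."""
    return bytes(int(part, 16) for part in mac.split(":"))


def build_arp_request(sender_mac, sender_ip, target_ip):
    """Build an Ethernet frame carrying an ARP request for target_ip."""
    broadcast = b"\xff" * 6  # every host on the segment receives it
    ethernet_header = broadcast + mac_to_bytes(sender_mac) + struct.pack("!H", 0x0806)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address length
        4,                            # protocol address length
        1,                            # opcode 1: request (2 would be a reply)
        mac_to_bytes(sender_mac),
        socket.inet_aton(sender_ip),
        b"\x00" * 6,                  # target MAC: unknown, which is the whole point
        socket.inet_aton(target_ip),
    )
    return ethernet_header + arp_payload


arp_frame = build_arp_request("8:0:20:1:2:3", "1.2.3.4", "1.2.3.5")
print(len(arp_frame), arp_frame.hex())
```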
The default gateway receives the packet. First, it checks to see if the destination Ethernet address matches its own. If it does not, the packet is silently discarded (unless the interface is in promiscuous mode). Next, it checks to see if the destination IP address matches any of its configured interfaces. In our scenario here, it does not: remember that the packet is being routed to another destination by way of this gateway. So the gateway now checks to see if it is configured to permit IP forwarding. If it is not, the packet is silently discarded. We'll assume the gateway is configured to forward IP, so now it must determine what to do with the packet. It consults its routing table, and attempts to match the destination in the same way our web browser system did a moment ago: exact host match first, then network, then default route. Yes, a default gateway can itself have a default gateway. It also uses ARP in the same way as we saw a moment ago in order to reach the next hop, and pass the packet on to it. Before doing so, however, it decrements the TTL (time-to-live) field in the IP header, and if the result is 0, discards the packet and sends an ICMP time exceeded (TTL expired in transit) message back to the sender. Each hop along the way does the same thing. Also, if the packet came in on the same interface that the gateway's routing table says the packet should go out over to reach the next hop, an ICMP redirect message is sent to the sender, instructing it to bypass this gateway and contact the next hop directly for all subsequent packets. You may see evidence of this as a new route in your web browser machine's routing table.
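The gateway's decision can be boiled down to a few lines. This sketch assumes the usual behavior in which the packet is discarded once the decremented TTL reaches 0; the function name and the returned labels are mine.

```python
def forward_decision(ttl, ip_forwarding_enabled=True):
    """Return (action, new_ttl) for a packet that is not addressed to this router."""
    if not ip_forwarding_enabled:
        return ("drop", ttl)             # silently discarded
    ttl -= 1                             # every hop decrements the TTL by one
    if ttl <= 0:
        return ("icmp_time_exceeded", 0)  # discard and notify the sender
    return ("forward", ttl)


# A packet sent with a TTL of 3 survives two routers and dies at the third.
ttl = 3
for hop in (1, 2, 3):
    action, ttl = forward_decision(ttl)
    print(hop, action, ttl)
```

This is also how traceroute works: it sends packets with a TTL of 1, 2, 3, and so on, and collects the time exceeded messages that come back from each hop.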
Each hop passes the packet along, until at the destination the last router notices that it has a direct route to the destination, that is, a routing table entry is matched that is not another router. The packet is then delivered to the destination server.
The destination server notices that at long last the destination IP address in the packet is its own, that is, it matches an address configured on one of the server's interfaces. Since it's not a forwarding case, and since the IP address matches, it now examines the TCP portion of the packet to determine the destination port. It also looks at the TCP header flags, and since this is the first packet, observes that only the SYN (synchronize) flag is set. Thus, this first packet is one of three in the TCP handshake process. If the port the packet is addressed to (in our case, port 80) is not bound by a process (for example, if Apache crashed) then a TCP reply with the RST (reset) flag set is sent to the sender and the connection attempt is refused. (An ICMP port unreachable message is the equivalent for UDP.) If the port is valid, and we'll assume it is, a TCP reply is sent, with both the SYN and ACK (acknowledge) flags set.
The packet passes back through the various routers, and unless source routing is specified, the path back may differ from the path used to first reach the server. The client (the machine running your web browser) receives the packet, notices that it has the SYN and ACK flags set, and contains IP and port information that matches a known PCB. It replies with a TCP packet that has only the ACK flag set.
This packet reaches the server, and the server moves the connection from SYN_RECEIVED to ESTABLISHED. Using the mechanisms of TCP (sequence numbers, acknowledgments, and retransmission), both sides now get reliable, in-order data delivery until such time as the connection times out, or is closed by either side. This differs sharply from UDP, where there is no handshake process and packet delivery is not guaranteed; it is only best-effort, and it is left up to the application to figure out whether the packets got there or not.
Now that we have a live TCP connection, the HTTP request that started all of this may be sent over the connection to the server for processing. If the HTTP server or client does not support persistent connections, the reply consists of only a single object (usually the HTML page) and the connection is then closed. If persistence is enabled, then the connection is left open for subsequent HTTP requests (for example, all of the page elements, such as images, style sheets, etc.).
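The three packets of the handshake can be traced with a toy state machine. This is a sketch using the state names from the TCP specification (SYN_SENT, SYN_RECEIVED, ESTABLISHED); it ignores sequence numbers, timeouts, and every state that is not part of a normal connection setup.

```python
# (current state, flags received) -> (next state, flags to send in reply)
SERVER = {
    ("LISTEN", "SYN"): ("SYN_RECEIVED", "SYN+ACK"),
    ("SYN_RECEIVED", "ACK"): ("ESTABLISHED", None),
}
CLIENT = {
    ("SYN_SENT", "SYN+ACK"): ("ESTABLISHED", "ACK"),
}


def simulate_handshake():
    """Walk through the three-way handshake and return both final states."""
    client_state, server_state = "SYN_SENT", "LISTEN"  # client has just sent its SYN
    in_flight, to_server = "SYN", True
    trace = ["client -> server: SYN"]
    while in_flight is not None:
        if to_server:
            server_state, reply = SERVER[(server_state, in_flight)]
            if reply:
                trace.append(f"server -> client: {reply}")
        else:
            client_state, reply = CLIENT[(client_state, in_flight)]
            if reply:
                trace.append(f"client -> server: {reply}")
        in_flight, to_server = reply, not to_server
    return client_state, server_state, trace


client_state, server_state, trace = simulate_handshake()
print("\n".join(trace))
print(client_state, server_state)
```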
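What actually travels over the connection is plain text. Below is a minimal sketch of building the GET request, with the Connection header standing in for the persistence choice described above. The helper name is mine, and real browsers send many more headers than this.

```python
def build_get_request(host, path="/", keep_alive=True):
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    connection = "keep-alive" if keep_alive else "close"
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"Connection: {connection}",
        "",   # blank line that ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


print(build_get_request("www.example.com").decode("ascii"))
```

With keep_alive=False the server closes the connection after the first reply, and every image or style sheet costs another three-way handshake.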

Okay, as I mentioned earlier, we will now address how the client resolves the hostname into an IP address using DNS. All of the above ARP and IP information holds true for the DNS query and replies.

The gethostbyname() function must first determine how it should go about turning a hostname into an IP address. To accomplish this, it consults the /etc/nsswitch.conf file, and looks for a line beginning with hosts:. It then examines the keywords listed, and tries each of them in the order given. For the purposes of this example, we'll assume the pattern to be files dns nis.
The keyword files instructs the resolver library (this all happens in the C library, not in the kernel) to consult the /etc/hosts file. Since the web server we're trying to reach doesn't have an entry there, the match attempt fails. The resolver checks to see if another resolution method exists, and if it does, it tries it.
The next method is dns, so the resolver now consults the /etc/resolv.conf file to determine what DNS server, or name server, it should contact.
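The "try each source in order" logic is simple enough to sketch. This is not the C library's implementation: the sample file contents are made up, bracketed action items such as [NOTFOUND=return] are not handled, nis is not modeled, and the dns step is a stand-in function that returns an address from a documentation-only range.

```python
SAMPLE_NSSWITCH = "passwd: files\nhosts: files dns nis\n"
SAMPLE_HOSTS = "127.0.0.1 localhost\n1.2.3.5 gateway gw  # default gateway\n"


def hosts_sources(nsswitch_text):
    """Return the lookup sources listed on the hosts: line, in order."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line.startswith("hosts:"):
            return line[len("hosts:"):].split()
    return []


def lookup_in_hosts(hosts_text, name):
    """Return the address for a name or alias in /etc/hosts format, or None."""
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        if name in names:
            return address
    return None


def resolve(name, nsswitch_text, hosts_text, dns_lookup):
    """Try each configured source in turn; return (address, source) or (None, None)."""
    for source in hosts_sources(nsswitch_text):
        if source == "files":
            address = lookup_in_hosts(hosts_text, name)
        elif source == "dns":
            address = dns_lookup(name)
        else:
            address = None  # for example nis, which this sketch does not model
        if address is not None:
            return address, source
    return None, None


def fake_dns(name):
    # Stand-in for a real DNS query; the address is hypothetical.
    return "192.0.2.10" if name == "www.example.com" else None


print(resolve("gateway", SAMPLE_NSSWITCH, SAMPLE_HOSTS, fake_dns))
print(resolve("www.example.com", SAMPLE_NSSWITCH, SAMPLE_HOSTS, fake_dns))
```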
A UDP request is sent to the first-listed name server, addressed to port 53.
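The request itself is a small binary message. Here is a sketch that builds (but does not send) a standard query for an A record; the query ID is an arbitrary value chosen for the example, and a real resolver would pick it at random.

```python
import struct


def build_dns_query(hostname, query_id=0x1234, recursion_desired=True):
    """Build a DNS query message asking for the A record of hostname."""
    flags = 0x0100 if recursion_desired else 0x0000  # RD bit: please recurse for me
    # Header: ID, flags, 1 question, 0 answer, 0 authority, 0 additional records.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels, ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE 1 = A, QCLASS 1 = IN
    return header + question


query = build_dns_query("www.example.com")
print(len(query), query.hex())
```

To actually send it, the bytes would go out through a UDP socket's sendto() call to port 53 of a name server listed in /etc/resolv.conf.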
The DNS server receives the request. It examines it to determine if it is authoritative for the requested domain; that is, does it directly serve answers for the domain? If not, then it checks to see if recursion is permitted for the client that sent the request.
If recursion is permitted, then the DNS server consults its hints file (often called named.ca) for the appropriate root DNS server to talk to. It then sends a DNS request to the root server. The root server does not know the answer itself, but replies with a referral to the servers for the top-level domain (.com in our case), which in turn reply with the name of the authoritative DNS server for the domain. This is the server that is listed when you perform a whois on the domain name.
The DNS server now contacts the authoritative DNS server, and asks for the IP address of the given hostname. The answer is cached, and then returned to the client.
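If recursion is not supported, then the DNS server simply replies with go away or go talk to a root server. The client is then responsible for carrying on. (A typical stub resolver will just try the next name server listed in /etc/resolv.conf; a client that does its own iteration proceeds as follows.)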
The client receives the negative response, and sends the same DNS request to a root DNS server.
The root DNS server receives the query, and since it is not configured to support recursion, but is a root server, it responds with "go ask so-and-so, that's the server for that top-level domain." That server in turn responds with "go ask so-and-so, that's the authoritative server for that domain." Note that neither is the final answer, but both are a definite improvement over a simple go away.
The client now knows who to ask, and sends the original DNS query once more, this time to the authoritative server. It replies with the IP address, and the lookup process is complete.
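Following referrals by hand can be simulated with a toy set of servers. Everything in the table below is hypothetical (the server names, the referral data, and the documentation-range address), and the sketch includes the intermediate .com server that a root referral normally points to before the authoritative server is reached.

```python
# Toy data: each server either refers a suffix to another server or answers directly.
SERVERS = {
    "root": {"referral": {"com": "tld-com"}},
    "tld-com": {"referral": {"example.com": "ns.example.com"}},
    "ns.example.com": {"answers": {"www.example.com": "192.0.2.10"}},
}


def resolve_iteratively(hostname, start="root", max_steps=10):
    """Follow referrals from server to server; return (address, servers asked)."""
    server = start
    path = [server]
    for _ in range(max_steps):
        zone = SERVERS[server]
        if hostname in zone.get("answers", {}):
            return zone["answers"][hostname], path  # final answer
        for suffix, next_server in zone.get("referral", {}).items():
            if hostname == suffix or hostname.endswith("." + suffix):
                server = next_server                # "go ask so-and-so"
                path.append(server)
                break
        else:
            return None, path                       # nobody to refer us to
    return None, path


print(resolve_iteratively("www.example.com"))
```

The same query is repeated at every step; only the server being asked changes.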

A few notes on things that I didn't want to clutter up the above narrative with:

When a network interface is first brought online, be it during boot or manually by the administrator, something called a gratuitous ARP request is broadcast. It literally asks, "who has 1.2.3.4? Tell 1.2.3.4." This looks redundant at first glance, but it actually serves a dual purpose: it allows neighboring machines to cache the new IP to Ethernet address mapping, and if another machine already has that IP address, it will reply with a typical ARP response of "Hey 1.2.3.4, 1.2.3.4 is 8:0:20:4:3f:2a." The first machine will then log an error message to the console saying "IP address 1.2.3.4 is already in use by 8:0:20:4:3f:2a." This is done to communicate to you that your Excel spreadsheet of IP addresses is wrong and should be replaced with something a bit more accurate and reliable.
The Ethernet layer contains a lot more complexities than I detailed above. In particular, because only one machine can be talking over a shared wire at a time (literally due to electrical limitations) there are various mechanisms in place to deal with collisions. The most widely used is called CSMA/CD, or Carrier Sense Multiple Access with Collision Detection, where each network card is responsible for transmitting only when it senses no carrier on the wire, that is, when nobody else is talking. Also, should two cards start transmitting at nearly the same instant, the transmitting cards are responsible for detecting the collision and sending a jam signal so that everyone on the wire notices it. They then must stop transmitting and wait a random time interval before trying again. This is the main reason for network segmentation; the more hosts you have on a single wire, the more collisions you'll get; and the more collisions you get, the slower the overall network becomes. (On switched, full-duplex links, each host has the wire to itself and collisions no longer occur.)
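The "wait a random time interval" part is usually done with truncated binary exponential backoff: after each successive collision the range of possible waits doubles, up to a cap. A small sketch follows, with the limits (cap the exponent at 10, give up at 16 attempts) as I understand them from classic Ethernet; the function name is mine.

```python
import random

MAX_ATTEMPTS = 16    # after this many collisions the frame is dropped
BACKOFF_LIMIT = 10   # the waiting range stops doubling after 10 collisions


def backoff_slots(collision_count, rng=random):
    """Return how many slot times to wait after the nth collision on one frame."""
    if collision_count < 1:
        raise ValueError("collision_count starts at 1")
    if collision_count >= MAX_ATTEMPTS:
        raise RuntimeError("excessive collisions: frame dropped")
    exponent = min(collision_count, BACKOFF_LIMIT)
    # Pick uniformly from 0 .. 2**exponent - 1 slot times.
    return rng.randrange(2 ** exponent)


for n in (1, 2, 3, 10):
    print(n, backoff_slots(n))
```

Because two colliding cards pick their waits independently, they are unlikely to collide again on the retry, and the doubling range spreads them out further when the wire is busy.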

Monday, December 28, 2009

What happens when you browse to a web site