Internet Protocols at a Glance


There is no way that we could hope to cover the Internet protocols completely in a book about firewall diagnostics. It's a huge subject and has been covered in fantastic detail by the late, great W. Richard Stevens in TCP/IP Illustrated, Volume 1: The Protocols and TCP/IP Illustrated, Volume 2: The Implementation (the latter coauthored with Gary R. Wright). A basic understanding of TCP/IP and how it fits with the concept of firewalling, however, is within the scope of this book. If you want to know more about TCP/IP, please check out the fine books mentioned here; if you already have a firm grasp of the subject, you can skip over some of this material. We provide it here as a brief introduction for the reader.

Understanding the Internet Protocol (IP)

IP is a Layer 3 (Network layer) protocol and is documented fully in RFC 791. IP traffic contains routing and addressing information and is the medium by which packets traverse the Internet; hence the rather obvious name, "Internet Protocol." This, combined with the Transmission Control Protocol (TCP), forms the backbone of how most services on the Internet function. Aside from IP's routing duties, this is also the layer where fragmentation and reassembly of packets (a packet being a single unit of IP, TCP, ICMP, or UDP data) occur when traffic moves through devices that dictate the Maximum Transmission Unit (MTU) size.

Briefly, the MTU defines how large a packet a particular device can handle. With fast connections, this value isn't nearly as important because it's almost always set to the maximum size possible. MTU is mostly an issue with slower connections, such as modems, where the maximum packet size is much smaller. MTUs become important when packets are lost and have to be retransmitted: if a packet is very large and only a tiny portion of it was actually lost, the entire packet must still be retransmitted. The idea behind smaller MTUs is to avoid resending too much data.
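
If you suspect an MTU problem along a path, one quick way to probe it from a Linux host is to send pings with the Don't Fragment bit set and step the packet size down until they get through. This is only a sketch; the address below is a placeholder, and the -M option belongs to the Linux (iputils) version of ping.

    # 1472 bytes of ICMP payload + 28 bytes of ICMP/IP headers = 1500 bytes on the wire
    ping -c 3 -M do -s 1472 192.0.2.1

    # No replies, or "Frag needed" errors? Back the payload size off until the pings
    # succeed; that payload size plus 28 is your working path MTU.
    ping -c 3 -M do -s 1464 192.0.2.1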

Figure 5.2. The Internet Protocol Packet.


What an IP Packet Looks Like

There are 15 fields in an IP packet (14 header fields plus the data); a short example of inspecting them with a sniffer follows the list. They are (from left to right):

  1. Version: The IP version of the packet, currently 4 (IPv4) or 6 (IPv6).

  2. IP Header Length (IHL): The datagram header length, in 32-bit words.

  3. Type-of-Service (TOS): Specifies how an upper layer protocol (Layer 4+) should handle the packet and the importance of the packet.

  4. Total Length: The length of the IP packet in bytes.

  5. Identification: An integer that identifies the datagram, used for fragment reassembly.

  6. Flags: A 3-bit field that carries fragmentation information. One bit is reserved and unused, the Don't Fragment (DF) bit specifies whether the packet may be fragmented, and the More Fragments (MF) bit indicates whether additional fragments follow (it is cleared on the last fragment).

  7. Fragment Offset: Specifies what position the fragment is in relation to the beginning of the datagram.

  8. Time-to-Live (TTL): A counter that is decremented each time the packet passes through an IP device (typically a router). When the counter reaches zero, the packet is discarded. The TTL counter prevents IP traffic from looping endlessly.

  9. Protocol: This designates what upper layer protocol this packet is destined to after the IP layer routing information has been processed.

  10. Header Checksum: This is used to ensure that the IP packet has not been damaged in transmission.

  11. Source Address: Referred to as SRC in many netfilter/iptables logging rules and sniffers like tcpdump and ethereal. This is the address field in the packet that designates what system sent the packet.

  12. Destination Address: Referred to as DST in many netfilter/iptables logging rules and sniffers like tcpdump and ethereal. This is the address field in the packet that designates where the packet is going.

  13. Options: Allows IP to support options like IPSEC.

  14. Padding: Extra zero bits added, if needed, so that the header ends on a 32-bit boundary.

  15. Data: Contains the upper layer information, like a TCP or UDP packet.
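
Most of these fields can be inspected directly with a packet sniffer, which is often the fastest way to see what a firewall is actually acting on. As a rough sketch (the interface name and host address are placeholders), tcpdump in verbose mode prints most of the IP header for each packet:

    # -n: don't resolve names; -v: print IP header details (TOS, TTL, ID, offset, flags, protocol, length)
    tcpdump -n -v -i eth0 host 192.0.2.10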

A word of caution: all of the headers in an IP packet can be manipulated, more or less, by any party between two points. This means that you shouldn't blindly trust the fields in an IP packet. You can attempt to encrypt the traffic between two points, but if those two points are using IP, the headers on those packets can still be changed by an attacker. The bottom line is not to trust the headers of a packet by themselves.

Understanding ICMP

The Internet Control Message Protocol (ICMP) is a quasi Layer 3 protocol (Network), documented by RFC 792 and RFC 1700. ICMP is used to pass IP packet error and processing information between IP devices. We call this information an "ICMP Message." A common ICMP packet is the "ping" packet, but there are many others. Some of them are not necessary anymore, and some are really critical to a healthy network.

ICMP packets are somewhat unusual in the world of TCP/IP. Because ICMP cannot live at Layer 3 all by itself, it is actually carried inside an IP packet, much like the Layer 4 TCP and UDP packets documented in the following sections. Also unusual is that the data portion of an ICMP packet does not contain the "ICMP Message." Rather, the message is carried in the Type and Code fields, as documented in Table 5.1.

Table 5.1. Referenced from http://www.onlamp.com/pub/a/bsd/2001/04/04/FreeBSD_Basics.html?page=1

Type    Name                                   Code                                                                        Note
0       Echo Reply                             0 - None                                                                    Ping reply
1       Unused
2       Unused
3       Destination Unreachable                0 - Net unreachable
3       Destination Unreachable                1 - Host unreachable
3       Destination Unreachable                2 - Protocol unreachable
3       Destination Unreachable                3 - Port unreachable
3       Destination Unreachable                4 - Fragmentation needed and DF bit set
3       Destination Unreachable                5 - Source route failed
3       Destination Unreachable                6 - Destination network unknown
3       Destination Unreachable                7 - Destination host unknown
3       Destination Unreachable                8 - Source host isolated
3       Destination Unreachable                9 - Communication with destination network is administratively prohibited
3       Destination Unreachable                10 - Communication with destination host is administratively prohibited
3       Destination Unreachable                11 - Destination network unreachable for TOS
3       Destination Unreachable                12 - Destination host unreachable for TOS
4       Source Quench
5       Redirect                               0 - Redirect datagram for the network
5       Redirect                               1 - Redirect datagram for the host
5       Redirect                               2 - Redirect datagram for the TOS and network
5       Redirect                               3 - Redirect datagram for the TOS and host
6       Alternate Host Address                 0 - Alternate address for host
7       Unassigned
8       Echo                                   0 - None                                                                    Ping packet
9       Router Advertisement                   0 - None
10      Router Selection                       0 - None
11      Time Exceeded                          0 - Time to live exceeded in transit
11      Time Exceeded                          1 - Fragment reassembly time exceeded
12      Parameter Problem                      0 - Pointer indicates the error
12      Parameter Problem                      1 - Missing a required option
12      Parameter Problem                      2 - Bad length
13      Timestamp                              0 - None
14      Timestamp Reply                        0 - None
15      Information Request                    0 - None
16      Information Reply                      0 - None
17      Address Mask Request                   0 - None
18      Address Mask Reply                     0 - None
19      Reserved (for security)
20-29   Reserved (for robustness experiment)
30      Traceroute
31      Datagram Conversion Error
32      Mobile Host Redirect
33      IPv6 Where-Are-You
34      IPv6 I-Am-Here
35      Mobile Registration Request
36      Mobile Registration Reply
37-255  Reserved


Figure 5.3. The ICMP Packet.


What an ICMP Message Looks Like

The two fields of real importance in ICMP are the Type and Code fields. There are technically 256 possible ICMP message types, although presently only about 34 are in use, and several of those types have multiple sub-types, or codes. To put it succinctly, there are many ICMP messages out there. As stated previously, only some of them are necessary to keep your network healthy; the others you can, and probably should, learn to live without.
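
To make that concrete in firewall terms, a common approach with netfilter/iptables is to permit only the handful of ICMP messages a healthy network actually needs and drop the rest. The rules below are only a sketch of that idea, not a drop-in policy; which types you keep is a judgment call for your own network.

    # Allow pings and their replies (types 8 and 0)
    iptables -A INPUT -p icmp --icmp-type echo-request -j ACCEPT
    iptables -A INPUT -p icmp --icmp-type echo-reply -j ACCEPT

    # Type 3 (Destination Unreachable) includes code 4, "fragmentation needed,"
    # which Path MTU discovery depends on; blocking it can break large transfers.
    iptables -A INPUT -p icmp --icmp-type destination-unreachable -j ACCEPT

    # Type 11 (Time Exceeded) is what makes traceroute work.
    iptables -A INPUT -p icmp --icmp-type time-exceeded -j ACCEPT

    # Log and drop the rest.
    iptables -A INPUT -p icmp -j LOG --log-prefix "ICMP drop: "
    iptables -A INPUT -p icmp -j DROP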

Understanding TCP

TCP stands for the Transmission Control Protocol and sits at Layer 4 (the Transport layer) of the OSI model. It is a connection-oriented service and handles the transfer of data, flow control, reliability, and multiplexing all in one protocol. TCP is a very robust protocol and can recover automatically from most error conditions. TCP is a good fit for higher-level protocols, such as HTTP and SMTP, which do not have built-in error recovery and flow control capabilities. It is not ideal, however, for protocols that handle those functions internally, such as VPN protocols that tunnel TCP connections inside themselves. With VPNs, UDP is usually the better protocol to use. We discuss this in more detail later in this chapter when we cover UDP.

There are 13 fields in a TCP packet; a short firewall-oriented example follows the list. They are (from left to right):

  1. Source Port: The port number the packet was sent from on the source machine.

  2. Destination Port: The port number the packet is being sent to on the remote machine.

  3. Sequence Number: Either the initial sequence number to be used in the upcoming transmission or the number assigned to the first byte of data in the current segment.

  4. Acknowledgment Number: The sequence number of the next byte of data the sender of the acknowledgment expects to receive.

  5. Data Offset: The number of 32-bit words in the TCP header.

  6. Reserved: Set aside for future use.

  7. Flags: Control information for the packet. SYN, ACK, FIN, RST, URG, and PSH flags are part of this field.

  8. Window: Size of the sender's buffer space available for incoming data.

  9. Checksum: A checksum computed over the TCP header and data (plus a pseudo-header), used to determine whether the segment was damaged in transmission.

  10. Urgent Pointer: Location of the first urgent data byte in the packet.

  11. Options: TCP options.

  12. Padding: The padding field is used when alignment is needed or sometimes when used with encryption, such as IPSEC.

  13. Data: The actual message or content being sent by the TCP packet.
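
Several of these fields, particularly the ports and the flags, are exactly what netfilter/iptables rules match on. As a rough sketch (the port and chain are placeholders), the following shows how the flags field distinguishes new connections from established ones:

    # --syn is shorthand for "--tcp-flags SYN,RST,ACK SYN", that is, a packet opening a new connection
    iptables -A INPUT -p tcp --dport 80 --syn -j ACCEPT

    # The long form: examine the SYN, ACK, FIN, and RST bits and log packets with only SYN set
    iptables -A INPUT -p tcp --tcp-flags SYN,ACK,FIN,RST SYN -j LOG --log-prefix "NEW TCP: "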

TCP provides many additional capabilities, such as reliability, efficient and varied methods of flow control (Linux has the capability to modify this further through /proc as detailed in later chapters), full-duplex communication, multiplexing, and streaming.

With streaming, TCP can deliver bytes in an unstructured form identified by sequence numbers. This is used when an application does not or cannot break data into blocks that fit efficiently on the network; TCP works this out for the application. In these cases, TCP groups the data into what are called sequences, with maximum sizes determined internally based on network conditions and the way the system is configured.

Reliability

TCP provides reliability through its full-duplex connections. Unlike UDP, TCP packets are always acknowledged by the receiver, or they are sent again until they are acknowledged. TCP accomplishes this with what is called "forward acknowledgment": the acknowledgment number carries the sequence number of the next byte of data the sender expects the recipient to acknowledge. The sender also tracks a timer for each block of data, and if the block is not acknowledged within that period of time, the packet is sent to the receiver again. This makes it possible for TCP to recover reliably from lost or damaged packets. The receiver may also request that a packet be resent.

Full Duplex and Multiplexing

TCP is truly full duplex, which means the protocol can send and receive data over the same connection at the same time. TCP also performs multiplexing, which means that many simultaneous connections and conversations, belonging to different applications and higher-level protocols such as SMTP and HTTP, can share the same hosts and network path, distinguished by their port numbers.

Flow Control

Flow control is another feature of TCP. Flow control is a technique used to stop the sender from transmitting more data than the receiver can accept. This is achieved through the use of sequence numbers: the receiver advertises the highest sequence number it can accept without exceeding its receive buffers. The sender transmits packets up to, but not beyond, that sequence number, and then waits until the receiver sends another ACK advertising a higher one. There are a number of algorithms that implement this, many of which are configurable through /proc under Linux, as sketched below and detailed further in later chapters.
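
As a small taste of that tunability (the exact files and sensible values vary by kernel version, so treat these strictly as examples), the send and receive buffer limits and window scaling behavior live under /proc/sys/net/ipv4:

    # Minimum, default, and maximum receive and send buffer sizes, in bytes
    cat /proc/sys/net/ipv4/tcp_rmem
    cat /proc/sys/net/ipv4/tcp_wmem

    # Window scaling allows receive windows larger than 64KB on fast links
    sysctl -w net.ipv4.tcp_window_scaling=1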

Congestion Control

TCP also provides what is called congestion control by adapting to network conditions, slowing down or speeding up the rate at which packets are sent. This is a response to the "common" reality of IP networks, where guaranteed flow rates are difficult to achieve and packets are lost due to overloaded routers, switches, hubs, hosts, and any other device in the path.

How TCP Connections Are Established

TCP, unlike UDP, is designed to operate via fully established connections. It is not a broadcast protocol, but a true three-way communications protocol. Nothing is assumed with TCP until both the receiver and sender acknowledge the communication. To do this, TCP uses a three-way handshake to establish a new connection and to synchronize the hosts on both ends to each other's capability to receive and send data. If you recall the sequence numbers discussed earlier, this is how TCP accomplishes flow control and readies both sides to send and receive data at that established rate. This helps to prevent unnecessary packet retransmissions.

As each host starts the process, it is supposed to randomly pick a sequence number. It's worth noting here that not all random number generators are the same. Some OSs, including some Linux kernels, may not pick sequence numbers that are truly random. This may seem unimportant, but there are a number of attacks on TCP streams that are accomplished by predicting the next sequence number and spoofing the next packet in a stream to "hijack" the stream. A good random number generator for your sequence numbers helps to protect against these sorts of attacks and also makes certain types of DoS and DDoS bounce attacks more difficult against your systems. Thankfully, there are patches for the Linux kernel that make the sequence numbers and other random numbers used by the IP stack truly random.

After the hosts have picked random numbers to use as their initial sequence numbers, the host that is initiating the connection, Bob, sends a TCP packet to the receiver, Alice, with the initial sequence number XY and the SYN flag set in the header of the packet. Alice then processes the packet, records the sequence number XY, and sends back a packet to Bob with the acknowledgment field set to XY+1; that is, the receiver increments the original sequence number by 1. Alice also adds her own initial sequence number, Z, to the packet and sets the SYN and ACK flags in the headers of the TCP packet. Bob completes the handshake by replying with a packet that has the ACK flag set and the acknowledgment field set to Z+1. From then on, each host increments these sequence numbers by the number of bytes of data it has successfully received from the other host. This acts as a mechanism for each host to limit the amount of data sent in each packet to only the data the other party needs and can receive.
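
You can watch this exchange on the wire with tcpdump, which is often the quickest way to confirm whether a handshake is actually completing through a firewall. The interface and port below are placeholders; the -S switch prints absolute rather than relative sequence numbers, so you can see the XY/XY+1 behavior described above.

    # Watch connection setup (and teardown) for web traffic on eth0
    tcpdump -n -S -i eth0 'tcp port 80'

    # Look for the SYN from the client, the SYN+ACK from the server, and the final ACK.
    # A SYN that never gets a SYN+ACK back usually points at a filter or routing problem.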

Figure 5.4. The TCP packet.


Figure 5.5. Three-Way Handshake (TWH).


How TCP Connections Are Closed

TCP is equally methodical about closing down connections as it is about creating them. TCP sessions can be closed in two ways. The first is very similar to the way a connection is established, via a four-way handshake referred to as a "close." The second method is called a TCP abort and uses a special packet flag called the reset, or RST, flag.

A close request using the four-way handshake method starts with the party that wants to close the connection sending a FIN+ACK packet to the other party. This is the most graceful manner of closing a session and does not represent any sort of error state on the requester's end. We point this out because a TCP abort, the second method, is normally invoked when one of the parties experiences an error and wishes to close the connection in an error state. We have seen some vendors use the RST method a little too prodigiously, which causes some high-level applications that depend on this "niceness" to infer an error state from the state of the connection. If the close method is used, many programs assume the connection was successful; if the abort method is used, they infer, rightly in fact, that something went wrong.

One fascinating example of this occurs with some high-level web applications that look for an error state from the TCP connect call: when a TCP abort occurs, the web application assumes the entire HTTP connection failed and simply retries the connection. We found this problem at one large government customer of ours. They had an XML/HTTP application designed to send messages to another host running a web server. The client would send the data via an HTTP POST, the server would acknowledge the successful receipt of the data way up at Layer 7, and meanwhile, down at Layer 3, a load-balancing switch killed the session partway through the close process with an RST packet. The client inferred, incorrectly, that the server had not received the data correctly. Technically speaking, this was one correct manner of interpreting the TCP ABORT call.

Regardless, our client's application assumed it needed to resend all the data, because the acknowledgment it got back from the server, saying the data had been successfully received, was discarded due to the error state created by the TCP ABORT call. In fairness, the program on the client could have been better written to take these exceptions into account, perhaps by simply looking at the HTTP data and ignoring the TCP error state. Nevertheless, the problem was so low in the OSI model that our customer spent months trying to figure out why their application kept randomly attempting to resend the same messages over and over again when the server had already acknowledged them.

TCP CLOSE

The TCP CLOSE is accomplished, as we have already discussed, via a four-way handshake. Unlike establishing a connection, to close a session, both hosts must agree to the request.

One host cannot arbitrarily close the connection via this method. The process starts with the party that wishes to close the connection sending a FIN packet. The other host, if it agrees to close the session, sends back a FIN+ACK packet. The other host must also send its own FIN packet to the first host, which in turn sends back its own FIN+ACK. The process is basically a complete mirror on both ends. This is done so that both sides can empty their buffers of any remaining data. Until the final FIN is sent, data can still flow from the party that did not initiate the CLOSE request. This means the connection can be "half closed," meaning that one side has stopped sending data and is waiting for the other side to stop, while the other side is still sending data.

TCP ABORT

A TCP ABORT, or RST packet, is normally sent when data has been lost due to an unrecoverable error. This method of closing a TCP session can be used arbitrarily to shut a connection down; both hosts do not have to agree to close the session.
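
Firewalls use this distinction too. With netfilter/iptables you can refuse a connection either by silently dropping it or by answering with an RST, and the choice changes how quickly and how gracefully the client gives up. A sketch of the two approaches (port 113, identd, is just an illustration that foreshadows the session-layer discussion later in this chapter):

    # Silently drop: the client sees nothing and retries until it times out
    iptables -A INPUT -p tcp --dport 113 -j DROP

    # Abort: the firewall answers with an RST, so the client fails fast instead of hanging
    iptables -A INPUT -p tcp --dport 113 -j REJECT --reject-with tcp-reset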

Figure 5.6. Three-way teardown.


There is much more to TCP than we have covered here, so don't consider this a complete explanation. For instance, we haven't covered TCP windows in any depth or the various methods used to recover from packet loss. Other books cover those issues in wonderful detail; our objective is to briefly cover the aspects of TCP that are most important to troubleshooting firewalls. We will also assume, in later portions of the book, that the reader has a more in-depth understanding of TCP if the subject matter requires it. Where this assumption exists, we will point the reader to other books on the subject.

Understanding UDP

User Datagram Protocol (UDP) is the other "big" protocol used on the Internet. Unlike TCP, it is entirely connectionless; like TCP, it is a Transport layer protocol (Layer 4) and is part of the Internet Protocol suite. Also unlike TCP, UDP has no inherent flow control, error recovery, or reliability capabilities. If a UDP packet is lost, the receiving host has no way of knowing or reporting this as part of the native functionality of UDP. Because of this, some applications that use UDP implement their own internal functions to compensate. This is not to say that UDP is a bad protocol; far from it. Sometimes a developer does not need the additional functionality of TCP, or the flow control characteristics of TCP may cause problems, as with VPNs. UDP's simpler design has other advantages as well: because UDP headers are smaller than TCP headers, UDP traffic carries less overhead. This can matter for protocols that consume large amounts of bandwidth, such as file transfer protocols, VPNs, or voice over IP.

UDP packets contain five fields, the first four of which make up the UDP header: the source and destination ports, the length, and the checksum. The fifth field, the body of the message, also known as the data or payload field, contains the actual message being sent via UDP. The checksum field is optional and can be used to provide some integrity checking on the UDP header and data fields, but it is not required.
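
Because there are no flags or handshake to match on, firewall rules for UDP mostly come down to ports and, where available, connection tracking. A minimal sketch (the DNS server address is a placeholder) might look like this:

    # Let this host query an internal DNS server and accept the replies that
    # connection tracking associates with those queries
    iptables -A OUTPUT -p udp -d 192.0.2.53 --dport 53 -j ACCEPT
    iptables -A INPUT -p udp -s 192.0.2.53 --sport 53 -m state --state ESTABLISHED -j ACCEPT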

Figure 5.7. The UDP packet.


Troubleshooting with this Perspective in Mind

Aside from the obvious benefits of using this bottom-up methodology, we have found that when you don't really know where to start, looking at the problem through this lens can be very helpful for quickly ruling out elements that may not be causing the problem. It's also amazing how often these simple problems end up being the cause of what appears to be a complex problem. Following are some undoubtedly oversimplified, but really common, problems we have seen happen far too often to people who should know better by now, to help illustrate the things to look for at these layers of the OSI model. The point being that sometimes you're just over-thinking the problem. Also note how the "quick checks" get more complicated the farther down the list you go. Start with the simple things first! The following Scott one-liners on each layer help illustrate the point.

Layer 1: Test To Make Sure You Have Physical Connectivity

One-liner: "You kicked your patch cable out." Scott

This is one of the simpler things to test but often goes overlooked. What you want to look for is not just that you have a link light, which is also important, but also that the cable is properly terminated, that it is in spec for the connection it's being used for (CAT3 when you need CAT6, for example), and that there are no other physical problems. The following is a general list of things to look for, with a couple of quick commands after it; the key thing to keep in mind is that you want to rule out physical connectivity issues before you move on to more difficult problems.

  1. Are you plugged in?

  2. Is the cable actually any good? Is it in spec?

  3. Is the port on the switch/hub working? Try another port to see if that fixes your problem.
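
A couple of quick ways to check this from the Linux side, offered as a sketch (the interface name eth0 is a placeholder):

    # Does the kernel see a link at all, and at what speed and duplex?
    ethtool eth0          # look for "Link detected: yes"
    ip link show eth0     # look for the UP flag and "state UP"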

Layer 2: Test Your Driver

One-liner: "That's not a Tulip card." Scott

  1. 10/100/1000/10000 issue: are you actually running at the right speed? Some cards use different drivers based on speed or need to be passed various switches to configure them to work at different speeds, full duplex, half duplex, etc.

  2. Is the MTU size correct for your network? For instance, the MTU size on some DSL connections that use PPPoE actually matters for successfully processing large packets. Hint: 1500 is good for cable modems, Ethernet, and T1+ connections, but when you're dealing with VPNs (Virtual Private Networks) over DSL and ATM, you'll probably have to back this off in increments until you find the right setting for your network (see the example commands after this list).

  3. Is that network really Ethernet? Is it Token Ring? Is it something else?
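
Some quick ways to check and adjust these settings on Linux, offered as a sketch rather than a recipe (the interface name and the 1492 MTU typical of PPPoE are placeholders for your own values):

    # Check negotiated speed and duplex
    ethtool eth0

    # Force 100Mbit full duplex if autonegotiation is getting it wrong
    ethtool -s eth0 speed 100 duplex full autoneg off

    # Lower the MTU for a PPPoE/DSL link
    ip link set dev eth0 mtu 1492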

Layer 3: Test IP Layer

One-liner: "That's not my IP." Scott

  1. Ping it first, from your box and from another box. Then traceroute it (see the example commands after this list).

  2. Packet sniff it: are you getting packets back? Is anything going out? Are the IP addresses correct? For instance, NAT rules can change the IP addresses of the source and destination hosts. If this is done incorrectly, or not at all, the source address being sent out could be incorrect.
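
As a sketch of those two checks (the address and interface are placeholders):

    # Basic reachability and path
    ping -c 3 192.0.2.1
    traceroute -n 192.0.2.1

    # Watch what actually leaves and returns on the outside interface, and check
    # that any NAT rules rewrote the source address the way you expected
    tcpdump -n -i eth0 host 192.0.2.1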

Layer 4: Test the TCP Layer

One-liner: "I'm not running a web server on that machine." Scott

  1. Telnet to the port, from in front of and behind the firewall. A great way to waste a lot of time is to diagnose a firewall by trying to connect to a service that isn't actually running on the other end. A simple "telnet somehost.com 25" (this would connect to the mail port) can help save you some time.

  2. Test the service without the application. In other words, if you're testing a mail server rule, try testing the connection with a command-line telnet before sending mail (see the sketch after this list).

  3. Test with the application last.
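
A sketch of that progression, using a placeholder host name and testing from both sides of the firewall before ever involving the real client application:

    # Run this from a host behind the firewall, then again from a host outside it
    telnet mail.example.com 25

    # A banner (an SMTP line starting with "220") means the service is answering;
    # "Connection refused" or a silent hang points at the service, a DROP rule, or routing.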

Layer 5: Test the Session Layer

One-liner: "Wait for the login prompt." Scott

  1. The session layer is responsible for coordinating the connection(s) from an application to a networked system and vice versa. Logins, name resolution, and the like are generally depicted as happening at this level, so it sometimes helps to be patient. For example, some mail servers will do an identd request when you connect, and until they receive a response, they won't pass you up to the next layer. The connection will appear to stall. If you aren't running the identd service, it might look from your perspective as if something is wrong, simply because the server isn't responding right away.

Layer 6: Test the Presentation Layer

One-liner: "That doesn't actually support your authentication system." Scott

  1. Test to make sure what you're using actually supports what you're trying to do when it comes to the presentation layer. A good example of this is the myriad IPSEC implementations. Generally getting them to communicate down in Layers 3 and 4 is pretty easy; it's when you get up to the authentication and encryption in Layer 6 that things get weird.

Layer 7: Test the Application Layer

One-liner: "Try using the right password." Scott

  1. Make sure you're actually entering the right login information or URL to that website you can't get to before you blame the firewall. Is that site even up?

  2. Is the remote application working from somewhere else? Try connecting to the site from some host that is not affected by the firewall.

The lesson to take away from this approach is to conclusively eliminate dependencies. This isn't a complete list of things to check. It's just a set of examples to help remember the layers and what to look for as you work through them. The chapters in Section 3 of the book have more detailed lists of items to test at each layer for specific problems, but you may determine that there are others specific to your implementation that we may not cover.

Remember, if you cannot eliminate a layer and must move on to another layer, keep in mind the unresolved problems from the previous layers; it could be a more complex problem. The intent is to eliminate variables so that you can work on increasingly complex layers. As you move up the OSI model, the number of dependencies increases dramatically; if you don't rule out a layer, you make it that much harder to diagnose the next one. Repeat this process, live by the old adage "keep it simple" when you're troubleshooting, and you'll save yourself a lot of time by reducing the variables in your problem.


