So the other day I was chatting with some technical (geeky!) friends of mine and the conversation turned to TCP flags. The question was posed to the group: "What happens if you connect to a remote host and you get a connection refused message?"
Well, *one* short answer is that the remote side reset the connection, most likely because no application is listening on that particular port.
That is to say, when you connect to a Web server on port 80, the operating system's TCP/IP stack underneath your web browser (or command line utility, etc.) goes through the standard TCP handshake to connect to that well-known port number 80.
Now say you connect to port 73 instead, and there is no application listening at that IP address and port. The correct thing for the remote host to do is to reply with a packet with the RST flag set, resetting the connection, which is better than leaving your side waiting until it times out and gives up.
In layman's terms it is basically saying, "There is no one here, please go away and try somewhere else".
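To see the "no one here" reply for yourself without touching anyone else's network, here is a minimal Python sketch. It assumes a Linux-like host and only talks to 127.0.0.1; the helper names `can_connect` and `listener_demo` are made up for this illustration. It connects once while something is listening on a port and once after the listener has gone away.

```python
import socket


def can_connect(port: int, host: str = "127.0.0.1", timeout: float = 2.0) -> bool:
    """Return True if the TCP handshake completes, False if it is refused (RST)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except ConnectionRefusedError:
        # The remote stack answered our SYN with a reset: nothing is listening.
        return False


def listener_demo():
    """Try the same port twice: with a listener present, then after closing it."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick any free port
    server.listen(1)
    port = server.getsockname()[1]

    while_listening = can_connect(port)
    server.close()  # from here on, nobody is listening on that port
    after_close = can_connect(port)
    return while_listening, after_close


if __name__ == "__main__":
    print(listener_demo())
```

The second attempt fails right away instead of hanging, which is exactly the difference between a reset and a timeout.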
This is simply one example; more details on TCP (and the RST flag) can be found in this RFC:
Well, so what, why is this important?
Well in the course of troubleshooting it will become important as the following example illustrates:
I was on my customer's network troubleshooting a network connectivity problem.
Basically they were trying to index content on their Web server secured with SSL (https) and they were having a heck of a time with connections from their internal network to the host.
I didn't have any thorough knowledge of their network topology, nor did the customer at the time. However, I did have my Linux host and the tcpdump command at the ready, so I was able to do a packet capture and then some analysis on the results.
But before that I did some basic tasks:
I did a simple telnet to port 80 to see if there was any Web server running.
Sure enough there was - and when I gave it an HTTP GET request for the / document, it redirected me to port 443. Good to know.
The listener on port 80 is not necessarily the same Web server that answers on port 443, but at least I could confirm basic IP and TCP connectivity to that host.
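The same port 80 check can be scripted rather than typed by hand into telnet. This is only a sketch using Python's standard `http.client` module; the function name `check_redirect` is my own, and the host name in the usage line is the placeholder used in this post.

```python
import http.client


def check_redirect(host: str, port: int = 80, path: str = "/", timeout: float = 5.0):
    """Send one plain-HTTP GET and return (status code, Location header or None)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        # http.client does not follow redirects, so a 301/302 is reported as-is.
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


if __name__ == "__main__":
    # "customerweb" is the placeholder host name from this post.
    print(check_redirect("customerweb"))
```

A 3xx status with an https:// Location header would be the scripted equivalent of the redirect to port 443 that I saw by hand.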
So off now to port 443:
First I also issued a telnet to port 443.
That basically didn't give me any real information, nor will it, generally, because the telnet command doesn't understand the SSL protocol at all. For example:
#telnet www.blogger.com 443
Trying 74.125.19.191...
Connected to www.blogger.com (74.125.19.191).
Escape character is '^]'.
Connection closed by foreign host.
All this did was check whether port 443 was open and listening (which it was), whether it was quickly reset, or whether it gave any other information. (Plus it's the shortest command to type and remember.)
So the next step in troubleshooting would be to use something like the OpenSSL utility, which is basically an SSL-enabled telnet. It's actually quite a bit more than that, but for our purposes we're going to connect to the host on port 443, see what the response is, and then issue some HTTP commands.
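If you want the telnet-style check in a script, something like the following Python sketch covers the three outcomes that matter at this stage: the port is open, the connection is refused with a reset, or nothing comes back at all. The function name `probe_port` and the result strings are my own choices, not part of any standard tool.

```python
import socket


def probe_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP port as 'open', 'refused', 'timeout' or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"          # three-way handshake completed
    except ConnectionRefusedError:
        return "refused"           # the far end answered with RST
    except socket.timeout:
        return "timeout"           # no answer at all, e.g. packets dropped
    except OSError as exc:
        return f"error: {exc}"     # DNS failure, no route, and so on


if __name__ == "__main__":
    print(probe_port("www.blogger.com", 443))
```

Like telnet, this says nothing about SSL itself; it only tells you whether the TCP layer is answering.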
However what I found was this response from OpenSSL:
#openssl s_client -connect customerweb:443
CONNECTED(00000003)
write:errno=104
Curl didn't fare much better:
#curl -i "https://customerweb/"
curl: (35) SSL: error:00000000:lib(0):func(0):reason(0)
As you can see, neither output was very helpful, and neither indicates success.
What I would expect is something more like this from OpenSSL:

As you can see, the OpenSSL command has connected to the secure Blogger website and retrieved some certificate information. This confirms that a secure Web server at the other end can be contacted and that we can read its certificate.
So I forwarded the errors and information to the customer and suggested that their network team look into the network hardware in between these two devices, the first of which incidentally was a Google Search Appliance, which I often work with.
That's fine and helpful but not really getting to the heart of the issue.
We need to once again check in with tcpdump and really see what's going on at the packet level. So I did, and this is what I found:
The Google Search Appliance was making a valid connection to port 443; however, the remote host was waiting for a few seconds and then coming back with the RST flag set in the packet, basically resetting the connection, as if to say there is no Web server listening here, as shown in the packet trace:
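For reference, the RST flag is just one bit in the fourteenth byte of the TCP header, next to SYN, ACK, FIN and friends. Here is a small Python sketch of how a tool might decode it; `tcp_flags` and `build_header` are names I made up, and the header is built by hand with zeroed sequence numbers and checksum purely for illustration.

```python
import struct

# Bit values of the six classic TCP flags, as laid out in the TCP header.
TCP_FLAGS = {0x01: "FIN", 0x02: "SYN", 0x04: "RST", 0x08: "PSH", 0x10: "ACK", 0x20: "URG"}


def tcp_flags(header: bytes) -> list:
    """Return the names of the flags set in a raw TCP header."""
    if len(header) < 14:
        raise ValueError("need at least the first 14 bytes of the TCP header")
    flags = header[13]  # byte 13 (counting from 0) carries the flag bits
    return [name for bit, name in TCP_FLAGS.items() if flags & bit]


def build_header(src_port: int, dst_port: int, flags: int) -> bytes:
    """Build a minimal 20-byte TCP header with zero seq/ack/window/checksum."""
    data_offset = 5 << 4  # header length of 5 32-bit words, no options
    return struct.pack("!HHIIBBHHH", src_port, dst_port, 0, 0, data_offset, flags, 0, 0, 0)


if __name__ == "__main__":
    reset = build_header(443, 50000, 0x04 | 0x10)  # RST and ACK set
    print(tcp_flags(reset))
```

In practice tcpdump and Wireshark do this decoding for you; the point is only that a reset is an explicit, deliberate signal in the packet and not an absence of traffic.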

I saw this using Wireshark, which is a great tool for analyzing tcpdump files. It is able to resolve the first 3 bytes of the MAC address to the vendor who has been assigned that block of MAC addresses, which can be really helpful and a time saver.
In this case the default gateway was in fact a device manufactured by F5 Networks. Given that, our picture looks a little clearer:
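The vendor lookup itself is simple: take the first three bytes of the MAC address (the OUI) and look them up in a registry. Here is a rough Python sketch of the idea. The table below is a made-up placeholder and the names `oui_prefix` and `lookup_vendor` are hypothetical; a real tool like Wireshark ships with the full vendor list.

```python
import re

# Hypothetical one-entry table, for illustration only. It is NOT real registry data.
EXAMPLE_OUI_TABLE = {"02:00:00": "Example Vendor A (made up)"}


def oui_prefix(mac: str) -> str:
    """Normalise a MAC address and return its first three bytes as 'AA:BB:CC'."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac).upper()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in (0, 2, 4))


def lookup_vendor(mac: str, table=EXAMPLE_OUI_TABLE) -> str:
    """Look up the vendor for a MAC address, or 'unknown' if the prefix is not listed."""
    return table.get(oui_prefix(mac), "unknown")


if __name__ == "__main__":
    print(oui_prefix("00-1a-2b-3c-4d-5e"))
    print(lookup_vendor("02:00:00:12:34:56"))
```

Because the prefix is normalised first, it does not matter whether the capture shows the address with colons, dashes or dots.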
"GSA"->"F5/LB"->"HTTPS Customer WebServer".
Again, even without knowing the full network topology, it was clear that the path from the Google Search Appliance across the network definitely included an F5 device (most likely a firewall or load balancer), just based on the MAC address of the GSA's default gateway.
The root cause was in fact a problem with the F5 load balancer, which was taken care of with an update to its configuration.
But here is a good example of starting from what you know and what is available to you, and of how knowing just a little bit about TCP flags and how they're used gets you closer to a solution.


