mark nottingham

The Pitfalls of Debugging HTTP

Thursday, 22 May 2008

HTTP

Some folks at work were having problems debugging HTTP with LWP’s command-line GET utility; it turned out that it was inserting Link headers (HTTP headers, mind you) for each HTML <link> element present.

Blurgh.

This brought to mind some other peculiarities that can make debugging HTTP more complex and REALLY ANNOYING…

Curl automagically adds a Pragma: no-cache to requests, so that you don’t have to worry about scaling the Web or getting decent performance.

LiveHTTPHeaders shows you headers, but through the lens of Mozilla’s header parsing and processing, not what’s on the wire.

Even Wireshark can’t be completely trusted; it will remove the \r\n from the Content-Length header (because it’s parsing the message for its own purposes), making you think that the sender has header delimitation bugs.
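One way to rule out a display quirk like this is to look at the captured bytes directly. The following is only a rough sketch in Python, not anything Wireshark itself provides: the function name header_line_endings and the sample message are made up for illustration, and it assumes you already have the raw bytes of a message in hand (for example, saved from a capture).

```python
def header_line_endings(raw: bytes) -> list[tuple[bytes, bool]]:
    """Return (line, ends_with_crlf) for each line of the header block.

    Hypothetical helper: it walks the raw bytes up to the first blank
    line and reports whether each header line really ends in CRLF.
    """
    results = []
    # keepends=True preserves the terminator so we can inspect it.
    for line in raw.splitlines(keepends=True):
        content = line.rstrip(b"\r\n")
        if not content:
            # A blank line marks the end of the header block.
            break
        results.append((content, line.endswith(b"\r\n")))
    return results


# Made-up sample: the Content-Length line is terminated by a bare LF.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 5\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"hello"
)

for content, ok in header_line_endings(sample):
    print(("CRLF   " if ok else "NOT CRLF"), repr(content))
```

If every line reports CRLF here but a tool shows the terminator missing, the problem is in the tool's display rather than in the sender.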

Anybody have another?

All of this is why I tend to use telnet when debugging HTTP. Sometimes tcpflow helps too.
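For anyone who wants the same "nothing between you and the wire" property in a script, here is a minimal sketch using Python's standard socket module. It is an illustration rather than a recommended tool: build_request and fetch_raw are hypothetical names, and the request it builds (HTTP/1.1 with Host and Connection: close) is just one reasonable assumption about what you might type into telnet by hand.

```python
import socket


def build_request(host: str, path: str = "/", method: str = "GET") -> bytes:
    """Build a bare request by hand, so nothing is added behind your back."""
    lines = [
        f"{method} {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close so we can read to EOF
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_raw(host: str, port: int, request: bytes, timeout: float = 10.0) -> bytes:
    """Send the request bytes as-is and return every byte the server sends back."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                # Empty read means the peer closed the connection.
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

Calling fetch_raw("example.com", 80, build_request("example.com")) and printing the result with repr() shows the response exactly as received, line terminators included. No parsing happens, so nothing gets tidied up on the way to your screen.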


25 Comments

Steve Clay said:

Another annoyance? Firebug’s Net tab (supposedly much improved in the upcoming release).

Fiddler (win) is top-notch. Works out-of-the-box with IE/Safari, easy proxy autoconfig for Opera/FF.

Thursday, May 22 2008 at 10:36 AM

integralsource said:

I use http://www.charlesproxy.com/ (linux, os x, windows), I haven’t noticed it doing anything unusual yet.

Thursday, May 22 2008 at 10:52 AM

Jim Dabell said:

I use telnet a lot too, but mostly out of habit. Netcat is more flexible.

Thursday, May 22 2008 at 11:19 AM

PJ said:

tcpdump FTW!

Friday, May 23 2008 at 1:33 AM

Mike Amundsen said:

I like WFETCH on the Windows platform. It has its limits (not scriptable), but it’s solid.

Friday, May 23 2008 at 1:36 AM

chrism said:

TCPWatch (in Python): http://hathawaymix.org/Software/TCPWatch

Friday, May 23 2008 at 2:09 AM

Edward O'Connor said:

I use Luke Gorrie’s excellent http-twiddle.el:

http://lukego.livejournal.com/6154.html

Friday, May 23 2008 at 3:12 AM

ryan king said:

tcpflow all the way, because I’m usually debugging both the client and the server at the same time.

Friday, May 23 2008 at 4:18 AM

Rob Russell said:

I just discovered netcat yesterday. There was one from sysinternals.com before Microsoft bought them - TcpView. It looks like it’s still there.

Friday, May 23 2008 at 4:24 AM

ed said:

ngrep

Friday, May 23 2008 at 6:29 AM

Thor Larholm said:

Fiddler is very nice and does not mess with your headers. It gives you the option of seeing the requests and responses raw or in various states of parsedness.

Friday, May 23 2008 at 6:38 AM

Andy Lester said:

ncat works, too.

Heck, telnet!

Friday, May 23 2008 at 9:04 AM

Adam Wiggins said:

Yeah, I noticed this same thing, and ended up making a Ruby gem for it. Used at an irb prompt, it makes a great replacement for curl. http://adam.blog.heroku.com/past/2008/3/9/rest_client/

Friday, May 23 2008 at 11:00 AM

Daniel Yokomizo said:

My current favorite is ngrep, as it can be used without messing with the client.

Friday, May 23 2008 at 12:01 PM

asenchi said:

I have this function in my .bashrc:

function getheaders {
    site=$1
    # Send a HEAD request for the root path; HTTP lines end in CRLF.
    printf 'HEAD / HTTP/1.0\r\nHost: %s\r\n\r\n' "$site" | nc "$site" 80
}

Of course you can change the request. It’s a great way to test a web server.

Friday, May 23 2008 at 12:22 PM

Neda said:

Tamperdata FTW!

Saturday, May 24 2008 at 1:09 AM

Aristotle Pagaltzis said:

Netcat is a better choice than telnet. But recently I discovered socat – http://www.dest-unreach.org/socat/ – which is netcat on double steroids. Like netcat, it deals in raw wire data, but socat also gives you the option to use full readline-based history instead of a bare STDIN, and it can interconnect all sorts of different transports, with native SSL support, and… really, it’s cooler than I can convey in a few sentences. It’s a distributed computing hacker’s dream tool. Take a look.

Sunday, May 25 2008 at 1:06 AM

Etan Wexler said:

I was surprised to read about Wireshark’s misbehavior (misbehaviour in Aotearoa). Then again, you’re now aware of this bug ( http://bugs.wireshark.org/bugzilla/show_bug.cgi?id=2534 ) in Wireshark 1.0.0. That is, you should expect the absence of “\r\n” at the end of the display of the “Content-Length” header-field. If you find something unexpected, you can always drop down a level, as the bug report mentions. In my installation of Wireshark 1.0.0, I have but to click on the display of the “Content-Length” header-field and Wireshark automatically highlights the display of the corresponding bytes in the pane below. Sure enough, there is the expected “0d 0a” at the end. Wireshark 1.0.0 is at least consistent in its bugginess, making knowledge of the bug the sole requisite for working around the bug. (For the record, I favor a solution in the source code. I’m just noting the benign nature of things as they stand.)

Monday, May 26 2008 at 7:33 AM

Sylvain said:

I think Charles Proxy (http://www.charlesproxy.com/) is one of the best tools around for HTTP debugging. It’s unfortunately not free, but it is worth your money.

Monday, May 26 2008 at 12:23 PM

damien wetzel said:

Hi Mark, my best friend is tcpflow -cvs | httpflow.py

Tuesday, June 3 2008 at 7:44 AM

Jonas Galvez said:

I have a bit of a rant (and solution) about this:

http://jonasgalvez.com.br/log/#2008-03-25T00:29:02-03:00

Sunday, June 8 2008 at 9:18 AM

Fabrice Medio said:

Apache Axis ships with a logging http proxy called ‘tcpmon’. The primary intent was to debug SOAP frames, but you can use it with anything that flows over http.

Tuesday, June 10 2008 at 2:49 AM

Chaitanya Gupta said:

I prefer the Common Lisp library, Drakma (http://weitz.de/drakma). Greatly eases debugging via the Lisp REPL.

Another good choice is Luke Gorrie’s http-twiddle.el, as someone has already mentioned in the comments.

Tuesday, July 8 2008 at 5:42 AM

Levi "Karatorian" Aho said:

I use nc (netcat) for this sort of thing. It doesn’t parse stuff, so you can see exactly what got sent. You can also write some basic shell scripts around it to automate stuff. (un*x for the win)

Tuesday, July 15 2008 at 8:54 AM

hassy said:

As others have already said, Luke Gorrie’s http-twiddle is excellent. I’d just like to add that the most up-to-date version now lives on https://github.com/hassy/http-twiddle

Friday, September 18 2009 at 10:45 AM