mark nottingham

Bug Synchronicity

Thursday, 13 April 2006

HTTP

I’ve had a lyric running through my head for the last day or so, thanks to a couple of bugs.

I am thinking it’s a sign that the freckles / In our eyes are mirror images and when / We kiss they’re perfectly aligned — The Postal Service, Such Great Heights

Let me explain.

Apache

Apache’s ap_meets_conditions in http_protocol.c is responsible for handling conditional HTTP requests. If you send Apache an If-Modified-Since, this code will figure out whether or not it’s been modified since the time you last saw it.

It does this by comparing the time in the IMS header to the modification time — mtime in the code — of the resource.

if ((ims >= mtime) && (ims <= r->request_time)) {
    return HTTP_NOT_MODIFIED;
}

This works fine usually, but if the resource doesn’t have any notion of what a modification time is, Apache will fake it by using the current time.

mtime = (r->mtime != 0) ? r->mtime : time(NULL);

This way, if the If-Modified-Since time is now, Apache will automatically do the right thing (since IMS-based validation has a resolution of one second).
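To see how those two fragments interact, here is a minimal, self-contained sketch. It is not Apache's code: the function name sketch_check_ims, the status constants, and the use of plain time_t seconds with an explicit now parameter (rather than a call to time(NULL)) are all assumptions, made so the logic can be exercised in isolation.

```c
#include <time.h>

/* Hypothetical status codes for this sketch (not Apache's own macros). */
#define SKETCH_OK            200
#define SKETCH_NOT_MODIFIED  304

/*
 * Simplified stand-in for the If-Modified-Since check described above.
 *   ims            - time parsed from the If-Modified-Since header
 *   resource_mtime - modification time, or 0 if the resource has none
 *   request_time   - time the request was received
 *   now            - current time, used to fake a missing mtime
 */
int sketch_check_ims(time_t ims, time_t resource_mtime,
                     time_t request_time, time_t now)
{
    /* No modification time? Pretend the resource was modified just now. */
    time_t mtime = (resource_mtime != 0) ? resource_mtime : now;

    if (ims >= mtime && ims <= request_time) {
        return SKETCH_NOT_MODIFIED;
    }
    return SKETCH_OK;
}
```

For a resource with no modification time, sketch_check_ims(2000, 0, 2000, 2000) returns 304, while an If-Modified-Since that is one second older, sketch_check_ims(1999, 0, 2000, 2000), returns 200.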

The only way this could be a problem is if a spurious If-Modified-Since is sent with the current date in it, in which case Apache will send a 304, even if the resource doesn’t even have a concept of modification times. But that isn’t a problem, because browsers won’t send an If-Modified-Since unless the resource gave them a Last-Modified header previously.

Right?

Safari

That’s all fine and good, but browsers are not the most well-behaved members of the HTTP bestiary. In the past, there’s been a lot of confusion about the Last-Modified header, as well as the Date header, but AFAIK that was cleared up in the late 90’s.

Then came Safari.

It turns out that if Safari doesn’t see a Last-Modified header from a resource, it’ll send an If-Modified-Since header with the value of the Date header in it.
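Roughly, the difference looks like this. This is only a sketch of the behaviour I observed on the wire, not Safari's actual code; both function names are made up for illustration, and a NULL return stands for "don't send an If-Modified-Since at all".

```c
#include <stddef.h>

/*
 * What I'd expect a browser to do: only revalidate with
 * If-Modified-Since when the server supplied a Last-Modified value.
 */
const char *ims_value_expected(const char *last_modified, const char *date)
{
    (void)date;             /* the Date header plays no part here */
    return last_modified;   /* NULL: no If-Modified-Since is sent */
}

/*
 * What Safari appears to do: fall back to the response's Date header
 * when there was no Last-Modified.
 */
const char *ims_value_safari_observed(const char *last_modified,
                                      const char *date)
{
    return (last_modified != NULL) ? last_modified : date;
}
```

When a Last-Modified header is present the two agree; they only differ for responses that never carried one.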

Don’t believe me? Check it out with tcpflow, or go to this simple test page that just displays the value of the request’s IMS header. The first time you go there, you’ll see “-”, and if you reload, you’ll see the same thing. However, if you then open a new window, paste the URI into its address bar and press return, you’ll see the IMS header.

The mind boggles. I can’t imagine why the Apple guys thought this was a good idea, given that it’s forbidden by HTTP.

Why does this matter?

The Apache issue is fairly innocuous; they made an implementation decision to trust the browser not to be stupid. However, in combination with the Safari bug, it’s a problem; I think it essentially creates a race condition across the network.

This means that, for example, CGI scripts which have no concept of Last-Modified will, under some conditions, be returning a 304 Not Modified to Safari. If the state of your resource changes more than once a second — say, you’re doing some really fancy AJAX stuff — this would effectively throttle it to only appear to change once a second.
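A toy simulation makes the throttling visible. Everything here is an assumption for illustration: the function name count_fresh_responses, a client that polls at a fixed interval and echoes the Date of its last full response as If-Modified-Since, and a server with no modification time that applies the check shown earlier with whole-second resolution.

```c
/*
 * Simulates a client polling a resource that has no modification time.
 * Returns how many polls receive a full response rather than a 304.
 * Times are in milliseconds; HTTP dates are truncated to whole seconds.
 */
int count_fresh_responses(long interval_ms, long duration_ms)
{
    int fresh = 0;
    int have_date = 0;          /* has the client seen a Date header yet? */
    long cached_date_sec = 0;   /* Date of the last full response */

    if (interval_ms <= 0) {
        return 0;               /* avoid looping forever on bad input */
    }

    for (long t = 0; t < duration_ms; t += interval_ms) {
        long now_sec = t / 1000;
        long mtime = now_sec;   /* server fakes mtime as "now" */

        /* Server-side check: ims >= mtime && ims <= request time. */
        int not_modified = have_date
            && cached_date_sec >= mtime
            && cached_date_sec <= now_sec;

        if (!not_modified) {
            fresh++;
            cached_date_sec = now_sec;  /* client remembers this Date */
            have_date = 1;
        }
    }
    return fresh;
}
```

Polling every 250 ms for three seconds, count_fresh_responses(250, 3000), makes twelve requests but gets only three full responses, one per second; the other nine come back as 304.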

Is that a common case? Probably not (although I tripped across it doing exactly that), but it’s likely there are other side effects of this frankly bizarre behaviour that aren’t apparent now.

When I saw this, my eyes nearly fell out of their sockets, and I was sure I was doing something wrong, but tcpflow doesn’t lie. That said, it’s hard to reproduce the Apache behaviour, and I’d like to see independent confirmation of both problems.

I’ve filed a bug with Apache; it details how to reproduce the behaviour (and I’d love to see some independent confirmations). I tried to file one for Safari with Bugreporter as well, but it appears to require a full ADC membership now, not my chintzy free one. So, I used the press-and-pray Safari bug report menu item.

Hopefully, this will get fixed soon; in the meantime, don’t assume that anything on the Web can be updated more than once a second.


3 Comments

Andrew Sidwell said:

You may be better off reporting this to the Safari team as described in http://webkit.opendarwin.org/quality/reporting.html , as they seem to have their own Bugzilla around.

Friday, April 14 2006 at 3:19 AM

Adam Ratcliffe said:

I recently ran into this same problem with a resource that should be non-cacheable. Sending the following response headers without including a Last-Modified header results in Safari sending an If-Modified-Since on subsequent requests.

Expires: Mon, 26 Jul 1997 05:00:00 GMT
Cache-Control: no-cache, must-revalidate
Pragma: no-cache

Saturday, August 19 2006 at 4:22 AM