Friday, 23 July 2010
Since SPDY surfaced, one of the oft-repeated topics has been its use of TLS; namely, that the SPDY guys have said that they’ll require all traffic to go over it. Mike Belshe dives into all of the details in a new blog entry, but his summary is simple: “users want it.”
I don’t think it’s that simple.
Trust
I trust my ISP, to a point; I have a business relationship with them, so I don’t worry too much about them doing traffic analysis on what I surf and when I surf it. Likewise, they have a business relationship with their transit providers, and so on, right on to the Web sites I surf. Sure, it might go through a peering point or two, but the fact is that end to end, there is a series of trust relationships that are somewhat transitive; it’s how the Internet — a network of networks — works.
These relationships work pretty well; the Internet has been routing around technical and not-so-technical problems for a long time now. And, looking at the threat profile of the modern Web, this is borne out; the vast majority of attacks on the Web are on the endpoints; either in the browser, on the OS, or on the server, or some combination of these.
Let’s replay that; the vast majority of vulnerabilities and actual issues on the Web will not be improved one bit by requiring every Web site in the world to run TLS.
I’m not saying man-in-the-middle attacks are non-existent, but changing the entire Web to run over SSL/TLS is a drastic move, and we need solid, well-defined motivation for making such a big change. People look at me like I’m crazy when I talk about having a Web without JavaScript, but I’d wager any amount of money it’s the lynchpin in several orders of magnitude more loss (whether you’re counting in dollars or units of personally identifying information) than man-in-the-middle attacks.
However, I can imagine there are a few situations where allowing the user, rather than the server, to choose whether to use SSL might be helpful.
- If I’m accessing the Web over an untrusted wireless connection, I probably don’t want even the more innocuous traffic overlooked; many sites still don’t use SSL, and their cookie-based authentication can be replayed.
- Likewise, if (in the words of Bad Lieutenant’s Harvey Keitel) I Do Bad Things — for whatever that means in my current context — I probably don’t want my neighbour / family / boss / government looking over my shoulder.
In both of these cases, however, it’s less intrusive to establish a trust relationship with a third party — e.g., using a TLS-encapsulated HTTP proxy, or a full VPN — and use that service to avoid these issues. Both approaches are usable today.
The fact that these services aren’t taking off like gangbusters tells me that Mike’s “the users want it” isn’t the whole story.
The Cost
The other half of the story is the lost opportunities of making TLS mandatory.
The Web is built upon intermediation — whether it’s your ISP’s proxies, your IT department’s firewalls and virus checkers, Akamai’s massive farms of content servers, or the myriad other ways people use intermediation (yes, that’s a plug for my latest talk). SPDY is not intermediary-friendly for several reasons, but wrapping it all in mandatory TLS makes it a non-starter. Mike’s assertion that use of proxies is “easing” isn’t backed by any numbers that I’ve seen.
Secondly, the server-side cost of TLS is still an issue for some. Sure, if you’re Google or another large Web shop, you can afford the extra iron and the insane amount of tuning that’s necessary to make it work. If it is as easy as Mike paints it on the server side, and if the users want it, why is TLS still relatively rare on the Web?
Mike also scoffs at those who point out that it’ll make debugging more difficult, brushing this concern aside as supporting the habits of “lazy developers.” I don’t think this is fair; the Web and the Internet took off at least in part because it was easy to debug. Those huge stacks of ISO specs didn’t win at least in part because they weren’t. Again, not everyone has the ability to hire Google rock star developers.
Obviously, the characteristics of SPDY-over-TLS work really well for Google. However, the Web is not (yet) just Google, and any big change like this is going to affect a lot of people.
Is It Political?
To me, requiring TLS in an application protocol feels like a political decision, not a technical one. Good protocols are factored out so that they don’t unnecessarily tie together requirements, overheads and complexity. “Small Pieces Loosely Joined” isn’t just a saying, it’s arguably how both Unix and the Internet were successfully built.
I’m quite sympathetic to arguments that government snooping and interference is bad — whether it’s American, Chinese or Australian — but protocols make very poor instruments of policy or revolution. Governments will work around them (either with the finesse of getting back doors in, or the brute force approach of blocking all encrypted traffic).
Can we improve things? Sure.
All of this is not to say that we can’t make things better incrementally, without resorting to the all-or-nothing approach. Making SSL/TLS better, along the lines that Mike and others have talked about, is a great start; when we do have to use it, it needs to be as easy as possible, both for the end user and the server side.
First, there’s a fair amount of current interest in — and at least one group actively working on — signing HTTP responses. If we can verify the integrity of the response body and headers with low overhead, a whole class of issues goes away without adversely affecting the Web. If it’s done correctly, you’ll be able to tell at a glance whether the content you’re looking at has been changed along the way, or cached outside of its stated policy.
Second, for the cases when the user does want to opt into privacy, we need to make SSL proxies easier to use.
Finally, HTTP Authentication needs to be better. Not a big surprise, really, but Cookies are a very limited and tricky-to-get-right vessel for credentials. This isn’t an easy problem (mostly because once you start defining a new authentication scheme, you quickly find yourself boiling an ocean), but again I’d say it’s easier than requiring TLS for the entire Web.
Wednesday, 30 June 2010
Patricia Clausnitzer has kindly translated the Caching Tutorial to Belarusian. Thanks!
Monday, 21 June 2010
A few weeks ago I was browsing through My Bookshop in Hawksburn, where on a whim I picked up The Winter of Our Disconnect by Susan Maushart. As I write this, I’m at 30,000 feet, and have just finished one of the more enjoyable and informative reads I’ve had in a while.
Maushart is a Perth-based journalist and single parent of three who, questioning the effect of media — from TV to iPods to video games to mobile phones — decides to enforce a six-month ban of technology in the house, and write about the process (in longhand, of course).
I concluded my announcement, eyes ablaze with missionary zeal (also fear), ‘It’s an experiment in living. We are all going to do it together, as a family. And it’s going to change our lives.’ There was a frozen pause. If life was a Macbook, this was our spinning colour wheel of death.
Sussy broke the silence.
‘You mean … like Wife Swap?’ she asked.
‘YES!’ I roared. Bless the baby for throwing me a life raft. ‘Exactly like reality TV! Exactly! Except, of course, we won’t have a TV …’ I trailed off. I could see Bill and Anni exchange glances.
‘What about homework?’ Bill asked cannily.
‘You can do it at the library, or at a friend’s house, or at home using …’
‘What? A stone tablet and a chisel?’ Anni snapped.
What makes this a great book is its balance, in several senses. Maushart alternates between fairly raw journal entries of her family’s experiences, a running commentary with a bit more distance and analysis, and dives into various bits of relevant research in the field.
The book also strikes a good balance between thought- and laugh-provoking prose. I think most modern families will recognise themselves in these pages, producing both grins (LOL) and shudders of recognition.
Most importantly, though, Maushart is neither a digital apologist nor a luddite. This book is not an argument to shut your screens off permanently; it’s a wake-up call to examine how the media that we use uses us. Do you need to be contactable by five different means on a 24/7 basis? What is the opportunity cost of playing a game in a way that’s equivalent to a part-time job? Is shutting out the world with iPod buds by default a good thing? And so forth.
One of the most interesting points the WoOD makes is about multitasking; this notion that you can simultaneously work (or do homework, in the kids’ case), have five chat windows open, read four (or fourteen) websites and listen to music and/or talk on the phone.
In short, it’s rubbish; Maushart cites a raft of research showing that while people feel more productive when they’re doing more than one thing at a time, the actual quality and quantity of their output nosedives. This includes teenagers, no matter how strenuously they claim that their brains are “wired differently.”
Personally, even before I started reading this book, I’d been half-consciously trying to think about when the TV is on, how often I really need to log into Facebook (answer: just about never), and, perhaps most difficult of all, how often I have to check my e-mail. Reading Maushart’s book brought all of this into much clearer focus.
Of course, the fact that I’m blogging this should tell you something, but like I said, it’s all about balance…
Thursday, 3 June 2010
A while back we used an absurd amount of reward points from our credit card to get some Myer gift certificates, and on the weekend these miraculously turned into a new TV, the Sony 32EX600.
Overall, we really like it; while I’m still trying to find the exact recipe to successfully encode video to feed to it via DLNA, it’s beautiful to look at, and (unlike most TVs these days), the UI is a pleasure to use.
Except.
One of the big selling features of the TV is that it offers “Internet Streaming” as well as DLNA. Basically, this means that it can stream directly from YouTube, the Onion, and a number of other places that will soon include back episodes of most programs on at least two major Australian networks.
That’s very cool, and kudos to Sony for making arrangements for content in the local market. However, actually using this feature from Australia — the same market they’re customising a TV for — is less-than-impressive.
That’s because every time you access one of these “Internet Streaming” channels, the TV makes not one but up to five SSL connections serially to a server in the US.
1275275765.332 1044 192.168.1.19 TCP_MISS/200 4157 CONNECT ssm.internet.sony.tv:443 - DIRECT/64.37.180.11 -
1275275766.432 1012 192.168.1.19 TCP_MISS/200 3845 CONNECT treb.internet.sony.tv:443 - DIRECT/64.37.180.15 -
1275275767.456 905 192.168.1.19 TCP_CLIENT_REFRESH_MISS/200 4139 GET http://ssm.internet.sony.tv/BIVL/icons/service_23/sub_2/h.png - DIRECT/64.37.180.11 image/png
1275275767.836 1290 192.168.1.19 TCP_MISS/200 5036 CONNECT treb.internet.sony.tv:443 - DIRECT/64.37.180.15 -
1275275769.276 1294 192.168.1.19 TCP_MISS/200 5139 CONNECT treb.internet.sony.tv:443 - DIRECT/64.37.180.15 -
1275275771.066 1745 192.168.1.19 TCP_MISS/200 32170 CONNECT treb.internet.sony.tv:443 - DIRECT/64.37.180.15 -
Since they’re serialised, we have to wait each time for the TCP connection to come up, the SSL context to be established, and the HTTP request and response to complete before we see a byte; hence latencies of anywhere from about one to two seconds per connection (second column, in milliseconds) from Australia, since their servers are in the US:
7 ge-7-11.car3.losangeles1.level3.net (4.71.32.61) 188.653 ms 187.361 ms 187.985 ms
8 ae-71-70.ebr1.losangeles1.level3.net (4.69.144.114) 212.445 ms 211.550 ms 215.080 ms
9 ae-5-5.car1.sandiego1.level3.net (4.69.133.205) 200.850 ms 200.829 ms 200.435 ms
10 ge-1-2.hsa2.sandiego1.level3.net (4.69.142.162) 201.915 ms 200.725 ms 201.477 ms
11 vl862.sdtermswitch-2.sonyonline.net (63.212.173.146) 201.194 ms 201.141 ms 201.253 ms
12 vl832.sdkollsw-2.sonyonline.net (64.37.144.90) 191.396 ms 192.539 ms 191.285 ms
13 * * *
Which means it’s about five seconds before you see anything come up in this interface, despite the fact that there’s nothing personalised or particularly dynamic in the content. Ouch.
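To see why serialising hurts so much, here is a back-of-the-envelope sketch in Python. It assumes a round-trip time of roughly 200 ms (read off the traceroute above) and four round trips per fresh connection: one for the TCP handshake, two for a full SSL/TLS handshake without session reuse, and one for the HTTP request and response. Those per-connection counts are my assumptions, not something measured on the TV, and the function name is made up for illustration.

```python
# Rough model: every serialised HTTPS request pays for its own connection setup.
RTT_MS = 200  # assumption: approximate round-trip time seen in the traceroute

# Assumed round trips per fresh connection (full TLS handshake, no session reuse).
ROUND_TRIPS = {
    "tcp_handshake": 1,
    "tls_handshake": 2,
    "http_request_response": 1,
}


def serial_latency_ms(connections: int, rtt_ms: int = RTT_MS) -> int:
    """Lower bound on wall-clock time when connections are made one after another."""
    per_connection = sum(ROUND_TRIPS.values()) * rtt_ms
    return connections * per_connection


if __name__ == "__main__":
    # 5 connections * 4 round trips * 200 ms
    print(serial_latency_ms(5))
```

Under these assumptions the floor is about 800 ms per connection and about four seconds for five of them, before any server processing time, which is in the same ballpark as the log above.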
After that, you get a screen with a bunch of icons for different shows and/or episodes on it, but again the TV doesn’t want to play nicely; not only are many of the responses uncacheable, but the TV also sends Pragma: no-cache on everything:
GET http://www.videodetective.net/utils/dynamicthumb.aspx?filepath=6747/28341022_.jpg&width=128&height=96 HTTP/1.1
Host: www.videodetective.net
Pragma: no-cache
Accept: */*
Proxy-Connection: Keep-Alive
...which results in a lot of cache misses:
1275275772.242 809 192.168.1.19 TCP_CLIENT_REFRESH_MISS/200 3043 GET http://www.videodetective.net/utils/dynamicthumb.aspx?filepath=2808/11795832_.jpg&width=128&height=96 - DIRECT/65.52.12.234 image/jpeg
1275275772.276 906 192.168.1.19 TCP_CLIENT_REFRESH_MISS/200 4502 GET http://treb.internet.sony.tv/content/thumbs/videodetective/CAT27178471f2c4019c872d1545a0f154c7.png - DIRECT/64.37.180.15 image/png
1275275772.337 909 192.168.1.19 TCP_CLIENT_REFRESH_MISS/200 5727 GET http://treb.internet.sony.tv/content/thumbs/videodetective/CATf2407056c089170d4df9dac66a284f38.png - DIRECT/64.37.180.15 image/png
1275275772.882 1509 192.168.1.19 TCP_CLIENT_REFRESH_MISS/200 4388 GET http://www.videodetective.net/utils/dynamicthumb.aspx?filepath=6452/27101922_.jpg&width=128&height=96 - DIRECT/65.52.12.234 image/jpeg
1275275773.213 1757 192.168.1.19 TCP_CLIENT_REFRESH_MISS/200 2901 GET http://www.videodetective.net/utils/dynamicthumb.aspx?filepath=6340/26630733_.jpg&width=128&height=96 - DIRECT/65.52.12.234 image/jpeg
1275275773.491 2030 192.168.1.19 TCP_CLIENT_REFRESH_MISS/200 4355 GET http://www.videodetective.net/utils/dynamicthumb.aspx?filepath=6706/28166929_.jpg&width=128&height=96 - DIRECT/65.52.12.234 image/jpeg
1275275773.887 366 192.168.1.19 TCP_CLIENT_REFRESH_MISS/200 3372 GET http://www.videodetective.net/utils/dynamicthumb.aspx?filepath=6767/28425441_.jpg&width=128&height=96 - DIRECT/65.52.12.234 image/jpeg
…and so on. All up, it takes about twenty — yes, twenty — seconds to load a page with a few thumbnails on it.
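If you want to tally this up for your own logs, a minimal Python sketch along these lines should do. It assumes Squid's native access.log layout as seen above (timestamp, elapsed milliseconds, client, result code/status, bytes, method, URL, and so on); the Entry type and function names are hypothetical. Note that the summed elapsed time is not wall-clock time, because requests can overlap.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Tuple


class Entry(NamedTuple):
    elapsed_ms: int
    result_code: str
    method: str
    url: str


def parse_line(line: str) -> Entry:
    """Parse one whitespace-separated line in Squid's native log layout."""
    fields = line.split()
    # fields[3] looks like "TCP_MISS/200"; keep only the cache result code.
    result_code = fields[3].split("/")[0]
    return Entry(int(fields[1]), result_code, fields[5], fields[6])


def summarise(lines: Iterable[str]) -> Tuple[Counter, int]:
    """Count cache result codes and sum the per-request elapsed times."""
    entries = [parse_line(line) for line in lines if line.strip()]
    counts = Counter(entry.result_code for entry in entries)
    total_elapsed_ms = sum(entry.elapsed_ms for entry in entries)
    return counts, total_elapsed_ms


# Two lines copied from the log excerpts above.
SAMPLE = [
    "1275275765.332 1044 192.168.1.19 TCP_MISS/200 4157 CONNECT "
    "ssm.internet.sony.tv:443 - DIRECT/64.37.180.11 -",
    "1275275767.456 905 192.168.1.19 TCP_CLIENT_REFRESH_MISS/200 4139 GET "
    "http://ssm.internet.sony.tv/BIVL/icons/service_23/sub_2/h.png "
    "- DIRECT/64.37.180.11 image/png",
]

if __name__ == "__main__":
    counts, total = summarise(SAMPLE)
    print(dict(counts), total)
```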
It’s not just the TV’s fault, to be fair; the content providers are providing things like this — which as per above takes nearly a second to load from Australia, and is uncacheable. Looking at it in REDbot, we see why; they send Cache-Control: private, meaning that shared caches like mine can’t store this static, non-personalised image.
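As a much-simplified illustration of that rule, the Python sketch below checks only the two response directives that stop a shared cache from storing a response outright: private and no-store. It ignores the rest of the HTTP caching rules (status codes, Authorization, explicit freshness and so on), and its naive comma splitting would mishandle quoted arguments that contain commas. The function names are mine, not part of any library.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control header value into lower-cased directives.

    Simplification: splits on every comma, so quoted arguments that
    themselves contain commas are not handled.
    """
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def shared_cache_may_store(cache_control: str) -> bool:
    """True unless the response says 'private' or 'no-store'."""
    directives = parse_cache_control(cache_control)
    return "private" not in directives and "no-store" not in directives


if __name__ == "__main__":
    print(shared_cache_may_store("private"))
    print(shared_cache_may_store("public, max-age=3600"))
```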
It’s possible to fix some of these problems in Squid, luckily. The recipe I have so far is:
# Sony Bravia doesn't like the Web.
refresh_pattern ^http://ssm\.internet\.sony\.tv/BIVL/icons/ 2880 50% 10800 ignore-reload
refresh_pattern ^http://brevia\.condenet\.com/.*\.jpg 2880 50% 10800 ignore-reload
refresh_pattern ^http://treb\.internet\.sony\.tv/content/thumbs/ 2880 50% 10800 ignore-reload
refresh_pattern ^http://www\.videodetective\.net/utils/dynamicthumb\.aspx 2880 50% 10800 ignore-private override-expire ignore-reload
refresh_pattern ^http://images\.onnetworks\.com/ 2880 50% 10800 ignore-reload
Here, you can see the ignore-reload option which tells Squid to ignore the TV’s Pragma: no-cache, as well as ignore-private for the videodetective URLs.
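To sanity-check which rule a given URL would hit, here is a small Python sketch. It relies on two things about refresh_pattern: the first matching pattern wins, and the min and max values are in minutes. The rule table copies two patterns from the recipe above with every literal dot escaped; the helper name and the returned dictionary shape are purely illustrative, and Python's re module stands in for Squid's own regex matching.

```python
import re
from typing import Optional

# (pattern, min_minutes, percent, max_minutes, options), in config order.
RULES = [
    (r"^http://ssm\.internet\.sony\.tv/BIVL/icons/",
     2880, 50, 10800, {"ignore-reload"}),
    (r"^http://www\.videodetective\.net/utils/dynamicthumb\.aspx",
     2880, 50, 10800, {"ignore-private", "override-expire", "ignore-reload"}),
]


def first_match(url: str) -> Optional[dict]:
    """Return the first rule whose pattern matches, mimicking first-match-wins."""
    for pattern, min_minutes, percent, max_minutes, options in RULES:
        if re.search(pattern, url):
            return {
                "pattern": pattern,
                "min_hours": min_minutes / 60,
                "percent": percent,
                "max_hours": max_minutes / 60,
                "options": options,
            }
    return None


if __name__ == "__main__":
    url = ("http://www.videodetective.net/utils/dynamicthumb.aspx"
           "?filepath=6747/28341022_.jpg&width=128&height=96")
    print(first_match(url))
```

So 2880 works out to a minimum of two days and 10800 to a maximum of seven and a half days for anything these rules match.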
Of course, I shouldn't have to do this, and to really improve things, Sony needs to use something other than that dance of CONNECTs to view content; making multiple serialised SSL connections from halfway around the world is just not a good user experience.
Thursday, 6 May 2010
On a bit of a roll, RFC5861: HTTP Stale Controls has (finally) been published as an Informational RFC.
As discussed before in “Two HTTP Caching Extensions,” these are very useful ways to hide latency and errors from your end users. While they’re most useful in HTTP gateway caches (a.k.a. reverse proxy caches / accelerators), very latency-sensitive sites might find them useful as well when working with “normal” proxy caches.
Both extensions, stale-while-revalidate and stale-if-error, are implemented in Squid 2.7. Not only does Squid respect both response Cache-Control directives, but it also allows you to tweak its behaviour using the stale-while-revalidate and max-stale refresh_pattern options. Squid 3.2 should have them when it’s released, and I understand that Apache Traffic Server will have stale-while-revalidate available soon as well.
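For a feel of how the two directives interact, here is a minimal Python sketch of the decision a cache might make for a stored response. It is a simplification under stated assumptions, not Squid's actual logic: the function name and return strings are made up, freshness comes only from max-age, and origin_error stands for the outcome of an attempt to reach the origin.

```python
def cache_decision(
    age: int,
    max_age: int,
    stale_while_revalidate: int = 0,
    stale_if_error: int = 0,
    origin_error: bool = False,
) -> str:
    """Decide what to do with a stored response (all times in seconds)."""
    if age < max_age:
        return "fresh"
    staleness = age - max_age
    if staleness < stale_while_revalidate:
        # Serve the stale copy right away and refresh it in the background.
        return "serve-stale-and-revalidate-in-background"
    if origin_error and staleness < stale_if_error:
        # The origin could not be reached; hide the error from the user.
        return "serve-stale-on-error"
    return "revalidate-before-serving"


if __name__ == "__main__":
    # Cache-Control: max-age=600, stale-while-revalidate=30, stale-if-error=86400
    for age, error in [(100, False), (610, False), (700, False), (700, True)]:
        print(age, error, cache_decision(age, 600, 30, 86400, origin_error=error))
```

With those example values, a response that is 610 seconds old is still served immediately while the cache refreshes it, and one that is 700 seconds old is only served stale if the origin turns out to be unavailable.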