mnot’s blog

“Design depends largely on constraints.” — Charles Eames

Monday, 9 May 2005

Greasemonkey and the Web

There are a lot of cool apps emerging for GreaseMonkey (and GreaseMonkIE and PithHelmet, for IE and Safari respectively). It seems like these extensions have a love/hate relationship with the Web, philosophically.

On the one hand (with L O V E sprawled across the knuckles), GM is a great example of taking advantage of the representational nature of the Web. SOAP-heads would call this “loose coupling” or “document-oriented.” The fact that you can rock up and modify somebody else’s content to suit your purposes and/or tastes is a direct result of this.

On the other hand (the one bearing H A T E on its phalanges), GM is a browser plug-in that you have to download and install, and then you need to download (or write) extensions and configure them for your browser. This means that if you use another browser or machine, you either have to go through it all again, or lose your modifications. If you want to share what you do with someone else, you have to hold their hand through this entire process.

In other words, the Web was built to keep as little client-side state as possible. Keeping GreaseMonkey and all of the scripts people use up-to-date is a management nightmare that the Web solved a long time ago.

So, while GM is fine for geeks (who are always willing to install more software), it doesn’t work too well for normal people. URIs and browsers are built to make it easy and fuss-free to get what you want; downloading multiple pieces of software and cooking up a recipe to get them to act like you want seems more reminiscent of FTP than the Web.

A More Web-like GreaseMonkey

The obvious Web-like way to interpose services is through intermediaries (proxies and gateways).

A GreaseProxy would act like a normal HTTP proxy; you’d configure it in your browser and then forget it. Using HTTP proxy authentication, it would remember who you are, and what extensions/modifications you’ve requested (through a configuration page on the same server as the proxy), modifying your HTTP traffic as it went by.
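
As a rough illustration (a sketch, not a working proxy), the Python below pulls the user name out of a Basic Proxy-Authorization header and adds that user's chosen scripts to an HTML response. The USER_SCRIPTS table, the function names and the greaseproxy.example URL are all made up for the example. Appending script tags is also a simplification of what GreaseMonkey itself does in the browser.

```python
import base64
import html
import re

# Hypothetical per-user configuration, as the proxy's configuration
# page might save it: user name -> script URLs to inject.
USER_SCRIPTS = {
    "alice": ["http://greaseproxy.example/scripts/linkify.user.js"],
}

BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def user_from_proxy_auth(header_value):
    """Return the user name from a Basic Proxy-Authorization header, or None."""
    if not header_value:
        return None
    scheme, _, credentials = header_value.partition(" ")
    if scheme.lower() != "basic":
        return None
    try:
        decoded = base64.b64decode(credentials.strip()).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return None
    user, sep, _password = decoded.partition(":")
    return user if sep else None


def inject_scripts(page, script_urls):
    """Insert one <script> tag per URL just before the last </body>."""
    tags = "".join(
        '<script src="%s"></script>' % html.escape(url, quote=True)
        for url in script_urls
    )
    matches = list(BODY_CLOSE.finditer(page))
    if not matches:
        return page + tags  # no </body>: append at the end
    index = matches[-1].start()
    return page[:index] + tags + page[index:]


def rewrite_response(proxy_auth_header, content_type, page):
    """Apply the requesting user's scripts to an HTML response body."""
    user = user_from_proxy_auth(proxy_auth_header)
    scripts = USER_SCRIPTS.get(user, [])
    if not scripts or not content_type.startswith("text/html"):
        return page  # unknown user or non-HTML content: pass through
    return inject_scripts(page, scripts)
```

A real proxy would also have to check the password, handle compressed and chunked bodies, and fix up Content-Length after changing the page; none of that is shown here.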

This would work for anyone who doesn’t need a proxy* to access the Web. Imagine selecting from a growing collection of scripts on a Web page, and having that available wherever you are.

For casual modification of pages, a GreaseGateway will do the trick, much in the same fashion that CritSuite and countless other services work today; a central Web page with a list of available services and maybe a bookmarklet would make it easy to use.
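
The one extra job a gateway has is rewriting links so that the next click also goes through it. Here is a minimal Python sketch of that step, assuming a made-up gateway address (greasegateway.example) that takes the target in a url query parameter. It only handles double-quoted href and src attributes with a regular expression; a serious gateway would use a proper HTML parser.

```python
import re
from urllib.parse import quote, urljoin

# Hypothetical gateway endpoint; the target URL travels in a query parameter.
GATEWAY = "http://greasegateway.example/fetch?url="

# Matches href="..." and src="..." (double-quoted values only).
ATTR = re.compile(r'(?<![\w-])(href|src)="([^"]*)"', re.IGNORECASE)


def through_gateway(base_url, link):
    """Resolve link against the page URL and wrap it in a gateway URL."""
    absolute = urljoin(base_url, link)
    return GATEWAY + quote(absolute, safe="")


def rewrite_links(base_url, page):
    """Rewrite http(s) links in page so they are fetched via the gateway."""
    def replace(match):
        attr, link = match.group(1), match.group(2)
        absolute = urljoin(base_url, link)
        if not absolute.startswith(("http://", "https://")):
            return match.group(0)  # leave mailto:, javascript:, etc. alone
        return '%s="%s"' % (attr, through_gateway(base_url, link))
    return ATTR.sub(replace, page)
```

For example, a relative link on http://example.org/a/b.html is first resolved against that page and then percent-encoded into the gateway URL, so the gateway can recover the original target on the next request.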

The great thing about this is that GreaseProxies could be deployed in a company, on the open Internet for a fee, and even on your own machine for when you’re on the road; you’d be able to share services between colleagues, friends and your different browsers. For example, a company could deploy a proxy that allowed employees to mark up Web pages, CritSuite-style.

Probably the biggest barrier to such things getting adoption is that proxies and gateways aren’t exactly developer-friendly; it’s a lot easier to fiddle with a browser extension when you’re getting something right, and that’s why GreaseMonkey has so much interest. That’s why GreaseProxies and GreaseGateways should accept an existing GreaseMonkey script with no changes, so they can leverage GM’s popularity and give us the best of both worlds.
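
Accepting unmodified scripts means, at the very least, reading the metadata comment block at the top of a user script to find out which pages it applies to. The Python below is a minimal sketch of that, handling only @include and @exclude patterns with * as a wildcard. The function names are invented here, and treating a script with no @include as matching every page is my assumption about GreaseMonkey's behaviour.

```python
import re

META_LINE = re.compile(r"^//\s*@(\S+)\s+(.*?)\s*$")


def parse_metadata(script_source):
    """Collect @key value pairs from the ==UserScript== comment block."""
    metadata = {}
    in_block = False
    for line in script_source.splitlines():
        stripped = line.strip()
        if stripped.startswith("// ==UserScript=="):
            in_block = True
        elif stripped.startswith("// ==/UserScript=="):
            break
        elif in_block:
            match = META_LINE.match(stripped)
            if match:
                metadata.setdefault(match.group(1), []).append(match.group(2))
    return metadata


def glob_matches(pattern, url):
    """Match url against a pattern in which only * is a wildcard."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, url) is not None


def applies_to(metadata, url):
    """True if url matches an @include pattern and no @exclude pattern."""
    includes = metadata.get("include", ["*"])  # assumed default: all pages
    excludes = metadata.get("exclude", [])
    if any(glob_matches(pattern, url) for pattern in excludes):
        return False
    return any(glob_matches(pattern, url) for pattern in includes)
```

The proxy would run applies_to against the URL of each HTML response to decide which of a user's scripts to apply.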

Anybody want to start coding?

* People Who Know tell me that the majority of proxies in the world today are sold to be used “transparently” (that is, without browser configuration), so this is an increasingly small problem. Someone who does need a proxy could either chain a GreaseProxy in front of the access proxy, or use a GreaseGateway.


Filed under: Web

20 Comments

Paul Downey said:

It would be cool to have a proxy which ran greasemonkey scripts, especially ones of my choosing, not any old script imposed upon me. The possibilities for phishing and spamming would be scary. An alternative could be for other browsers to implement GreaseMonkey - then the first thing you do in an internet cafe would be to provide the uri of your profile containing your personal selection of scripts. Hmm .. not sure I like that either ..

I notice Mark Pilgrim has a new book:
http://diveintogreasemonkey.org/

Wednesday, May 11 2005 at 2:37 PM +10:00

Jason said:

I gotta say, the idea of a GreaseProxy is just plain awesome. I've always imagined that an extensible platform for midstream content enhancement would catch on quickly; I remember coding my own web-based proxy about six years ago (you know, there's a URL field that you use to type in the address of the site you want to see, it loads and parses it, rewrites all the links so that when clicked, the same web-based service grabs the content, and then displays the whole thing in the browser window).

Watch out, though, lest the content-is-sacred assclowns start attacking you for modifying content in the browser!

Thursday, May 12 2005 at 12:59 PM +10:00

Adrian Holovaty said:

Interesting idea! I don't think it's realistic, though, to leverage existing Greasemonkey user scripts, because they're written in JavaScript and modify the DOM based on Firefox's DOM implementation -- not by doing searches-and-replacements on the raw HTML of the page, like traditional Web proxies do (such as, probably, the one Jason alluded to).

Your proxy would somehow have to get low-level access to Firefox's DOM implementation. Seems like that would involve a lot of voodoo magic. On second thought, sounds like a fun challenge! ;-)

If you're concerned about how much of a pain in the butt it is to get people to use Greasemonkey scripts (and I share your concern), may I suggest the Greasemonkey compiler? http://www.letitblog.com/greasemonkey-compiler/

Friday, May 13 2005 at 2:11 PM +10:00

l.m.orchard said:

I wonder how far this could go toward helping build a proxy with a built-in DOM and JavaScript interpreter:

http://wwwsearch.sourceforge.net/python-spidermonkey/

Maybe pull in a little Twisted for the proxy, build some compatible facsimile of a browser DOM...

Friday, May 13 2005 at 5:56 PM +10:00

Ian Bicking said:

Doesn't the proxy just have to add a couple script statements and remap select URLs so the Javascript can appear to come from the same host as the original page? It all just seems to be an effort to deal with the Javascript cross-domain security issue, rather than avoiding Javascript altogether.

Friday, May 13 2005 at 11:32 PM +10:00

Mark Nottingham said:

Ian —

That’s what I’m hoping. Note that if it’s a true proxy, I don’t think it needs to remap URLs at all (because it’ll look like the same Web site, from the Browser’s point of view); it’s only when it’s a gateway where you’ll have to be careful about rewriting URLs (which is a fairly well-understood thing; witness all of the existing rewriting gateways).

Saturday, May 14 2005 at 5:56 PM +10:00

Larry Underhill said:

The problem with proxies (err, the well known proxies at least) is that content providers have a vested interest in blocking them. If I am Amazon and I know that a finite number of proxies are running GM scripts to add relevant links to competitor sites, I will block those IP addrs (and hey, the Amazon techies are good -- they will figure this out).

I can easily imagine a blacklist (akin to what those poor bastards who sys-admin email servers use) that content providers subscribe to and that can even automate this procedure.

/me ducks. Ignores the frantic protests of those who speak of the mutant children of Tor and Greasemonkey.... :)

Tuesday, May 17 2005 at 7:05 PM +10:00

Mark Nottingham said:

Larry —

You really think so? I think Amazon wouldn’t dare; they’d be blocking potential customers (both those that use an Amazon-mutating script, and others). If this were a problem, I’d think they’d already be blocking Mozilla itself, because it also has the potential for changing their pages.

Wednesday, May 18 2005 at 9:13 AM +10:00

Mark said:

> Doesn't the proxy just have to add a couple script statements and remap select URLs

No, that's not what Greasemonkey does. It injects the scripts one at a time and executes each in turn, and removes each of them from the page immediately after executing.

Also, Greasemonkey provides its own API of functions that user scripts can call to do things that ordinary Javascript can not do. For example, GM_xmlhttpRequest can retrieve data from any URL, even other sites. This is how all of the "mash-up" scripts work, like the ones that add data points to Google Maps. Unprivileged Javascript can only make requests to the same site (via the XMLHttpRequest object).

Note that I am not disagreeing with the original idea. Having a GreaseProxy would be extraordinarily useful to me. It's just more complicated than you think.

Friday, May 20 2005 at 4:33 PM +10:00

Ken Meltsner said:

Been there, done that, to some extent. I've wanted GreaseProxy ever since I heard of GreaseMonkey, and have followed previous efforts to write smart group-oriented proxies as well -- I put together a DARPA proposal back in 1995 to write a really cool one, for example.

Rewriting proxies and their friends keep getting proposed, some get written, and almost all of them fail to gain the critical mass required to keep going.

I hate to be pessimistic, but here's a short list off the top of my head (and I'm six time zones from where I should be...): PIA from Ricoh, OREO from OSF, Noodles from CollabNet, ActiveProxies from U Wisconsin. There are also a couple of commercial products -- Kapow Robosuite, for example -- but they're usually more toolkits with good HTTP client libs and rewriting features than something end-user friendly like a GreaseProxy should be.

There have been a few somewhat successful rewriting proxy servers, including IBM's transcoding server (mostly for handheld access to existing sites), mod_accessibility and mod_rewritehtml for Apache, the security reverse proxy from Netegrity, and others I can't remember right now. What they have in common is a focus on fixing *one* aspect of Web access, not providing a general capability. Side note: ICAP is sort of a greasemonkeyish approach built on top of HTTP. It's intended for smart caching and virus detection, I think, but it's pretty close to the right idea: provide an easy way to extend an existing proxy server or gateway with functions provided by close-to-vanilla HTTP servers.

Perhaps I'm wrong, of course. One of the reasons I think a GreaseProxy would be cool is that it should use the same scripts as the browser-based GreaseMonkey -- the more users, the more likely it is that it will get the audience it needs to survive. The single biggest issue is security of the scripts. You don't want arbitrary rewrites of pages foisted upon unsuspecting users, which means you need security, or at least enough security to make sure that scripts can't be added without permission.

By the way, thinking about this a bit more, it would be really cool to implement GreaseProxy with ICAP -- you could have multiple GPs to handle high load, and the hooks for ICAP are already available in several Web caching server products.

Sunday, May 22 2005 at 4:57 AM +10:00

Mark Nottingham said:

Thanks for that summary, Ken. I was one of the original team that put ICAP together (Peter Danzig hired me into Akamai). Of course, there are lots of ways to implement an intermediary; ICAP (or its purported successor in OPES, I forget the name now) just gives you hooks into existing engines, which takes some of the load off of your proxy.

Overall, I’m not convinced that more advanced intermediaries are held back because of their generality; in many respects, that market is still very young. We did content assembly — a generic function if there ever was one — at Akamai, with ESI, and while it hasn’t taken the world over (I was actually pretty pessimistic), by all accounts it’s very popular with some very large customers, both of Akamai and Oracle (who came up with it) and now IBM.

Perhaps what’s different here from what you’re talking about is that the end user is able to go to a Web page on the GreaseProxy and configure how it will behave for them; in that way, it isn’t a general function, but a configurable purpose-specific one.

Sunday, May 22 2005 at 9:20 AM +10:00

Tjaard said:

Hmm... the main problem to be tackled is that people wish to use their greasemonkey scripts on multiple machines in an easy way. I don't think that installing an extension is the actual problem (one wishes to customize one's browser profile anyway -- I don't see the difference between configuring a proxy or installing an extension). Synchronisation is. The guys from Adblock Plus enhanced Adblock so that one can subscribe oneself to an arbitrary filter set. It would be easier to implement, everyone can host their scripts and no substantial bandwidth is needed as the demand grows which would be the case with an online proxy. And there'd be no way to tell from the server side that you're using it, so the proxy blacklist problem wouldn't exist either.

Monday, May 23 2005 at 2:04 AM +10:00

Ken Meltsner said:

Coming from a corporate software world, I'll assert that the most common "enterprise" use will be to mandate GM scripts for specific sites. The most common requests we received for one of our Web products were best described as "depersonalization" requests -- removing various configurable options to ensure every user (within a target group) had exactly the same experience.

Monday, May 23 2005 at 2:43 AM +10:00

Aaron said:

Hi, I wrote Greasemonkey.

This is an awesome idea. One thing to note is that GM has extensions to the browser DOM which cannot be accomplished without the client. GM_xmlhttpRequest, for instance, can do cross-domain requests, and GM_setValue and GM_getValue store key/value pairs locally.

So simply executing GM scripts on the server and sending the resultant HTML is not enough.

I really love the idea of sending HTML with the required scripts already embedded, but this becomes more of a centralized configuration than a proxy proper. More like what Tjaard was saying. Still might be worth doing, but the same thing could be accomplished with a "save configuration as zip" button in GM and an ftp server.

Monday, May 23 2005 at 2:03 PM +10:00

Jay Fienberg said:

A different approach: a "G-proxy" plugin in the browser that connects Greasemonkey to a web service that maintains your collection of Greasemonkey scripts and configurations.

In other words, the proxy is for Greasemonkey's script access, separate from the browser's web page access.

An individual or a group of folk then have a central repository of Greasemonkey scripts and configurations their browsers use.

When your browser's Greasemonkey sees scripts and their configuration, it's getting code/config available through the G-proxy plugin.

btw, I think your GreaseProxy idea is great. But, because of the depth within Firefox at which Greasemonkey seems to function, maybe it makes more sense for Greasemonkey to get network savvy in this way.

Monday, May 23 2005 at 5:11 PM +10:00

Mark Nottingham said:

Aaron —

Thanks for your kind comments. My goal would be to do it with an unmodified browser, but I realise that might be asking a lot.

I wonder how much you could do on a proxy with a bit of creativity; e.g., if you embedded a frame that had an embedded GM_xmlhttpRequest function that rewrites the URIs so that they appear like they’re on the same server, while having the proxy un-munge them, putting local values into cookies, etc.

Of course, it’d take a whole lot of digging to see if that’d work, this is just me speculating.
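
To show the kind of munging I mean, here is a small Python sketch. The reserved path prefix and the function names are invented for illustration. One function rewrites a cross-site URI so it appears to live on the page's own host, and the other is what the proxy would use to recover the real target when the request comes past.

```python
from urllib.parse import quote, unquote, urlsplit

# Hypothetical path prefix that the proxy reserves on every site.
PREFIX = "/__greaseproxy__/fetch/"


def munge(page_url, target_url):
    """Rewrite target_url so it appears to live on the page's own origin."""
    page = urlsplit(page_url)
    encoded = quote(target_url, safe="")  # percent-encode everything, including / and ?
    return "%s://%s%s%s" % (page.scheme, page.netloc, PREFIX, encoded)


def unmunge(request_url):
    """Return the real target for a munged URL, or None if it is not one."""
    path = urlsplit(request_url).path
    if not path.startswith(PREFIX):
        return None
    return unquote(path[len(PREFIX):])
```

The script in the page would call munge before making an ordinary same-site XMLHttpRequest; the proxy would call unmunge on every request and fetch the real target when it gets something other than None.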

The other approach would be to profile GM’s API to make it remotely executable; are there any stats on how often these functions are used?

Monday, May 23 2005 at 8:33 PM +10:00

Aaron said:

GM_xmlhttpRequest could definitely be implemented this way with enough hacking.

GM_setValue and GM_getValue would be a little more difficult: they are designed to be fast synchronous calls.

You'd either have to do them as cookies or as asynchronous requests to the server. Both have security and functional implications. Probably nobody who is using GM_setValue is actually expecting the result to be there immediately, but they probably do call it too frequently for web stuff.
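
For the cookie route, a rough Python sketch of what the proxy side might do: keep all the pairs in one JSON blob inside a single cookie. The cookie name and function names are made up, and this ignores the problems mentioned above (cookie size limits, and the fact that cookies are scoped to a site rather than to a script).

```python
import json
from urllib.parse import quote, unquote

COOKIE_NAME = "gm_values"  # hypothetical cookie name


def load_values(cookie_header):
    """Decode the stored key/value pairs from a Cookie request header."""
    for part in (cookie_header or "").split(";"):
        name, _, value = part.strip().partition("=")
        if name == COOKIE_NAME:
            try:
                values = json.loads(unquote(value))
            except ValueError:
                return {}  # damaged cookie: start over with no values
            return values if isinstance(values, dict) else {}
    return {}


def store_value(cookie_header, key, value):
    """Return (values, Set-Cookie value) after applying one GM_setValue call."""
    values = load_values(cookie_header)
    values[key] = value
    encoded = quote(json.dumps(values, sort_keys=True), safe="")
    return values, "%s=%s; Path=/" % (COOKIE_NAME, encoded)
```

GM_getValue then becomes a lookup in load_values(...), and each GM_setValue turns into a new Set-Cookie header on the next response.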

But hey, you could just fall back on documentation: "works best with Greaseproxy" or "doesn't work so well with Greaseproxy".

In general this could totally work. Cool.

Monday, May 23 2005 at 9:07 PM +10:00

bryan said:

why would you not just do something like use privoxy http://www.privoxy.org/faq/configuration.html#AEN404 or any of the other configurable proxies out there?

as it is open source, it could be used as a base for greaseProxy, although I am not sure what additional capabilities a greaseProxy should have.

Saturday, May 28 2005 at 1:26 PM +10:00

_why said:

Lotsa good ideas in here. I've gleaned a lot for use in MouseHole [http://mousehole.rubyforge.org], a GreaseProxy which uses Ruby for its user scripting. The next release will support the GM_API. I just wanted to stray from the GM_API and come up with something unique to avoid being too influenced by it.

Some great scripts are already surfacing for this proxy. Since the proxy allows scripts to mount their own little applications, a lot more can happen on the user's side. Someone just wrote a script that's basically a wiki which stores versioned user scripts. You type the wiki page name into the browser and the user script stored in the wiki runs.

A great point for having the proxy separate from the browser is that you can run the proxy on a central machine and several people can share rewrite scripts. And you don't have to install all your scripts on every machine you're on.

Anyway, if you want a tutorial, the announcement on RedHanded has screenshots. [http://redhanded.hobix.com/inspect/mousehole11InPlainView.html]

Sunday, September 4 2005 at 7:47 PM +10:00

victoria said:

good point but still....aren't there any other options you could take?

Tuesday, December 5 2006 at 11:17 AM +10:00
