Intermediaries are often used to scale the Web infrastructure, offering improved performance, availability and scalability of services. This paper outlines an approach to scaling Web Services (services exposed by XML messaging solutions like SOAP [SOAP] and XML Protocol [XMLP]) through optimisation in intermediaries, and proposes further work which leverages XML Protocol's features to help scale them and improve performance.
As the Web has grown, there has been an increasing need to address issues such as the scalability (the ability to handle growth in load efficiently), performance (as measured by the end users' perceived latency) and availability of Web sites. Intermediary solutions to these problems have been used for some time; first as caching proxies, and later in Content Delivery Networks, which use surrogates (intermediaries which act on behalf of the server, rather than the client) dispersed widely over the network to provide service.
HTTP intermediaries rely on certain aspects of the requests and responses flowing through them to offer these benefits. For example, caching works because requests can be easily identified (through the Request-URI and a few headers), and the responses associated with them can be reused according to HTTP's cache coherence rules.
While Web Services use lower-layer intermediaries (such as HTTP proxies and surrogates or SMTP relays) for transit, those devices' optimization mechanisms aren't aware of the semantics of a Web Services message, and therefore can't fully exploit them.
For example, while it's possible to describe cacheability for a Web Service's underlying HTTP response (e.g., with a Cache-Control or Expires header), this is only a basic means of providing hints about the message. In the best case, an HTTP cache would be indexed on the entire request as a string, meaning that a trivial change in the request's XML would cause it to be indexed as a separate entity.
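The indexing problem can be illustrated with a minimal sketch. It assumes a cache that hashes the request body to form its index, and uses the Python standard library's XML canonicalisation (Python 3.8 or later) as one possible way to make equivalent requests collide. The `getQuote` request bodies and both function names are hypothetical, chosen only for illustration.

```python
import hashlib
import xml.etree.ElementTree as ET


def raw_key(body: str) -> str:
    """Index on the request body exactly as sent, as a string-based cache would."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def canonical_key(body: str) -> str:
    """Index on a canonical form, so that equivalent XML maps to one entry."""
    canonical = ET.canonicalize(body, strip_text=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical request bodies: same meaning, different serialisation
# (attribute order, spacing, and empty-element syntax all differ).
request_a = '<getQuote symbol="IBM" currency="USD"/>'
request_b = '<getQuote  currency="USD" symbol="IBM"></getQuote>'

same_raw = raw_key(request_a) == raw_key(request_b)
same_canonical = canonical_key(request_a) == canonical_key(request_b)
```

Here the string-based index treats the two requests as separate entities, while the XML-aware index treats them as one. Canonicalisation addresses only serialisation differences; it does not by itself tell the cache which parts of a request are semantically significant.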
This means that, by default, only the simplest Web Services can leverage intermediaries for optimization in very limited ways.
To allow intermediaries to more fully optimise Web Services, such optimisations need to be standardized, just as cacheability information for lower-layer HTTP entities is standardized in HTTP. This will allow intermediaries to optimise a broad range of Web Services without executing service-specific code.
This paper outlines potential use cases for Web Service intermediary optimization, proposes a framework for optimization, and identifies potential optimization techniques.
Although tentative, these use cases help illustrate the potential of optimizing Web Service intermediaries.
The following sections describe a proposal for a Web Service optimisation framework. It consists of optimisation techniques, each of which describes a way to exploit a particular behaviour to improve service. Techniques can be targeted at particular parts of messages through optimisation scoping, and are often invoked by arbitrary events, which we term optimisation triggers.
This proposal is based on the following requirements:
We envision using the techniques, scoping and triggers together to form an XML Protocol Optimisation Language. Here, techniques, triggers and scoping mechanisms are presented as modular concepts. While we hope to keep the language as modular, and therefore expressive, as possible, this goal must be balanced against the requirement for ease of use. As a result, the actual language might not allow direct access to them, but instead introduce them in 'packages' of functionality.
XML offers an ideal way to control the scope of optimisation to portions of a message, because there are a number of ways to associate hints with a particular XML element or hierarchical group of elements (up to the scope of the entire message).
The most obvious means is through use of attributes in a separate XML Namespace in the document itself. For example, if an element 'foo' and its children are cacheable, it could be expressed as
<foo opt:invalidate="yes" opt:delta="5m"> ... </foo>
or
<opt:cache invalidate="yes" delta="5m"><foo>...</foo></opt:cache>
However, this requires the intermediary to strip the elements targeted at it, and requires any processors that handle the message before it to ignore the opt namespace. As a result, it may often be preferable to describe optimisation outside of the document. This may be done in a WSDL file, in the XML Schema or TREX description of the message, or in an XML Protocol header block in the message, using XPath to identify the elements concerned.
The scoping mechanism used in a service depends on many aspects, including whether the technique's application is static (and can therefore be stated in advance) or dynamic (and therefore must be expressed in-message). Additionally, aspects of a service's deployment, including the nature of the optimising intermediaries, may influence the scoping mechanism used.
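As a rough illustration of out-of-document scoping, the sketch below keeps the hints in a separate table keyed by path expressions and applies them to a message. The rule table, the hint names and the message are all hypothetical; no syntax for such descriptions has been defined. Python's ElementTree supports only a subset of XPath, which is sufficient here.

```python
import xml.etree.ElementTree as ET

# Hypothetical out-of-message scope description: path expression -> hints.
SCOPE_RULES = {
    ".//quote": {"technique": "cache", "delta": "5m"},
    ".//account": {"technique": "none"},
}


def scoped_hints(message, rules=SCOPE_RULES):
    """Return (tag, hints) pairs for every element selected by a rule."""
    root = ET.fromstring(message)
    matches = []
    for path, hints in rules.items():
        for element in root.findall(path):
            matches.append((element.tag, hints))
    return matches


message = "<response><quote>101.5</quote><account>42</account></response>"
hints = scoped_hints(message)
```

Because the message itself carries no hints, nothing needs to be stripped by the intermediary or ignored by other processors.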
Many techniques require a description of when to apply them. For example, a cached object needs some event to invalidate it; similarly, a message store needs to be told when it is appropriate to forward a message. To accommodate this, a variety of trigger mechanisms could be defined.
Some applications may find it useful to combine triggers; for example, 'five minutes after a message containing the "action" element arrives'. As a result, the syntax should support arbitrary combinations of triggers as well as simple trigger events.
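One way to picture combinable triggers is as small objects that wrap one another. The following sketch models the example above; the class names, the `observe`/`fired` interface and the use of plain numbers for time (in seconds) are assumptions made for illustration, not a proposed syntax.

```python
import xml.etree.ElementTree as ET


class MessageContains:
    """Fires once a message containing the given element has arrived."""

    def __init__(self, tag):
        self.tag = tag
        self.fired_at = None  # arrival time of the first matching message

    def observe(self, message, now):
        root = ET.fromstring(message)
        if self.fired_at is None and next(root.iter(self.tag), None) is not None:
            self.fired_at = now

    def fired(self, now):
        return self.fired_at is not None and now >= self.fired_at


class After:
    """Fires a fixed number of seconds after the wrapped message trigger."""

    def __init__(self, inner, delay_seconds):
        self.inner = inner
        self.delay_seconds = delay_seconds

    def observe(self, message, now):
        self.inner.observe(message, now)

    def fired(self, now):
        start = self.inner.fired_at
        return start is not None and now >= start + self.delay_seconds


# 'Five minutes after a message containing the "action" element arrives.'
trigger = After(MessageContains("action"), delay_seconds=300)
trigger.observe("<msg><other/></msg>", now=0)      # no match, nothing starts
trigger.observe("<msg><action/></msg>", now=1000)  # match, the delay starts
```

With these inputs the combined trigger has not fired at time 1299 and has fired at time 1300. Further combinators (for instance, firing when any one of several triggers fires) could follow the same pattern.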
There is a rich history of optimisation techniques in protocol design and computer science in general that we can draw from. Here, we attempt to separate them into general mechanisms that may be combined to give services more powerful and exact control over how service intermediaries handle their messages. This list draws primarily from techniques used in HTTP, which in turn benefitted from experience in distributed filesystems [DFSScale].
Allowing clients to keep and reuse copies of entities yields efficiencies, either by avoiding data transfer or by avoiding a round-trip to the server altogether. Caching techniques rely on locality in usage patterns; that is, the likelihood that portions of messages can be reused.
To be able to reuse an entity, a cache must understand the conditions under which it is appropriate to do so. Cache indexing defines the profile of request semantics in which a particular response may be reused. The most obvious way to index a cache is based upon Services' URIs, as HTTP does. This provides a namespace for cache lookups to be performed in.
For more complex applications, it may be necessary to modify the cache index depending on other attributes. For example, HTTP allows the 'Vary' response header to specify which additional request headers should be used to index the cache, allowing objects with separate language attributes to be stored under the same URI. This content negotiation feature is crude in HTTP, but could be much more expressive using XML.
Conversely, there may be situations where a Service URI-based cache index may be too restrictive; it may be useful to expand the scope of the cache index to include multiple resources, to allow entities to be reused across services. To accommodate these situations, it should be possible to declare a 'virtual' cache index that different resources can interact with.
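Both adjustments to the cache index can be sketched in one function. This is an illustration under assumptions: the function name, the idea of listing the significant request elements as paths, and the example URIs and messages are all hypothetical.

```python
import xml.etree.ElementTree as ET


def cache_index(index_name, request_xml, vary_paths=()):
    """Build a cache key from an index name plus selected request elements.

    index_name is normally the Service URI; passing a shared name instead
    gives several Services one 'virtual' index.
    """
    root = ET.fromstring(request_xml)
    parts = [index_name]
    for path in vary_paths:
        # Only the listed elements contribute to the key, much as 'Vary'
        # names the request headers that matter in HTTP.
        parts.append((path, (root.findtext(path) or "").strip()))
    return tuple(parts)


english = "<getQuote><symbol>IBM</symbol><lang>en</lang><id>1</id></getQuote>"
repeat = "<getQuote><symbol>IBM</symbol><lang>en</lang><id>2</id></getQuote>"
french = "<getQuote><symbol>IBM</symbol><lang>fr</lang><id>3</id></getQuote>"
vary = ("symbol", "lang")
```

With this index, `english` and `repeat` share a key even though their `id` elements differ, while `french` is stored separately. Two Services that both pass the same shared `index_name` would be able to reuse each other's entries.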
Furthermore, a Service must have some control over the entities stored in a caching service intermediary. Cache coherence mechanisms provide this, typically through the use of validation (actively checking to see whether an entity should be reused) and invalidation (marking the content as 'stale' based on some trigger event).
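The interplay of validation and invalidation might look like the following minimal sketch. The class, its method names and the use of a version string as the validator are assumptions for illustration; in practice the validator call would be a message exchange with the Service.

```python
class CoherentCache:
    """A toy cache combining invalidation and validation."""

    def __init__(self, validator):
        # validator: callable mapping a key to the Service's current version.
        self._validator = validator
        self._entries = {}

    def store(self, key, entity, version):
        self._entries[key] = {"entity": entity, "version": version, "stale": False}

    def invalidate(self, key):
        """Invalidation: mark an entry stale in response to a trigger event."""
        if key in self._entries:
            self._entries[key]["stale"] = True

    def lookup(self, key):
        """Return a reusable entity, validating stale entries first."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        if entry["stale"]:
            # Validation: ask whether the stored copy is still current.
            if self._validator(key) != entry["version"]:
                del self._entries[key]
                return None
            entry["stale"] = False
        return entry["entity"]


origin_versions = {"quote": "v1"}  # hypothetical state held by the Service
cache = CoherentCache(lambda key: origin_versions[key])
cache.store("quote", "<quote>101.5</quote>", "v1")
```

A stale entry that validates successfully is reused without retransmitting the entity; one that fails validation is discarded, so the next request must go to the Service.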
Some Services consist of the submission of a message as the request, and a brief acknowledgement as a response, in a manner similar to SMTP's store-and-forward pattern. Standardization of an acknowledgement message would allow intermediaries to take responsibility for handling requests whilst immediately acknowledging them. In combination with caching and other techniques, store-and-forward allows intermediaries to improve service reliability substantially, by making it possible to have multiple, redundant points of contact for message submission, with the possibility for performance improvement through client/intermediary locality.
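A store-and-forward intermediary can be sketched as a queue in front of the Service. The `<ack/>` message stands in for the standardized acknowledgement, which does not yet exist, and the `deliver` callable is a hypothetical placeholder for whatever transport binding is in use.

```python
from collections import deque


class StoreAndForward:
    """Accepts messages on behalf of a Service and forwards them later."""

    def __init__(self, deliver):
        # deliver: callable that sends one message to the Service and
        # returns True on success, False otherwise.
        self._deliver = deliver
        self._queue = deque()

    def submit(self, message):
        """Store the message and acknowledge at once, without contacting the Service."""
        self._queue.append(message)
        return "<ack/>"  # hypothetical stand-in for a standardized acknowledgement

    def forward(self):
        """Deliver stored messages in order; keep any that cannot be sent yet."""
        delivered = 0
        while self._queue:
            if not self._deliver(self._queue[0]):
                break
            self._queue.popleft()
            delivered += 1
        return delivered


received = []
service_up = [False]  # mutable flag simulating Service availability


def deliver(message):
    if service_up[0]:
        received.append(message)
        return True
    return False


relay = StoreAndForward(deliver)
```

In this sketch the client receives its acknowledgement even while the Service is unreachable; the messages are delivered in order once it returns. A real intermediary would also need durable storage before acknowledging, which is omitted here.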
Often, it is only necessary to transmit part of a message. For example, a server may only need to update part of a cache's stored message [Delta], or it might be desirable to store the bulk of a message, while forwarding a smaller part immediately.
To accommodate this, partial content techniques allow specification of what parts of a message should be sent.
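As an illustration of the first case, the sketch below applies a partial message to a cached one by replacing only the child elements that the partial message contains. The function name and the matching rule (children matched by tag name under the root) are simplifying assumptions; this is not the delta encoding described in [Delta].

```python
import xml.etree.ElementTree as ET


def apply_partial(cached_xml, partial_xml):
    """Replace the children of the cached message that the partial message names."""
    cached = ET.fromstring(cached_xml)
    partial = ET.fromstring(partial_xml)
    for new_child in partial:
        old_child = cached.find(new_child.tag)
        if old_child is None:
            # The cached copy has no such element yet, so add it.
            cached.append(new_child)
        else:
            # Swap the element in place, preserving document order.
            index = list(cached).index(old_child)
            cached.remove(old_child)
            cached.insert(index, new_child)
    return ET.tostring(cached, encoding="unicode")


cached = "<quote><symbol>IBM</symbol><price>101.5</price></quote>"
partial = "<quote><price>102.0</price></quote>"
updated = apply_partial(cached, partial)
```

Only the `price` element crosses the network; the `symbol` element is reused from the cached copy.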
In some situations, intermediaries need to send or receive a number of separate messages to or from a particular device. Although some transport bindings may make it possible to reuse a network connection for these messages, further processing efficiencies might be realised by their combination into a single message. For example, it might be desirable to send all store-and-forward messages for a Service at once, wrapping all of them in a master message that uses an encryption module to protect them. If used across an HTTP binding, this approach avoids the overhead of separately encrypting the messages and then submitting each one and waiting for a response to indicate success.
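Combining messages can be sketched as wrapping them in a single parent element. The `bundle` wrapper element and both function names are hypothetical, and the encryption step mentioned above is deliberately left out; it would be applied once to the master message rather than to each member.

```python
import xml.etree.ElementTree as ET


def bundle(messages, wrapper_tag="bundle"):
    """Wrap several messages in one master message."""
    root = ET.Element(wrapper_tag)
    for message in messages:
        root.append(ET.fromstring(message))
    return ET.tostring(root, encoding="unicode")


def unbundle(master):
    """Recover the individual messages from a master message."""
    return [ET.tostring(child, encoding="unicode") for child in ET.fromstring(master)]


queued = ["<order>1</order>", "<order>2</order>", "<order>3</order>"]
master = bundle(queued)
```

The three queued messages then travel as one request, with one response to wait for, and `unbundle(master)` returns them in their original order at the receiving end.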
Similarly, there may be situations where it is advantageous to 'piggyback' additional information for the intermediary onto responses. Piggyback validation techniques have previously been examined in HTTP [Piggyback], and such techniques could also be used with service intermediaries to pre-fill the cache, bundle invalidations, and perform other tasks.
This paper has outlined areas of research regarding optimising mechanisms in service intermediaries; they are intended as a discussion point only. Hopefully, they will generate interest in standardization of such techniques, development of a framework for their use, and integration into Web Service toolkits and products.
SOAP - D. Box et al. "Simple Object Access Protocol (SOAP) 1.1". May 2000.
XMLP - W3C XML Protocol Working Group
DFSScale - M. Satyanarayanan. "The Influence of Scale on Distributed File System Design". In IEEE Transactions on Software Engineering, January 1992.
Piggyback - Balachander Krishnamurthy and Craig E. Wills. "Piggyback Server Invalidation for Proxy Cache Coherency". In Proceedings of the Seventh International World Wide Web Conference, Brisbane, Australia, April 1998.
Delta - J. Mogul et al. "Delta encoding in HTTP". October 2000.
Version: 1.2 - April 17, 2001