The increasing number and complexity of XML-related specifications (e.g., Namespaces, XSLT, Schema, XInclude, XBase) as well as inherent functions of XML (entity resolution and validation) have created the need for an XML processing model, in order to disambiguate the order and depth of processing when applying these mechanisms.
There are already a number of efforts to define a distributed processing model for the Web, including proprietary efforts, embryonic standards efforts (e.g., ESI and OPES), and parts of other W3C projects (e.g., XMLP). This paper considers the XML processing model in the greater context of Web processing models, and proposes one approach to the issues therein.
In this document, we use the term processor to denote a single processing function, as outlined above, as opposed to an XML processor, which may incorporate a number of these atomic processors.
The original motivation for defining a processing model was to normalize the application of W3C-defined mechanisms; it is a very different thing to apply XSLT and then XInclude, as opposed to XInclude and then XSLT. Whilst it would be relatively simple to define a static pipeline for XML processing to solve the issue, this would rob applications of the flexibility inherent in XML, because either order may be equally desirable, depending on an application's architecture.
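The order dependence can be illustrated with a minimal sketch. The two functions below are toy stand-ins, not real XInclude or XSLT implementations: `toy_include` replaces an `xi:include` element from a hypothetical in-memory resource table, and `toy_transform` plays the role of a stylesheet that renames `item` elements to `entry`. The resource URI and element names are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

XI = "http://www.w3.org/2001/XInclude"

# Hypothetical in-memory resources standing in for included documents.
RESOURCES = {"urn:example:frag": "<item>included</item>"}


def toy_include(root):
    """Replace each xi:include element with the resource named by its href."""
    for parent in list(root.iter()):
        for i, child in enumerate(list(parent)):
            if child.tag == f"{{{XI}}}include":
                parent[i] = ET.fromstring(RESOURCES[child.get("href")])
    return root


def toy_transform(root):
    """Stand-in for a stylesheet: rename every <item> to <entry>."""
    for el in root.iter("item"):
        el.tag = "entry"
    return root


SOURCE = (
    f'<doc xmlns:xi="{XI}"><item>local</item>'
    '<xi:include href="urn:example:frag"/></doc>'
)


def run(*steps):
    """Apply the given steps in order and report the resulting child tags."""
    root = ET.fromstring(SOURCE)
    for step in steps:
        root = step(root)
    return [el.tag for el in root]


include_first = run(toy_include, toy_transform)    # ['entry', 'entry']
transform_first = run(toy_transform, toy_include)  # ['entry', 'item']
```

When inclusion runs first, the included content is visible to the transformation; when the transformation runs first, the included content arrives untransformed. Neither result is wrong, which is why a fixed pipeline is too restrictive.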
If indeed we require a flexible processing model, it follows that it should be extensible. XML is still young, and its very nature (the 'X') means that the W3C will in all likelihood continue to define and refine what can be considered 'core' mechanisms for working with it. We would assume that any such mechanisms which are reflected in the XML itself would be disambiguated in the document by XML Namespaces. This leads to the need for a general processing model that is dispatched at least partially by namespaces.
Such a model offers the ability not only to dispatch W3C-defined processors, but also arbitrary, user-defined processors, as long as their syntax is namespace-qualified (when it is reflected in the document). There is evidence that there is considerable demand for a mechanism to accommodate processing in this fashion, as many efforts [JSP][ESI-LANG] are already using namespaces to imply dispatch semantics, even though they are not designed to accommodate this in their current form.
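A minimal sketch of namespace-based dispatch follows, assuming a simple registry that maps namespace URIs to handler functions. The `urn:example:shout` namespace, the handler names, and the registry itself are hypothetical; the XInclude handler is a placeholder that only marks the element rather than performing inclusion.

```python
import xml.etree.ElementTree as ET

XI = "http://www.w3.org/2001/XInclude"
SHOUT = "urn:example:shout"  # hypothetical user-defined namespace


def namespace_of(tag):
    """Return the namespace URI of an ElementTree tag, or None if unqualified."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else None


def handle_shout(element):
    """User-defined processor: upper-case the element's text."""
    element.text = (element.text or "").upper()


def handle_xinclude(element):
    """Placeholder only; a real processor would fetch and merge content."""
    element.set("seen-by", "xinclude-stub")


# The registry is what makes the model extensible: W3C-defined and
# user-defined processors are registered in exactly the same way.
REGISTRY = {XI: handle_xinclude, SHOUT: handle_shout}


def dispatch(root, registry=REGISTRY):
    """Invoke the registered handler for each namespace-qualified element."""
    invoked = []
    for element in root.iter():
        handler = registry.get(namespace_of(element.tag))
        if handler is not None:
            handler(element)
            invoked.append(handler.__name__)
    return invoked


doc = ET.fromstring(
    f'<doc xmlns:xi="{XI}" xmlns:s="{SHOUT}">'
    '<s:p>hello</s:p><xi:include href="urn:example:frag"/><plain/></doc>'
)
invoked = dispatch(doc)
```

Elements without a registered namespace, such as `plain` above, pass through untouched. Note that this sketch dispatches in document order; it does not address the ordering question raised earlier, which still needs separate metadata.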
In such a view, XML Core mechanisms such as the application of XInclude, Schema validation and so forth are considered steps in a processor pipeline that also includes any number of other operations. The operations may be conceptualized as a flow, with two endpoints and a number of possible intermediate states.
Once this is accepted, it becomes clear that a number of issues must be considered;
Increasingly, usage scenarios for XML involve a document being touched by a number of processors during its lifetime. An editor may emit XML, which is stored, then transformed by XSLT, then transmitted to another machine, where it might be processed by a SOAP processor, transformed again, and then sent to a browser as input for rendering.
This highlights the dimensions in which an XML document may be processed; not only may different processors operate in a single operation, but they may be separately invoked upon a document a number of times on a single node, or on different nodes.
This may lead to issues when including processing directives in a document. For example, if an XML Protocol message contains an XInclude directive, it is unclear whether the originating node, an intermediary or the ultimate receiver should apply the XInclude mechanism; each scenario is equally valid, in different situations. While it would be easy to resolve this through the use of a different namespace for the XInclude, there still needs to be a mechanism that directs when to apply it. XML Protocol provides this through the Module mechanism; however, this solution is not suitable for all situations (especially where processing is not hinted by a namespace, such as entity resolution, schema validation, etc.).
Additionally, processing may take place on behalf of different parties; whilst processing on a Web server may be on behalf of the publisher of the XML, for example, it may be re-processed on behalf of an intermediary server (which itself might act on behalf of a number of parties), and again it may be processed on the User-Agent, on behalf of the end user.
These dimensions closely align with those discussed in relation to distributed Web servers and services. The XML Protocol processing model is most familiar to this audience, but other solutions have been suggested; the ESI Architecture document suggests a pipeline model whereby HTTP headers are used to determine the order of processing for an entity, as well as targeting that processing to particular network nodes. Similarly, the proposed IETF OPES WG is attempting to enable the interposition of a processing model into arbitrary protocols by the use of an XML-based ruleset that is separate from the messages that it modifies.
The existence of these efforts illustrates the validity of a wide variety of approaches; some XML processors will be invoked and controlled by in-document hints (as XML Protocol outlines), or by external information (OPES), or again by associated metadata (ESI Architecture). We feel that an XML Processing model would benefit from considering all of these dimensions and use cases.
In this section, we briefly outline a framework for processing that leverages current Web mechanisms extensively. At its core is a processor definition, which enumerates the characteristics and variabilities of a processor, processor metadata, which controls its invocation and behaviour, and metadata sourcing, which enables different means of associating that metadata with a particular document.
A processor can be simply defined as a collection of inputs and outputs; a particular type of processor may place restrictions on the numbers and types of these, and a particular instance of a processor type may have further refinements.
We believe that both the inputs and outputs of a processor should be identifiable (but not necessarily locatable) by a URI. Not only does this enable distributed processing, but it also maintains the URI's status as the keystone of the Web.
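One way to picture such a processor definition is sketched below. The class and field names are hypothetical, and the check that a URI merely has a scheme is an assumption chosen to reflect "identifiable but not necessarily locatable": a `urn:` identifier is accepted even though it cannot be dereferenced.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass(frozen=True)
class ProcessorType:
    """Restrictions a type of processor places on its inputs and outputs."""
    name: str
    min_inputs: int = 1
    max_inputs: int = 1
    max_outputs: int = 1


@dataclass
class ProcessorInstance:
    """A particular use of a processor type, with URI-identified endpoints."""
    type: ProcessorType
    inputs: list = field(default_factory=list)   # URIs identifying inputs
    outputs: list = field(default_factory=list)  # URIs identifying outputs

    def problems(self):
        """Return a list of violations of the type's restrictions."""
        found = []
        t = self.type
        if not t.min_inputs <= len(self.inputs) <= t.max_inputs:
            found.append(f"{t.name}: wrong number of inputs")
        if len(self.outputs) > t.max_outputs:
            found.append(f"{t.name}: too many outputs")
        for uri in self.inputs + self.outputs:
            if not urlsplit(uri).scheme:
                found.append(f"{t.name}: not a URI: {uri!r}")
        return found


# A transformation takes a source and a stylesheet and yields one result.
TRANSFORM = ProcessorType("transform", min_inputs=2, max_inputs=2)

ok = ProcessorInstance(
    TRANSFORM,
    inputs=["http://example.org/doc.xml", "http://example.org/style.xsl"],
    outputs=["urn:example:result"],  # identifiable, not locatable
)
bad = ProcessorInstance(TRANSFORM, inputs=["no-scheme"])
```

Here `ok.problems()` is empty, while `bad.problems()` reports both the wrong input count and the unqualified identifier.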
The behaviour of a processor must be controllable in a flexible and predictable manner. To facilitate this, we propose that there be a taxonomy of metadata defined, with such features as;
Additionally, individual processor definitions might nominate metadata unique to them.
As outlined above, we believe that there are a number of potential sources for directives that will affect processors. To allow this, we define separate serializations of processor metadata into a number of contexts;
Because there are multiple forms of sourcing available, precedence of the mechanisms needs to be defined; we suggest that the order above be used, with the first (in-document sourcing) maintaining highest precedence over the others.
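The precedence rule can be sketched as a layered lookup. Only the position of in-document sourcing (highest precedence) is taken from the text; the relative order of the other two sources, the metadata keys, and their values are assumptions made for illustration.

```python
from collections import ChainMap


def resolve(in_document, *lower_priority_sources):
    """Merge processor metadata; earlier sources override later ones."""
    # ChainMap searches its mappings left to right, so the first
    # mapping (in-document metadata) wins whenever keys collide.
    return dict(ChainMap(in_document, *lower_priority_sources))


# Hypothetical metadata from three sources for the same document.
in_document = {"order": "xinclude,xslt"}
message_metadata = {"order": "xslt,xinclude", "target": "origin"}
external_rules = {"target": "intermediary", "validate": "no"}

# Assumed order below the in-document source: message metadata, then
# external rules. The actual order would follow the list of sources.
effective = resolve(in_document, message_metadata, external_rules)
```

With these inputs, `order` comes from the document itself, `target` from the message metadata, and `validate` from the external rules, since each key is taken from the highest-precedence source that defines it.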
This paper is an unabashed attempt to visualise an XML Processing Model through the filter of a Distributed Processing Model's requirements. Whilst it may be an imperfect fit in some aspects, we hope to spur consideration of these aspects and use cases when designing a processing model, and to identify similarities between the issues of XML processing and Distributed Web Processing.
revision 1.01