Unifying XSLT Extensions
March 29, 2000
This week XML-Deviant focuses on the discussion surrounding the limitations of XSLT in practical use. The debate has crossed several forums, including both XML-DEV and the XSL developers mailing list.
What's the Problem?
Use any programming language for long enough and you'll learn both its features and its limitations. XML developers working with XSLT are going through this learning process and are beginning to identify some of those limitations.
XSLT provides a well-defined extension mechanism, allowing developers to include custom processing in their style sheets. Steve Muench observed that the specification clearly defines the means to manage extension functions:
The XSLT 1.0 Spec is also careful to provide the necessary mechanisms to buil[d] robust, portable XSLT style sheets, even in the face of cool extensions.
The <xsl:fallback> and element-available('qname') and function-available('qname') let you have good, proactive control as to what should happen if an extension on which you depend is not available.
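To make the guard Muench describes concrete, here is a minimal sketch in Java that runs a small style sheet through the standard javax.xml.transform API. The namespace urn:example:ext and the function ext:shout are hypothetical, invented for illustration, and no processor is expected to supply them. The class name FallbackDemo is likewise just an assumption of this sketch.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class FallbackDemo {
    // The style sheet asks function-available() before touching the
    // hypothetical ext:shout extension, and otherwise falls back to the
    // standard XPath translate() function.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:ext='urn:example:ext'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/name'>"
      + "<xsl:choose>"
      + "<xsl:when test=\"function-available('ext:shout')\">"
      + "<xsl:value-of select='ext:shout(.)'/>"
      + "</xsl:when>"
      + "<xsl:otherwise>"
      + "<xsl:value-of select=\"translate(.,"
      + " 'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')\"/>"
      + "</xsl:otherwise>"
      + "</xsl:choose>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies STYLESHEET to the given XML text using whichever processor
    // the JAXP factory lookup finds, and returns the text output.
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<name>ada</name>"));
    }
}
```

On a processor that does not provide ext:shout, the xsl:otherwise branch should run and the name comes back uppercased by the portable code path. The style sheet keeps working either way, which is the fail-safe behaviour the specification is aiming for.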
So the good news is that it's possible to extend the features of XSLT, while still ensuring that a style sheet is fail-safe. What the specification lacks is any information on how these extensions should be implemented. Muench commented on the lack of language bindings in the specification:
The spec allows for functions that return the built-in types in the xpath data model, but doesn't say anything about language bindings, so a question a Java developer might ask is:
"What should my Java extension function return in order to return an XPath node-set to my style sheet?"
The answer to that question is dependent on the style sheet processor the developer is using. Each processor offers a proprietary API against which extension functions can be developed. Because these APIs are not standardized, extension functions are not portable between processors.
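As an illustration of the question Muench poses, the sketch below shows one plausible shape for a Java extension function that hands back a list of nodes. It assumes a processor that maps an org.w3c.dom.NodeList to an XPath node-set; whether a given processor does so is exactly the kind of detail that is currently left to each implementation. The class SplitExtension and its tokens method are hypothetical names.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class SplitExtension {
    // Hypothetical extension function: turns "a, b, c" into a list of
    // <token> elements. Returning a DOM NodeList is an assumption about
    // what the hosting processor accepts as a node-set.
    static NodeList tokens(String csv) {
        Document doc = newDocument();
        DocumentFragment fragment = doc.createDocumentFragment();
        for (String part : csv.split(",")) {
            Element token = doc.createElement("token");
            token.appendChild(doc.createTextNode(part.trim()));
            fragment.appendChild(token);
        }
        // The fragment's children are the <token> elements, in order.
        return fragment.getChildNodes();
    }

    // Creates an empty DOM document to own the new nodes.
    private static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }
}
```

The body of the function is ordinary DOM code. Only the return type, and the way the function is registered with the processor, would have to change from one processor to the next.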
Each processor also offers a different selection of extension functions. Eugene Kaganovich observed that XSLT developers select processors based on their extensions:
The current situation is, people have to pick and switch between XSL[T] processors depending on which extensions they use. Yuck! Wouldn't it be great, instead, to be able to download a package that implements some function, put it in your classpath, and it's ready to work?
The problem is therefore twofold. With no standard API, extension function authors must port their code to each processor. XSLT developers who use extension functions, meanwhile, tie their style sheets to individual processors. Checking for a processor at runtime and supplying alternative transformations increases the complexity of those style sheets and adds another point of failure.
What Should Be Done?
David Megginson posted a message to XML-DEV illustrating the problem, selecting one of the most commonly used XSLT extensions: generating multiple output documents from a single transform. This is used in many HTML transformations, allowing a single XML document to generate multiple web pages. An example is a "phone number list," which could be generated with slightly different views on the data (e.g., by surname, by department, etc.).
Megginson suggested that effort be made to resolve the problem, and drew parallels with the birth of SAX:
This looks like a much smaller and simpler variant of the same kind of problem that brought about SAX. How's about a couple of XSL software writers or users getting together and writing a friendly one-page document defining an extension element for multiple output documents, and then publishing it as a W3C Note through an obliging vendor?
Rather than leap in immediately, Sebastian Rahtz thought it might be prudent to wait for the next version of the specification.
[S]ince the feature is clearly flagged in the appendix to XSLT which lists "desirable features for the next release"... surely the correct way forward is for the W3C to start work on XSLT 2.0 or 1.1?
Rahtz's position was that efforts to produce a W3C Note to standardize these functions might be wasted. Edd Dumbill disagreed:
I don't believe a Note would be wasted. It would be a natural point of input for an XSLT WG. Agreement in the interim about this issue would be a good thing for XSLT sheet interoperability.
David Megginson expressed similar views, and believed that standardizing XSLT extensions should be achieved as quickly as possible:
Waiting for XSLT v.2 is The Right Thing from a long-term perspective, but semi-standardizing two or three of the most common extensions would buy a lot right now and wouldn't hamstring anyone waiting for the W3C's schedule (and then waiting for XSL implementors to get around to supporting v.2, and then waiting for users to upgrade their browsers, and then...).
Megginson also suggested that OpenMarkup.org might be a suitable means by which to co-ordinate the activity. OpenMarkup.org is an idea first presented by Don Park at XTech2000, the intention being to provide an open forum for peer review, discussion, and "rubber-stamping" of open standards.
Park has already indicated that standardizing XSLT extensions might make a suitable initial project. He echoed David Megginson's suggestion to start small, and expressed a desire to see more small standardization efforts:
I would love to see custom function support be standardized but I am more interest[ed] in having this sort of activity become the norm, so lets start small and add more later. Like General Patton's infamous Rock Soup.
It remains to be seen whether OpenMarkup.org can be a success, but it's an intriguing idea that's worth exploring further.
Getting Started - a Roadmap
So what happens next? The problem seems to be clearly defined: XSLT doesn't have a standard API against which extension functions can be implemented. Contributors to both XSL-List and XML-DEV have suggested steps that need to be taken. Here's my attempt to compile a roadmap for the work that needs to be done:
1. Define a standard namespace
All XSLT extension functions must be declared within a namespace. This imposes the requirement for a single standard namespace, which ideally would be vendor-neutral. As with SAX, XML.org might "own" this for the community.
2. Define language binding(s)
Define the API against which extension function authors will build their code, and which XSLT processors would support. This needn't (and shouldn't) be hugely complex in practice. The XPath data model has only four basic types (node-set, boolean, number, and string), to which XSLT adds the result tree fragment, and most functions will be returning only one of those (the "node-set").
3. Implement standard extension functions
Start developing some standardized extension functions. The "SAX model" (using a single maintainer for the library) may work well in the early stages of development. With appropriate processor support for the language bindings, these functions would be portable across processors.
4. Submit the results as a standard
Given that two of the most popular XSL processors (XT and Xalan) are open source, adding support for a new standard shouldn't be impossible. For true interoperability to be achieved, however, vendors will need to commit to its support. While a W3C Note may carry more clout, a "thumbs-up" from OpenMarkup.org might be enough for most developers to start with.
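Steps 1 and 2 of the roadmap can be sketched in a few lines of Java. Everything here is hypothetical: the namespace URI urn:example:xslt-extensions, the names XPathType, ExtensionFunction, and ExtensionRegistry, and the choice of Boolean, Double, String, and org.w3c.dom.NodeList as the Java counterparts of the four XPath types. It is meant only to show how small such a binding could be, not to propose one.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.w3c.dom.NodeList;

// The four XPath 1.0 value types, each paired with an assumed Java type.
enum XPathType {
    BOOLEAN(Boolean.class),
    NUMBER(Double.class),
    STRING(String.class),
    NODE_SET(NodeList.class);

    private final Class<?> javaType;

    XPathType(Class<?> javaType) {
        this.javaType = javaType;
    }

    // Classifies a Java value, rejecting anything outside the four types.
    static XPathType of(Object value) {
        for (XPathType type : values()) {
            if (type.javaType.isInstance(value)) {
                return type;
            }
        }
        throw new IllegalArgumentException("Not an XPath value: " + value);
    }
}

// What an extension author would implement (hypothetical binding).
interface ExtensionFunction {
    Object call(List<Object> args);
}

// What a processor would consult when it meets a function in the
// shared, vendor-neutral namespace (hypothetical URI).
final class ExtensionRegistry {
    static final String NAMESPACE = "urn:example:xslt-extensions";

    private final Map<String, ExtensionFunction> functions = new HashMap<>();

    void register(String localName, ExtensionFunction function) {
        functions.put(localName, function);
    }

    // Mirrors the question that function-available() asks.
    boolean isAvailable(String namespace, String localName) {
        return NAMESPACE.equals(namespace) && functions.containsKey(localName);
    }

    // Calls the function and checks that its result is a legal XPath value.
    Object invoke(String namespace, String localName, List<Object> args) {
        if (!isAvailable(namespace, localName)) {
            throw new IllegalArgumentException(
                "No such extension function: " + localName);
        }
        Object result = functions.get(localName).call(args);
        XPathType.of(result);
        return result;
    }
}
```

An author would write one ExtensionFunction and register it once, and any processor honouring the binding could call it. The checking in invoke is a design choice of this sketch: it keeps a badly behaved function from leaking a non-XPath value into a style sheet.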
Let's hope we can start taking steps down this road as soon as possible.