Transclusion with XSLT 2.0
July 9, 2003
Transclusion is a hypertext concept that began in the work of Ted Nelson, who coined the term "hypertext". Roughly speaking, transclusion is the inclusion of a resource, or part of a resource, potentially from anywhere in the world, within a new one. For example, the HTML img element is a form of transclusion. Nelson envisioned dynamic compound documents consisting entirely of pointers to pieces of other documents, with the compound ones automatically reflecting updates to the transcluded pieces. As he wrote in his book "Literary Machines" 93.1: "Transclusion will be a fundamental service of tomorrow's literary computers, and a property of the documents they will supply. Transclusion means that part of a document may be in several places -- in other documents beside the original -- without actually being copied there."
Of course, at some level, information from one server must be copied from the transcluded document to show up on the screen of the user viewing the transcluding document; but if the copy happens at read time, every time, a compound document will still have the dynamic nature that Nelson envisioned.
Has transclusion been implemented in any widely-used web technology? Transclusion-like capabilities are specified in the XInclude Candidate Recommendation, but this spec tells us that "Simple information item inclusion as described in this specification differs from transclusion, which preserves contextual information such as style." Even then, no XInclude implementation that I know of allows the inclusion of portions of other documents; utilities such as XIncluder only allow for the transclusion of entire documents. The XLink actuate="onLoad"/show="embed" combination describes something similar, but to my knowledge browser support for it is nonexistent.
Before I started playing with XSLT 2.0, I tried to implement some sort of transclusion with XSLT 1.0. For example, when processing a document like this,
<chapter>
  <title>doc1.xml</title>
  <p>First paragraph</p>
  <transclusion src="doc2.xml"/>
  <p>Last paragraph</p>
</chapter>
with a stylesheet like this,
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:template match="transclusion">
    <xsl:copy-of select="document(@src)"/>
  </xsl:template>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
doc2.xml gets inserted between the two p elements in the result just fine. But what if I don't want all of doc2.xml? What if I want to provide an XPath expression showing which part of doc2.xml to transclude? My attempts to do so ran into a classic XSLT problem: any XPath expression that I tried to store in the source document and pass to the stylesheet was treated as a string, and not as the nodes it was supposed to represent, so the XSLT processor didn't know that it was an XPath expression.
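To make the problem concrete, here is a minimal sketch of the kind of attempt that fails. The select attribute on the transclusion element is a hypothetical name introduced only for this illustration; it is not part of the documents shown above.

```xml
<!-- Hypothetical source markup, with the path stored in a made-up
     "select" attribute:
       <transclusion src="doc2.xml" select="/doc/p[@id='p2']"/>  -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:template match="transclusion">
    <!-- @select is just an attribute node whose value is a string.
         XSLT 1.0 has no standard way to evaluate that string as a
         path against document(@src), so this only writes out the
         literal text of the expression. -->
    <xsl:value-of select="@select"/>
  </xsl:template>

  <!-- Identity template: copy everything else unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

With a stylesheet like this, the result has the text of the expression sitting where the transcluded paragraph should be.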
The Simplified Stylesheet Modules feature of XSLT 2.0 gave me an idea for a new approach. It essentially lets you embed an XSLT template rule in a document. (Technically, the whole thing is considered a stylesheet with a literal result element as the document element.) You're supposed to pass this "stylesheet" along with an input document to the XSLT processor, but if the embedded template rule calls XSLT 1.0's document() function and passes it the name of the document to transclude, the input document passed to the XSLT processor can be a dummy document. For such purposes, I just use the stylesheet itself.
Let's look at an example. The following shows the doc2.xml document, and all I want to transclude into doc1.xml is the second paragraph.
<doc>
  <p id="p1">here is p1 of doc2.xml</p>
  <p id="p2">here is p2 of doc2.xml</p>
  <p id="p3">here is p3 of doc2.xml</p>
</doc>
Here is the revised version of doc1.xml.
<chapter xsl:version="2.0"
         xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <title>doc1.xml</title>
  <p>First paragraph</p>
  <xsl:copy-of select="document('doc2.xml')/doc/p[@id='p2']"/>
  <p>Last paragraph</p>
</chapter>
There are two big changes:
- Instead of making up a transclusion element to be transformed by a separate stylesheet, as I did before, I have an xsl:copy-of element right in the document to name the content to transclude.
- Because an XSLT 2.0 processor considers this document to be an actual stylesheet that just happens to have a literal result element as its root, I need to declare the XSLT namespace and name the xsl:version in the document's root element.
Running the stylesheet with an XSLT 2.0 processor (for now, the only one I know of is the 7.* development branch of Michael Kay's Saxon XSLT processor) inserts the p2 paragraph from doc2.xml into the result copy of the "stylesheet" document:
<?xml version="1.0" encoding="UTF-8"?>
<!-- whitespace added for easier reading -->
<chapter>
  <title>doc1.xml</title>
  <p>First paragraph</p>
  <p id="p2">here is p2 of doc2.xml</p>
  <p>Last paragraph</p>
</chapter>
Although I used the stylesheet itself as the input document, I'd get the same result with any other input. An XSLT 2.0 processor treats a simplified stylesheet module as a single template rule that matches the root node, so it outputs the literal result element as soon as it sees the root of whatever document is passed as input.
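One way to see why the input document doesn't matter is to write out the full stylesheet that the simplified form of doc1.xml stands for. The following is a sketch of that expansion as I understand the specification's description of simplified stylesheet modules, not output from any particular processor.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

  <!-- The single implied template rule: it fires on the root node of
       whatever input document is supplied. -->
  <xsl:template match="/">
    <chapter>
      <title>doc1.xml</title>
      <p>First paragraph</p>
      <xsl:copy-of select="document('doc2.xml')/doc/p[@id='p2']"/>
      <p>Last paragraph</p>
    </chapter>
  </xsl:template>

</xsl:stylesheet>
```

Nothing inside the template reads from the input document, so the only thing the input contributes is a root node for the rule to match.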
If I hadn't added the XPath steps "/doc/p[@id='p2']" after the document() call in the xsl:copy-of element's select attribute, the whole doc2.xml document would have been inserted. The ability to name a specific part of doc2.xml to transclude is what makes this so cool. This is where it goes beyond the capability of XSLT 1.0 and the XInclude implementations that I know of. Imagine the possibilities of a browser that implemented this XSLT 2.0 feature: you could create compound documents much closer to what Nelson described by combining pieces of live documents instead of combining pasted copies that only get updated when you repaste them.
Transcluding Plain Text
Another XSLT 2.0 feature lets you transclude plain text files. According to the latest XSLT 2.0 Working Draft, the unparsed-text() function "reads an external resource (for example, a file) and returns its contents as a string." The only way to read external documents in XSLT 1.0 was the document() function that we saw above, which could only read well-formed XML documents, so this new function adds some flexibility.
For example, here is a dummy text file called rfc9999.txt:
RFC 9999 This is not a real IETF Request for Comment. It is a dummy one that I created to demonstrate XSLT 2.0's unparsed-text() function because RFCs are possible the most valuable plain text files on the Internet.
The following simplified stylesheet module is nearly identical to the one above except that it's called doc3.xml and has an xsl:value-of instruction that calls the unparsed-text() function instead of an xsl:copy-of instruction that calls the document() function. (The specification of the encoding in the function call is optional.)
<chapter xsl:version="2.0"
         xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <title>doc3.xml</title>
  <p>First paragraph</p>
  <xsl:value-of select="unparsed-text('rfc9999.txt','UTF-8')"/>
  <p>Last paragraph</p>
</chapter>
Again, because the stylesheet is an XSLT 2.0 simplified stylesheet module, any XML file can be used as the input, so I used the stylesheet itself as input. The result looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<!-- some carriage returns added -->
<chapter>
  <title>doc3.xml</title>
  <p>First paragraph</p>
RFC 9999 This is not a real IETF Request for Comment. It is a dummy one that I created to demonstrate XSLT 2.0's unparsed-text() function because RFCs are possible the most valuable plain text files on the Internet.
  <p>Last paragraph</p></chapter>
The plain text file got inserted between the two p elements of the result copy of the doc3.xml document.
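Because unparsed-text() returns an ordinary string, the text can be processed further before it goes into the result. The following is only a sketch under some assumptions: it assumes the processor supports the XPath 2.0 tokenize() function, and the line element is a made-up name chosen for this illustration.

```xml
<chapter xsl:version="2.0"
         xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <title>doc3.xml</title>
  <p>First paragraph</p>
  <!-- Split the file's text at line breaks and wrap each piece in a
       "line" element (a hypothetical name used only in this sketch). -->
  <xsl:for-each select="tokenize(unparsed-text('rfc9999.txt', 'UTF-8'), '\r?\n')">
    <line><xsl:value-of select="."/></line>
  </xsl:for-each>
  <p>Last paragraph</p>
</chapter>
```

If the text file ends with a line break, the last token is an empty string, so a sketch like this would emit one empty line element at the end.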
Conclusion
When the XSLT processors built into the major browsers implement XSLT 2.0 as they now implement 1.0, we'll have a lot of great new possibilities to work with. The ability to include a remote document with no explicit permission from the document's owner does bring up certain security issues, which is why Mozilla's XSLT engine has certain restrictions on its use of the XSLT 1.0 document() function. Still, the ability to mix and match new and old XSLT features to realize a little bit more of Ted Nelson's original hypertext vision is a lot of fun, and points the way toward even more powerful XSLT-based applications in the future.