XSLT Reflection
November 5, 2003
Many modern programming languages contain a special interface called reflection. Reflection can be used to programmatically read, modify, and create code in a particular language. Because the main purpose of XSLT is to transform XML documents, and because an XSLT stylesheet is itself expressed in XML syntax, we can use XSLT to manipulate stylesheets themselves. In this article I'm going to show you how useful XSLT reflection can be.
Reading XSLT Code
The most fundamental reflection task is to read code. This is very easy in XSLT: you can query an XSLT stylesheet like any other XML document. The only thing you must not forget is to specify the correct XSLT namespace for all queried elements. The following examples assume that the prefix xsl is bound to the XSLT namespace (http://www.w3.org/1999/XSL/Transform).
We can query the source document, which is in fact another XSLT stylesheet, like any other XML document. For example, to get the total number of templates in the stylesheet, we can use:
Number of templates: <xsl:value-of select="count(//xsl:template)"/>
We can also create templates that match elements in the source XSLT stylesheet. For example, to get statistics about keys defined in a particular stylesheet, we can use the following template:
<xsl:template match="xsl:key">
  Key name: <xsl:value-of select="@name"/>
  Matches: <xsl:value-of select="@match"/>
  Use expr: <xsl:value-of select="@use"/>
</xsl:template>
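The two queries above can be combined into a small reporting stylesheet. The following is only a sketch: the choice to list named templates and their parameters is an illustration, not something taken from the sample files.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:output method="text"/>

  <!-- Start at the root of the inspected stylesheet -->
  <xsl:template match="/">
    <xsl:text>Number of templates: </xsl:text>
    <xsl:value-of select="count(//xsl:template)"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:apply-templates select="//xsl:template[@name]"/>
  </xsl:template>

  <!-- One line per named template, followed by its declared parameters -->
  <xsl:template match="xsl:template[@name]">
    <xsl:text>Named template: </xsl:text>
    <xsl:value-of select="@name"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:for-each select="xsl:param">
      <xsl:text>  param: </xsl:text>
      <xsl:value-of select="@name"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Saved under a hypothetical name such as report.xsl, it can be applied to any stylesheet, for example with xsltproc report.xsl some-stylesheet.xsl.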
We can even access the contents of the currently processed stylesheet by placing a call to the document('') function at the beginning of an XPath expression.
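As a minimal sketch of this self-inspection, the stylesheet below counts its own templates. The empty template named unused is hypothetical and exists only so that there is more than one template to count.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- document('') returns the root node of this very stylesheet -->
    <xsl:text>Templates in this stylesheet: </xsl:text>
    <xsl:value-of select="count(document('')//xsl:template)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Hypothetical second template, present only to be counted -->
  <xsl:template name="unused"/>

</xsl:stylesheet>
```

Note that document('') sees only the stylesheet module in which the expression appears, so templates pulled in through xsl:import or xsl:include are not counted this way.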
Reading XSLT code from an XSLT stylesheet is not tricky. Generating XSLT code is harder. We cannot write the XSLT elements we want to generate directly in templates, because the XSLT processor cannot tell which elements should be treated as instructions controlling the transformation flow and which should just be copied to the output.
One way of overcoming this issue is to use the xsl:element instruction to emit all elements in the generated XSLT stylesheet. This approach is a little inconvenient; in order to create a simple stylesheet like
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:template match="/">
    Hello! This text was created by an automatically generated stylesheet.
  </xsl:template>
</xsl:stylesheet>
you have to use a rather verbose stylesheet, like
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:template match="/">
    <xsl:element name="xsl:stylesheet">
      <xsl:attribute name="version">1.0</xsl:attribute>
      <xsl:element name="xsl:template">
        <xsl:attribute name="match">/</xsl:attribute>
        Hello! This text was created by an automatically generated stylesheet.
      </xsl:element>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
The second approach is easier, once you know how to use the xsl:namespace-alias instruction. This instruction lets us remap namespaces in the result of a transformation, so we can write the generated XSLT elements directly, using a temporary namespace.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xslo="http://www.w3.org/1999/XSL/TransformAlias"
                version="1.0">
  <xsl:namespace-alias stylesheet-prefix="xslo" result-prefix="xsl"/>
  <xsl:template match="/">
    <xslo:stylesheet version="1.0">
      <xslo:template match="/">
        Hello! This text was created by an automatically generated stylesheet.
      </xslo:template>
    </xslo:stylesheet>
  </xsl:template>
</xsl:stylesheet>
You can read more about this namespace aliasing technique in the article Namespaces and XSLT Stylesheets by Bob DuCharme, or you can read the corresponding section in the XSLT recommendation.
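One detail is worth keeping in mind with the aliasing technique. Attributes of literal result elements are attribute value templates, so curly braces intended for the generated stylesheet have to be doubled. The sketch below is an illustration and not part of the sample files; it generates a stylesheet whose single template rebuilds each element under its local name.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xslo="http://www.w3.org/1999/XSL/TransformAlias"
                version="1.0">

  <xsl:namespace-alias stylesheet-prefix="xslo" result-prefix="xsl"/>

  <xsl:template match="/">
    <xslo:stylesheet version="1.0">
      <xslo:template match="*">
        <!-- Doubled braces are emitted as single ones, so the generated
             stylesheet receives name="{local-name()}" -->
        <xslo:element name="{{local-name()}}">
          <xslo:apply-templates/>
        </xslo:element>
      </xslo:template>
    </xslo:stylesheet>
  </xsl:template>

</xsl:stylesheet>
```

With single braces, the expression would be evaluated while the generator runs, and the generated stylesheet would contain a fixed element name instead of an expression.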
Now that we have learned how to read, query, and write XSLT stylesheets using XSLT, we can utilize our knowledge to do something really useful.
Convert HTML Stylesheets to XHTML Stylesheets
An obvious use of XSLT reflection is to refactor existing stylesheets. Suppose we have a large base of stylesheets that should be changed in a way that can be captured algorithmically. For example, we may want to modify existing HTML stylesheets to produce XHTML. With only one stylesheet, such rewriting can be done by hand. But if there are many stylesheets, the XHTML version should be derived automatically from the HTML one. Such a derivation can be expressed as an XSLT transformation.
Let's summarize the main differences between HTML and XHTML:
- all XHTML elements belong to the namespace http://www.w3.org/1999/xhtml
- XHTML is an XML language, not an SGML one
- XHTML uses different public and system identifiers in the DOCTYPE declaration
Now we must express these changes as changes in the XSLT stylesheet. The first change means that for each non-namespaced element in the original HTML stylesheet, we must add the correct namespace. All other elements should be copied intact. The following template can accomplish this change for us:
<xsl:template match="*">
  <xsl:choose>
    <!-- When the element is not in a namespace, it is an HTML element
         which should be transformed into an XHTML element in the proper
         namespace -->
    <xsl:when test="namespace-uri(.) = ''">
      <xsl:element name="{local-name(.)}"
                   namespace="http://www.w3.org/1999/xhtml">
        <!-- Copy through attributes -->
        <xsl:copy-of select="@*"/>
        <!-- Process content of the element -->
        <xsl:apply-templates/>
      </xsl:element>
    </xsl:when>
    <!-- Other elements (mostly XSLT instructions) are copied through -->
    <xsl:otherwise>
      <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates/>
      </xsl:copy>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
However, this doesn't correct the namespace of elements produced by the xsl:element instruction. Therefore, another template is needed.
<xsl:template match="xsl:element">
  <!-- Copy xsl:element instruction -->
  <xsl:copy>
    <!-- Copy original attributes -->
    <xsl:copy-of select="@*"/>
    <!-- Add element to the right namespace -->
    <xsl:attribute name="namespace">http://www.w3.org/1999/xhtml</xsl:attribute>
    <!-- Process content of the instruction -->
    <xsl:apply-templates/>
  </xsl:copy>
</xsl:template>
To complete the stylesheet, we must also copy the other content that elements may have: text, comments, and processing instructions.
<xsl:template match="comment()|processing-instruction()|text()"> <xsl:copy/> </xsl:template>
The second thing to do is to change the output method from HTML to XML and to output the correct public and system identifiers for XHTML. This behavior is controlled by the xsl:output instruction. The following template does this job.
<xsl:template match="xsl:output">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <xsl:attribute name="method">xml</xsl:attribute>
    <xsl:attribute name="encoding">UTF-8</xsl:attribute>
    <!-- No whitespace around the identifiers: it would become part of
         the attribute values -->
    <xsl:attribute name="doctype-public">-//W3C//DTD XHTML 1.0 Transitional//EN</xsl:attribute>
    <xsl:attribute name="doctype-system">http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd</xsl:attribute>
  </xsl:copy>
</xsl:template>
To handle the case where xsl:output is missing, we should test for it and create a new xsl:output instruction in the output XHTML stylesheet.
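One possible way to express this test is sketched below as a standalone stylesheet. It is an assumption about how the check could be written, not the code from the sample files, and it would have to be merged with the other templates that match xsl:stylesheet. The identity template is included only to keep the sketch self-contained.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Copy everything that has no more specific template -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Add xsl:output only when the source stylesheet has none -->
  <xsl:template match="xsl:stylesheet[not(xsl:output)]">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- xsl:import must stay first among the top-level elements -->
      <xsl:apply-templates select="xsl:import"/>
      <xsl:element name="xsl:output">
        <xsl:attribute name="method">xml</xsl:attribute>
        <xsl:attribute name="encoding">UTF-8</xsl:attribute>
        <xsl:attribute name="doctype-public">-//W3C//DTD XHTML 1.0 Transitional//EN</xsl:attribute>
        <xsl:attribute name="doctype-system">http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd</xsl:attribute>
      </xsl:element>
      <!-- Then everything else, in its original order -->
      <xsl:apply-templates select="node()[not(self::xsl:import)]"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The sketch uses the xsl:element approach from the beginning of the article; the namespace aliasing technique would work just as well. Stylesheets that use xsl:transform as the root element would need the same pattern for that name.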
It seems that the stylesheet is now ready to convert HTML stylesheets into XHTML ones. But if we try it, we find that generated stylesheets contain a lot of default namespace declarations in the following form:
<someelement xmlns="http://www.w3.org/1999/xhtml">
Even worse, this declaration is usually repeated in HTML generated by this stylesheet. This is not an error, but the page is gratuitously long and it can cause problems for some older browsers.
We can get rid of these superfluous declarations by declaring the XHTML namespace as the default namespace on the root element of the stylesheet. This sounds easy, but XSLT doesn't offer any standard way of creating such declarations. In the most widely used processors, like Saxon and xsltproc, the following trick works. We can create an element in the XHTML namespace and store it in a variable. From this variable we can copy just the namespace axis to the root element, and we will get the corresponding default namespace declaration there.
<xsl:template match="xsl:stylesheet">
  <!-- Store a temporary element from the XHTML namespace in the variable -->
  <xsl:variable name="temp">
    <xsl:element name="dummy" namespace="http://www.w3.org/1999/xhtml"/>
  </xsl:variable>
  <!-- Copy xsl:stylesheet element -->
  <xsl:copy>
    <!-- Copy just the namespace declarations from the dummy element;
         the exsl prefix must be bound to http://exslt.org/common -->
    <xsl:copy-of select="exsl:node-set($temp)//namespace::*"/>
    <!-- Copy original xsl:stylesheet attributes -->
    <xsl:copy-of select="@*"/>
    <!-- Process the content of the original stylesheet -->
    <xsl:apply-templates/>
  </xsl:copy>
</xsl:template>
The complete stylesheet is part of the sample files. You can use it to convert almost any HTML stylesheet to a stylesheet that produces XHTML. The same method is used in the DocBook XSL stylesheets to produce the XHTML version of the stylesheets from the HTML version, which is the version on which the actual development by hand is done.
Localization Without Performance Loss
XSLT is often used to create a web site from XML sources. Many organizations today need multilingual websites. The common approach to creating such sites with XSLT is to store locale-dependent messages in a special XML file known as the message catalog. Every time we need to display language-dependent text, we call a special template with parameters identifying the current language and the requested text. So instead of simply typing
<h1>Welcome!</h1>
in the non-localized stylesheet, we call a template that returns the correct welcome text in the desired language.
<h1>
  <xsl:call-template name="gentext">
    <xsl:with-param name="text">Welcome</xsl:with-param>
    <xsl:with-param name="lang" select="$currentLang"/>
  </xsl:call-template>
</h1>
These gentext templates usually use the document() function to look up the desired text in an external message catalog.
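A gentext template of this kind might look roughly like the following sketch. It assumes the catalog layout shown later in this article (one file per language, such as en.xml, with text elements identified by a key attribute) and a hypothetical default value for the currentLang parameter.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:output method="html"/>

  <!-- Language of the current request; the default is an assumption -->
  <xsl:param name="currentLang" select="'en'"/>

  <!-- Look up one message in the catalog for the given language -->
  <xsl:template name="gentext">
    <xsl:param name="text"/>
    <xsl:param name="lang" select="$currentLang"/>
    <xsl:value-of
        select="document(concat($lang, '.xml'))/l/text[@key = $text]"/>
  </xsl:template>

  <xsl:template match="/">
    <h1>
      <xsl:call-template name="gentext">
        <xsl:with-param name="text">Welcome</xsl:with-param>
        <xsl:with-param name="lang" select="$currentLang"/>
      </xsl:call-template>
    </h1>
  </xsl:template>

</xsl:stylesheet>
```

Every call goes through document() and a search of the catalog, which is where the runtime cost discussed next comes from.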
This solution has two big drawbacks. The first is that typing the lengthy code for calling a template is very inconvenient, especially in comparison with writing a non-localized stylesheet. The second is poor performance: the stylesheet must repeatedly look up messages in the message catalog during each server request.
Both of these drawbacks can be overcome with a simple solution. We will not create a real stylesheet, but just a stylesheet template that can later be transformed into specialized stylesheets for each language. The stylesheet template will be a real XSLT stylesheet that uses elements from a special namespace instead of text constants. For example, the heading with the welcome text will be written as
<h1><msg:Welcome/></h1>
Such a template for an XSLT stylesheet can be merged with the message catalog for each language. Elements from the msg namespace are replaced by localized text, and we get an XSLT stylesheet for each supported language. Such stylesheets contain all localized text directly, so there is no performance cost. Writing template stylesheets is also much easier than the common solution using localization templates invoked at runtime.
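To make this concrete, a stylesheet template could look like the sketch below. The invoice and total source elements are hypothetical; the msg namespace URI and the Invoice and Total keys are the ones used by the sample catalog and stylesheet in this article.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:msg="urn:x-kosek:schemas:messages:1.0"
                exclude-result-prefixes="msg"
                version="1.0">

  <xsl:output method="html"/>

  <!-- The invoice element and its total child are hypothetical markup -->
  <xsl:template match="invoice">
    <h1><msg:Invoice/></h1>
    <p><msg:Total/>: <xsl:value-of select="total"/></p>
  </xsl:template>

</xsl:stylesheet>
```

The exclude-result-prefixes attribute is there for the localized stylesheets. The msg namespace declaration is likely to survive the merge step, and this attribute keeps it out of the pages they produce.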
Overall our solution is better; its only cost is higher management overhead. We must regenerate the localized stylesheets from the stylesheet template every time a change is made to the template or to a message catalog. This step can of course be expressed as an XSLT transformation, and the whole process should be automated with a Makefile, an Ant task, or a batch file.
Figure 1. From the stylesheet template to localized stylesheets
Let's explore the proposed solution in more depth. Message catalogs are simple XML documents. For each language there is one such file, named after the ISO language code (e.g. en.xml for English, cs.xml for Czech). A sample catalog looks like this:
<?xml version="1.0" encoding="utf-8"?>
<l lang="en">
  <text key="Invoice">Invoice</text>
  <text key="Welcome">Hello and Welcome!</text>
  <text key="Description">Description</text>
  <text key="Quantity">Quantity</text>
  <text key="UnitPrice">Unit price</text>
  <text key="Subtotal">Subtotal</text>
  <text key="Total">Total</text>
</l>
Note that key names correspond to local names of the msg:* elements in the template stylesheet.
Now we need a transformation to replace all occurrences of the msg:* elements with the corresponding texts from the catalog. This can be easily expressed by the following stylesheet. It copies all parts of the stylesheet to the output unmodified, except the msg:* elements.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:msg="urn:x-kosek:schemas:messages:1.0"
                exclude-result-prefixes="msg"
                version="1.0">
  <xsl:output method="xml"/>
  <xsl:param name="lang">en</xsl:param>
  <xsl:param name="messages" select="document(concat($lang, '.xml'))/l"/>
  <!-- Copy stylesheet untouched -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Replace msg:* elements with corresponding entry from message catalog -->
  <xsl:template match="msg:*" priority="1">
    <xsl:value-of select="$messages/text[@key = local-name(current())]"/>
  </xsl:template>
</xsl:stylesheet>
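If a key is missing from a catalog, the stylesheet above silently emits nothing in its place. A possible variant, sketched here as an assumption and not taken from the sample files, reports the missing key and falls back to the key name:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:msg="urn:x-kosek:schemas:messages:1.0"
                version="1.0">

  <xsl:output method="xml"/>

  <xsl:param name="lang">en</xsl:param>
  <xsl:param name="messages" select="document(concat($lang, '.xml'))/l"/>

  <!-- Copy stylesheet untouched -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace msg:* elements, warning about keys the catalog lacks -->
  <xsl:template match="msg:*" priority="1">
    <xsl:variable name="key" select="local-name()"/>
    <xsl:variable name="entry" select="$messages/text[@key = $key]"/>
    <xsl:choose>
      <xsl:when test="$entry">
        <xsl:value-of select="$entry"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:message>
          <xsl:text>Missing message '</xsl:text>
          <xsl:value-of select="$key"/>
          <xsl:text>' in catalog for language '</xsl:text>
          <xsl:value-of select="$lang"/>
          <xsl:text>'</xsl:text>
        </xsl:message>
        <!-- Fall back to the key itself so the gap is visible -->
        <xsl:value-of select="$key"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Because this check runs at build time, when the localized stylesheets are generated, it adds no cost to serving requests.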
You can download the sample stylesheet template with message catalogs and other files.
Conclusion
This article showed two real-world advantages of XSLT being expressed in XML syntax. It allows stylesheet authors to manipulate XSLT code directly from their stylesheets and to use this for various interesting effects. This capability of XSLT is very similar to the concept of reflection known from other programming languages.
Resources
- XML.com's article about Namespaces and XSLT Stylesheets.
- Literate Programming in XML by Norman Walsh shows another very interesting usage of XSLT.
- XSL-List – open forum on XSL.