Understanding the node-set() Function
July 16, 2003
The XSLT language is capable of achieving many tasks, but some surprisingly trivial requirements, such as calculating the total amount of an invoice, cannot be expressed in a straightforward way. This article describes how you can get around this by using a very powerful extension function in your stylesheets: the node-set() function.
In XSLT you can assign to a variable any XPath data type. For example, to store all books from a catalog in a variable for further processing, you can use the following instruction:
<xsl:variable name="books" select="//book"/>
The variable $books now contains a set of nodes, so you can use it in other XPath expressions without any limitations. For example, you can use the expression $books/title to get the titles of all books from the catalog.
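As a minimal sketch of this, the stylesheet below lists every title reachable through $books as plain text. It assumes only what the text above states: a catalog whose book elements have title children.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:output method="text"/>

  <!-- A node-set variable: all book elements in the source document -->
  <xsl:variable name="books" select="//book"/>

  <xsl:template match="/">
    <!-- Ordinary XPath navigation works, because $books is a node-set -->
    <xsl:for-each select="$books/title">
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```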
So far, so good, but XSLT adds a new data type, the "result tree fragment", to those defined by XPath. You can imagine a result tree fragment (RTF) as a fragment or a chunk of XML code. You can assign a result tree fragment to a variable directly, or a result tree fragment can arise from applying templates or other XSLT instructions. The following code assigns a simple fragment of XML to the variable $author.
<xsl:variable name="author">
  <firstname>Jirka</firstname>
  <surname>Kosek</surname>
  <email>jirka@kosek.cz</email>
</xsl:variable>
Now let's say we want to extract the e-mail address from the $author variable. The most obvious way is to use an expression such as $author/email. But this will fail, as you can't apply XPath navigation to a variable of the type "result tree fragment."
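To make the restriction concrete, here is a small sketch of what XSLT 1.0 does allow on a result tree fragment: copying it as a whole, and using it wherever a string could be used. The result and length wrapper elements are hypothetical names chosen for this illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:variable name="author">
    <firstname>Jirka</firstname>
    <surname>Kosek</surname>
    <email>jirka@kosek.cz</email>
  </xsl:variable>

  <xsl:template match="/">
    <result>
      <!-- Allowed: copy the complete fragment into the output -->
      <xsl:copy-of select="$author"/>
      <!-- Allowed: use the fragment as a string (its concatenated text) -->
      <length>
        <xsl:value-of select="string-length($author)"/>
      </length>
      <!-- Not allowed in XSLT 1.0: a path step such as $author/email -->
    </result>
  </xsl:template>

</xsl:stylesheet>
```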
If we want to get around this limitation, we can use an extension function which is able to convert a result tree fragment back to a node-set. This function is not a part of the XSLT or XPath standards; thus, stylesheets which use it will not be as portable as ones which don't. However, the advantages of node-set() usually outweigh portability issues.
Extension functions always reside in a separate namespace. In order to use them we must declare this namespace in our stylesheet; listing its prefix in extension-element-prefixes additionally marks it as an extension namespace, which keeps the namespace declaration out of the output. The namespace in which the node-set() function is implemented is different for each processor, but fortunately many processors also support EXSLT, so we can use the following declarations at the start of our stylesheet.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:exsl="http://exslt.org/common"
                extension-element-prefixes="exsl"
                version="1.0">
  ...
  <!-- Now we can convert result tree fragment back to node-set -->
  <xsl:value-of select="exsl:node-set($author)/email"/>
  ...
</xsl:stylesheet>
The expression exsl:node-set($author) converts the result tree fragment to a node-set; we can take it as a start for further XPath navigation. If our processor is not EXSLT-aware we must change the namespace http://exslt.org/common according to Table 1.
Table 1. Support for node-set() in XSLT processors
Processor | Function name | Namespace |
---|---|---|
EXSLT aware processors (Saxon, xsltproc, Xalan-J, jd.xslt, 4XSLT) | node-set() | http://exslt.org/common |
MSXML | node-set() | urn:schemas-microsoft-com:xslt |
Xalan-C | nodeset() | http://xml.apache.org/xalan |
Sablotron | Not needed: can operate on result tree fragments directly | |
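If a stylesheet has to run on more than one of these processors, one possible approach is to test at run time which function is present, using the standard function-available() function. The sketch below tries the EXSLT and MSXML variants from Table 1. The msxsl prefix and the wording of the error message are choices made for this illustration, and other processors would need further xsl:when branches.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:exsl="http://exslt.org/common"
                xmlns:msxsl="urn:schemas-microsoft-com:xslt"
                exclude-result-prefixes="exsl msxsl"
                version="1.0">

  <xsl:output method="text"/>

  <xsl:variable name="author">
    <firstname>Jirka</firstname>
    <surname>Kosek</surname>
    <email>jirka@kosek.cz</email>
  </xsl:variable>

  <xsl:template match="/">
    <xsl:choose>
      <!-- EXSLT-aware processors -->
      <xsl:when test="function-available('exsl:node-set')">
        <xsl:value-of select="exsl:node-set($author)/email"/>
      </xsl:when>
      <!-- MSXML -->
      <xsl:when test="function-available('msxsl:node-set')">
        <xsl:value-of select="msxsl:node-set($author)/email"/>
      </xsl:when>
      <!-- Neither function is available: stop with a message -->
      <xsl:otherwise>
        <xsl:message terminate="yes">No node-set() extension function is available.</xsl:message>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

This works because an XSLT 1.0 processor may only report a missing extension function when the call is actually evaluated, not merely because it appears in an expression.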
After this rather theoretical introduction, I will now show you how you can use node-set() for something more useful.
Sum of Products -- Invoice Processing
Let's suppose that we want to create a stylesheet that is able to render a simple XML invoice into nice HTML for further browsing and printing. For the sake of simplicity our invoice contains just items; each item has a description, an ordered quantity, and a unit price.
<?xml version="1.0" encoding="utf-8"?>
<invoice>
  <item>
    <description>Pilsner Beer</description>
    <qty>6</qty>
    <unitPrice>1.69</unitPrice>
  </item>
  <item>
    <description>Sausage</description>
    <qty>3</qty>
    <unitPrice>0.59</unitPrice>
  </item>
  <item>
    <description>Portable Barbecue</description>
    <qty>1</qty>
    <unitPrice>23.99</unitPrice>
  </item>
  <item>
    <description>Charcoal</description>
    <qty>2</qty>
    <unitPrice>1.19</unitPrice>
  </item>
</invoice>
We don't want to be responsible for putting a damper on the party, so we will write a stylesheet for turning this XML into HTML. However, there is one complication: the rendered invoice should certainly contain the total amount. This might look like a simple task, but it quickly becomes apparent that XPath and XSLT 1.0 offer no direct way to do it. XPath provides us with the sum() function, but it can only sum the values of existing nodes, and in our example we want to calculate a sum of subtotals (qty * unitPrice), which are not present in the source XML and thus are not accessible to XPath's sum(). The only pure XSLT 1.0 solution is to use recursive processing, which leads to code that is not very clear or easy to understand. (A pure XSLT solution is presented in the invoice-noext.xsl stylesheet in the ZIP archive with all examples.)
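To give an idea of what the recursive approach involves, here is one possible sketch. It is not the invoice-noext.xsl stylesheet from the archive; the template name sum-items and its parameters items and total are hypothetical names chosen for this illustration. The template adds the subtotal of the first item to a running total and calls itself with the remaining items.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:call-template name="sum-items">
      <xsl:with-param name="items" select="invoice/item"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Recursively accumulates qty * unitPrice over the given items -->
  <xsl:template name="sum-items">
    <xsl:param name="items"/>
    <xsl:param name="total" select="0"/>
    <xsl:choose>
      <!-- Items left: add the first subtotal, recurse on the rest -->
      <xsl:when test="$items">
        <xsl:call-template name="sum-items">
          <xsl:with-param name="items" select="$items[position() &gt; 1]"/>
          <xsl:with-param name="total"
                          select="$total + $items[1]/qty * $items[1]/unitPrice"/>
        </xsl:call-template>
      </xsl:when>
      <!-- No items left: emit the accumulated total -->
      <xsl:otherwise>
        <xsl:value-of select="$total"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```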
The whole task will be much easier if we decide to utilize the node-set() function. In the first step we calculate subtotals for each item and store them as a fragment of XML.
<xsl:variable name="subTotals">
  <xsl:for-each select="invoice/item">
    <number><xsl:value-of select="qty * unitPrice"/></number>
  </xsl:for-each>
</xsl:variable>
The variable $subTotals now holds subtotals, where each subtotal is marked up with a number element.
<number>10.14</number>
<number>1.77</number>
<number>23.99</number>
<number>2.38</number>
Now we can get the total invoice amount quite easily by summing up values stored in number nodes: sum(exsl:node-set($subTotals)/number).
Here is a complete working stylesheet.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:exsl="http://exslt.org/common"
                extension-element-prefixes="exsl"
                version="1.0">

  <xsl:template match="/">
    <html>
      <head>
        <title>Invoice</title>
      </head>
      <body>
        <h1>Invoice</h1>
        <!-- Format invoice items as a table -->
        <table border="1" style="text-align: center">
          <tr>
            <th>Description</th>
            <th>Quantity</th>
            <th>Unit price</th>
            <th>Subtotal</th>
          </tr>
          <xsl:for-each select="invoice/item">
            <tr>
              <td><xsl:value-of select="description"/></td>
              <td><xsl:value-of select="qty"/></td>
              <td><xsl:value-of select="unitPrice"/></td>
              <td><xsl:value-of select="qty * unitPrice"/></td>
            </tr>
          </xsl:for-each>
          <tr>
            <th colspan="3">Total</th>
            <th>
              <!-- Gather subtotals into variable -->
              <xsl:variable name="subTotals">
                <xsl:for-each select="invoice/item">
                  <number>
                    <xsl:value-of select="qty * unitPrice"/>
                  </number>
                </xsl:for-each>
              </xsl:variable>
              <!-- Sum subtotals stored as a result tree fragment in the variable -->
              <xsl:value-of select="sum(exsl:node-set($subTotals)/number)"/>
            </th>
          </tr>
        </table>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
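XPath 1.0 numbers are double-precision floating-point values, so a computed product or sum may print with more decimal places than an invoice should show. One possible refinement, not part of the stylesheet above, is to wrap the result in the standard format-number() function. The following sketch prints only the total as text; the pattern '0.00' is an assumption about the desired output (two decimal places).

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:exsl="http://exslt.org/common"
                extension-element-prefixes="exsl"
                version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Gather subtotals into a result tree fragment -->
    <xsl:variable name="subTotals">
      <xsl:for-each select="invoice/item">
        <number><xsl:value-of select="qty * unitPrice"/></number>
      </xsl:for-each>
    </xsl:variable>
    <!-- Sum the subtotals and round the result to two decimal places -->
    <xsl:value-of
        select="format-number(sum(exsl:node-set($subTotals)/number), '0.00')"/>
  </xsl:template>

</xsl:stylesheet>
```

For the sample invoice this yields 38.28. The same function can be applied to the per-item subtotals in the table.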
Multipass Processing
Multipass processing is another situation in which the node-set() function is essential. In some situations it is hard to do the transformation in a single step; some post-processing on the result is needed. If we want to do this during a single transformation without the need for storing a temporary result, and without the need for repeated invocation of the XSLT processor, we must capture the result of the first transformation in a variable as a result tree fragment (RTF), convert the RTF to a node-set, and feed this node-set to templates which are responsible for post-processing.
We can demonstrate this technique on a very simple but real problem. Suppose that we must change an existing stylesheet to display a small image before each external link, in order to inform the user that an Internet connection is needed to traverse the link. The conventional approach to solving this task is to change the existing stylesheet to emit icons in appropriate places. But in the case of a very complex stylesheet this can be very time-consuming work.
Our approach will give up on modifying the existing stylesheet. Instead we will capture its output and modify the links in the captured output. In order to capture the output of another stylesheet we must import that stylesheet, and in the template for the root node we must invoke the original templates using xsl:apply-imports inside a variable definition.
<xsl:variable name="content">
  <xsl:apply-imports/>
</xsl:variable>
The variable $content now holds the complete output from the original stylesheet. In this output we must change all occurrences of external links such as:
<a href="http...">text</a>
to
<a href="http..."><img src="external.gif" width="16" height="16" border="0"> text</a>
All other text and markup should remain untouched. To copy the XML tree without modification we can use a very simple template that copies all element, attribute and text nodes.
<xsl:template match="@*|*|text()">
  <xsl:copy>
    <xsl:apply-templates select="@*|*|text()"/>
  </xsl:copy>
</xsl:template>
A second template is needed to process external links in a different way. As this template matches a named element with a predicate, it has a higher default priority (0.5) than the previous copy-only template (-0.5) and will override it.
<xsl:template match="a[starts-with(@href,'http')]">
  <xsl:copy>
    <xsl:apply-templates select="@*"/>
    <img src="external.gif" width="16" height="16" border="0"/>
    <xsl:text>&#160;</xsl:text>
    <xsl:apply-templates select="*|text()"/>
  </xsl:copy>
</xsl:template>
Note that we must copy the original attributes of the a element before inserting the image; attributes cannot be added to an element once child nodes have been added to it.
In the final stylesheet, both of these templates must be in their own mode to prevent conflicts with the original stylesheet. To show that there are real situations where you don't have enough time to get to grips with someone else's work, I'm using the DocBook XSL stylesheets as my original stylesheet. You can process any valid DocBook document with our final stylesheet.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:exsl="http://exslt.org/common"
                extension-element-prefixes="exsl"
                version="1.0">

  <!-- Import original stylesheet -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl"/>

  <xsl:template match="/">
    <!-- Grab result of original stylesheet -->
    <xsl:variable name="content">
      <xsl:apply-imports/>
    </xsl:variable>
    <!-- Pass grabbed content to postprocessing templates -->
    <xsl:apply-templates select="exsl:node-set($content)" mode="decoratelinks"/>
  </xsl:template>

  <!-- Default postprocessing is just copying of nodes -->
  <xsl:template match="@*|*|text()" mode="decoratelinks">
    <xsl:copy>
      <xsl:apply-templates select="@*|*|text()" mode="decoratelinks"/>
    </xsl:copy>
  </xsl:template>

  <!-- Absolute links starting with "http" are external and we must add icon to them -->
  <xsl:template match="a[starts-with(@href,'http')]" mode="decoratelinks">
    <xsl:copy>
      <!-- Copy original <a> attributes -->
      <xsl:apply-templates select="@*" mode="decoratelinks"/>
      <!-- Insert image -->
      <img src="external.gif" width="16" height="16" border="0"/>
      <xsl:text>&#160;</xsl:text>
      <!-- Copy content (subelements and text nodes) of <a> -->
      <xsl:apply-templates select="*|text()" mode="decoratelinks"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
The Future of the node-set() Function
XSLT 2.0 and XPath 2.0 are slowly progressing toward W3C Recommendation. You might be wondering whether the node-set() function will be part of these standards. The answer is no, but don't worry. The authors of XSLT 2.0 made an important decision: result tree fragments are gone. There will be no need to use the node-set() function in XSLT 2.0 as you can operate directly on XML fragments stored in a variable, as on any other node-set. Regardless, you should put the node-set() function in your bag of tools as it will take several years before XSLT 2.0 will be deployed as widely as XSLT 1.0 is deployed today.
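As a rough sketch of what this looks like in practice, assuming an XSLT 2.0 processor and the sample invoice used earlier, both examples from this article can be written without any extension function. The variable is navigated directly, and the total is computed with an XPath 2.0 for expression.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

  <xsl:output method="text"/>

  <xsl:variable name="author">
    <firstname>Jirka</firstname>
    <surname>Kosek</surname>
    <email>jirka@kosek.cz</email>
  </xsl:variable>

  <xsl:template match="/">
    <!-- The variable holds a temporary tree that XPath can navigate -->
    <xsl:value-of select="$author/email"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Sum of computed subtotals, with no intermediate fragment -->
    <xsl:value-of
        select="sum(for $i in invoice/item return $i/qty * $i/unitPrice)"/>
  </xsl:template>

</xsl:stylesheet>
```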