Finding the First, Last, Biggest, Smallest
August 7, 2002
Sometimes you want to know which element or record is first or last in a given set, or which one has the greatest or smallest value among the corresponding values in that set: for example, which employee element has the lowest value for a hireDate subelement or attribute. These operations are typically performed by a query language. You don't need a separate query language, however, to do these when you're developing with XSLT. If you can describe a set of nodes from a document with a single-step XPath expression, then you can get the first of those nodes by adding a predicate of [1] to that expression, and you can get the last one by adding a predicate of [last()]. To get the element or attribute with the greatest or smallest value, you can sort the nodes using any of the sorting options that we saw in last month's column and then test each node's position to pick out the one at either end of the sorted list.
To demonstrate, we'll pull out different titles from the following chapter document. (All stylesheets, sample input, and sample output can be found in this zip file.) Note the nesting structure of the chapter element and its descendants that begin with a title element.
<chapter><title>"Paradise Lost" Excerpt</title>
  <para>Then with expanded wings he steers his flight</para>
  <figure><title>"Incumbent on the Dusky Air"</title>
    <graphic fileref="pic1.jpg"/></figure>
  <para>Aloft, incumbent on the dusky Air</para>
  <sect1>
    <para>That felt unusual weight, till on dry Land</para>
    <figure><title>"He Lights"</title>
      <graphic fileref="pic2.jpg"/></figure>
    <para>He lights, if it were Land that ever burned</para>
    <sect2>
      <para>With solid, as the Lake with liquid fire</para>
      <figure><title>"The Lake with Liquid Fire"</title>
        <graphic fileref="pic3.jpg"/></figure>
    </sect2>
  </sect1>
</chapter>
The following template lists the first and last title elements in the chapter document by adding the [1] and [last()] predicates to the XPath expression descendant::title, which selects all of the title elements within the chapter element, whether they're children, grandchildren, or more distant descendants.
<!-- xq358.xsl: converts xq357.xml into xq359.txt -->

<xsl:template match="chapter">
First title in chapter: <xsl:value-of select="descendant::title[1]"/>
Last title in chapter: <xsl:value-of select="descendant::title[last()]"/>
</xsl:template>
Although the first title element in the chapter is a child of the chapter element and the last title element is a great-great-grandchild (being a grandchild of the sect2 element, which is a grandchild of chapter), the template rule finds them and adds their contents to the result tree:
First title in chapter: "Paradise Lost" Excerpt
Last title in chapter: "The Lake with Liquid Fire"
Why doesn't this work with a multi-step XPath expression? Because a predicate in an XPath location step is only applied to the nodes in that location step. For example, let's say we want the title of the last figure element in the chapter shown above. The XPath expression in the following template won't do it.
<!-- xq360.xsl: converts xq357.xml into xq361.txt -->

<xsl:template match="chapter">
Last figure title in chapter? <xsl:value-of select="descendant::figure/title[last()]"/> No.
</xsl:template>
The [last()] predicate here isn't asking for the last figure title in the chapter; it's looking for the last title element within each figure element. Each of those figure elements only has one title, so the expression returns a node-set of all those figure elements' title elements. When the xsl:value-of instruction converts a node-set to a text node for the result tree, it only uses the first node in document order, so we see the first figure element's title element in the result:
Last figure title in chapter? "Incumbent on the Dusky Air" No.
What if we really do want the title of the last figure element in the chapter? The secret to getting the first or last node of a node list described by a more complex XPath expression is to have an xsl:for-each instruction get the list of nodes in question and then to pick the last (or first) one in that list.
For example, the following template rule has an xsl:for-each instruction going through the title elements of all the figure elements descended from the context node. As it goes through them, one xsl:if element checks whether each node is the first in this list, and if so, it adds a message about this to the result tree. A second xsl:if element does the same for the last node in the list.
<!-- xq362.xsl: converts xq357.xml into xq363.txt -->

<xsl:template match="chapter">
  <xsl:for-each select="descendant::figure/title">
    <xsl:if test="position() = 1">
    First figure title in chapter: <xsl:value-of select="."/>
    </xsl:if>
    <xsl:if test="position() = last()">
    Last figure title in chapter: <xsl:value-of select="."/>
    </xsl:if>
  </xsl:for-each>
</xsl:template>
The result shows just what we wanted: the title of the first figure element in the document and the title of the last figure element in the document.
First figure title in chapter: "Incumbent on the Dusky Air"
Last figure title in chapter: "The Lake with Liquid Fire"
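XPath 1.0 also allows a predicate to follow a parenthesized expression, in which case the predicate filters the whole node-set that the expression selects instead of the nodes of a single location step. The following sketch is not one of this column's sample files; it assumes the chapter document shown earlier as input and should pick out the same two figure titles without an xsl:for-each loop.

```xml
<?xml version="1.0"?>
<!-- Sketch only: not part of the column's zip file. -->
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="chapter">
    <!-- The parentheses make each predicate apply to the combined
         node-set of figure titles, taken in document order. -->
    <xsl:text>First figure title in chapter: </xsl:text>
    <xsl:value-of select="(descendant::figure/title)[1]"/>
    <xsl:text>&#10;Last figure title in chapter: </xsl:text>
    <xsl:value-of select="(descendant::figure/title)[last()]"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

This form always works in document order, so the xsl:for-each approach is still the one to use when the nodes must be sorted before the first or last one is picked.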
What if we wanted the figure titles that were the first and last alphabetically? We simply add an xsl:sort instruction inside the xsl:for-each element.
<!-- xq364.xsl: converts xq357.xml into xq365.txt -->

<xsl:template match="chapter">
  <xsl:for-each select="descendant::figure/title">
    <xsl:sort/>
    <xsl:if test="position() = 1">
    First figure title in chapter: <xsl:value-of select="."/>
    </xsl:if>
    <xsl:if test="position() = last()">
    Last figure title in chapter: <xsl:value-of select="."/>
    </xsl:if>
  </xsl:for-each>
</xsl:template>
The result shows the first and last entries from an alphabetically sorted list of figure titles.
First figure title in chapter: "He Lights"
Last figure title in chapter: "The Lake with Liquid Fire"
Because the xsl:sort instruction has no select attribute to identify a sort key, a default sort key of "." is used, which uses the string-value of the current node (in this case, each node that the xsl:for-each element is counting through) as the sort key. (Last month's column described the various attributes of xsl:sort that control how the sorting is performed, along with their default values; see also my book XSLT Quickly for more on the use of the xsl:sort instruction.)
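The defaults that this xsl:sort relies on can also be written out. The sketch below is not one of this column's sample files. It assumes the chapter document shown earlier as input, states select="." and data-type="text" explicitly, and sets order="descending" (the default is "ascending") so that the alphabetically last figure title moves to position 1.

```xml
<?xml version="1.0"?>
<!-- Sketch only: not part of the column's zip file. -->
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="chapter">
    <xsl:for-each select="descendant::figure/title">
      <!-- select and data-type repeat the defaults; only order differs. -->
      <xsl:sort select="." data-type="text" order="descending"/>
      <xsl:if test="position() = 1">
        <xsl:text>Last figure title alphabetically: </xsl:text>
        <xsl:value-of select="."/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the chapter document above, this should report "The Lake with Liquid Fire", the same title that the ascending sort placed last.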
In addition to using the xsl:sort instruction to find the first and last values alphabetically, you can use it to find the first and last or greatest and smallest values for any sort key. For example, let's say we want to know who has the highest and lowest salaries of all the employees in the following list.
<employees>
  <employee hireDate="04/23/1999">
    <last>Hill</last>
    <first>Phil</first>
    <salary>100000</salary>
  </employee>
  <employee hireDate="09/01/1998">
    <last>Herbert</last>
    <first>Johnny</first>
    <salary>95000</salary>
  </employee>
  <employee hireDate="08/20/2000">
    <last>Hill</last>
    <first>Graham</first>
    <salary>89000</salary>
  </employee>
</employees>
The following template rule sorts the employee elements within the employees element by their salary, with a data-type="number" attribute telling the XSLT processor to treat the salary values as numbers and not as strings. (Otherwise, a salary of "100000" would come before a salary of "89000", because a text comparison puts "1" before "8".) As with the previous example, two xsl:if elements add messages to the result for the first and last nodes in the list that the xsl:for-each instruction is counting through.
<!-- xq367.xsl: converts xq366.xml into xq368.txt -->

<xsl:template match="employees">
  <xsl:for-each select="employee">
    <xsl:sort select="salary" data-type="number"/>
    <xsl:if test="position() = 1">
    Lowest salary: <xsl:apply-templates/>
    </xsl:if>
    <xsl:if test="position() = last()">
    Highest salary: <xsl:apply-templates/>
    </xsl:if>
  </xsl:for-each>
</xsl:template>
Because this list is sorted numerically by employee salary, the result tells us which employees have the lowest and highest salaries:
Lowest salary: Hill Graham 89000
Highest salary: Hill Phil 100000
If the employees' salary figures were stored in an attribute instead of in an element, finding the largest and smallest salary figures would be the same, except that the template would sort the employee elements using the salary attribute value as a sort key instead of the salary child element.
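As a minimal sketch of that attribute variant, assume a hypothetical version of the employees document in which each employee element carries its salary in a salary attribute (for example, an employee start-tag with salary="89000") instead of in a salary child element. This input and stylesheet are not among the column's sample files; only the sort key differs from the template above.

```xml
<?xml version="1.0"?>
<!-- Sketch only: assumes salary is an attribute of employee. -->
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="employees">
    <xsl:for-each select="employee">
      <!-- Sort on the attribute value, compared as a number. -->
      <xsl:sort select="@salary" data-type="number"/>
      <xsl:if test="position() = 1">
        <xsl:text>Lowest salary: </xsl:text>
        <!-- Here "first" and "last" are child element names, not functions. -->
        <xsl:value-of select="concat(first, ' ', last, ' ', @salary)"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
      <xsl:if test="position() = last()">
        <xsl:text>Highest salary: </xsl:text>
        <xsl:value-of select="concat(first, ' ', last, ' ', @salary)"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```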
Remember, for anything you can sort on, you can always find the first or last values of the sorted list. This makes it easy to find the biggest, smallest, earliest, latest, or whatever values the first and last entries of that sorted list represent.
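The same pattern can answer the hireDate question raised at the start of this column. The sketch below is not one of the column's sample files. It assumes the employees document shown above, with hireDate attribute values always in MM/DD/YYYY form, and rearranges each date into a YYYYMMDD sort key so that a numeric sort is also a chronological one.

```xml
<?xml version="1.0"?>
<!-- Sketch only: assumes hireDate is always MM/DD/YYYY. -->
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="employees">
    <xsl:for-each select="employee">
      <!-- Build a YYYYMMDD key: year (chars 7-10), month (1-2), day (4-5). -->
      <xsl:sort data-type="number"
           select="concat(substring(@hireDate, 7, 4),
                          substring(@hireDate, 1, 2),
                          substring(@hireDate, 4, 2))"/>
      <xsl:if test="position() = 1">
        <xsl:text>Earliest hire: </xsl:text>
        <xsl:value-of select="concat(first, ' ', last, ' ', @hireDate)"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
      <xsl:if test="position() = last()">
        <xsl:text>Latest hire: </xsl:text>
        <xsl:value-of select="concat(first, ' ', last, ' ', @hireDate)"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the sample data above, the earliest hire date belongs to Johnny Herbert (09/01/1998) and the latest to Graham Hill (08/20/2000).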