Declaring Keys and Performing Lookups
February 6, 2002
When you need to look up values based on some other value -- especially when your stylesheet needs to do it a lot -- XSLT's xsl:key instruction and key() function work together to make it easy. They can also make it fast. To really appreciate the use of keys in XSLT, however, let's first look at one way to solve this problem without them. Let's say we want to add information about the shirt elements in the following document to the result tree, with the color names instead of the color codes in the result.
<shirts>
  <colors>
    <color cid="c1">yellow</color>
    <color cid="c2">black</color>
    <color cid="c3">red</color>
    <color cid="c4">blue</color>
    <color cid="c5">purple</color>
    <color cid="c6">white</color>
    <color cid="c7">orange</color>
    <color cid="c7">green</color>
  </colors>
  <shirt colorCode="c4">oxford button-down</shirt>
  <shirt colorCode="c1">poly blend, straight collar</shirt>
  <shirt colorCode="c6">monogrammed, tab collar</shirt>
</shirts>
We want the output to look like
blue oxford button-down
yellow poly blend, straight collar
white monogrammed, tab collar
The following stylesheet has an xsl:value-of instruction that uses an XPath expression to retrieve the contents of the colors element's appropriate color child. It does this by finding, for each shirt element, the color element whose cid attribute value matches the shirt element's colorCode attribute value. (For example, it takes the colorCode value of "c4" for the first shirt element and searches through the colors element's color children to find one with a cid attribute that has that same value: the one with "blue" as its contents.) Above that xsl:value-of element, an xsl:variable instruction sets the shirtColorCode variable equal to the shirt element's colorCode attribute value, and the XPath expression has a predicate of [@cid = $shirtColorCode] to get only the color element whose cid attribute has the same value as the shirtColorCode variable.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="shirt">
    <xsl:variable name="shirtColorCode" select="@colorCode"/>
    <xsl:value-of select="/shirts/colors/color[@cid = $shirtColorCode]"/>
    <xsl:text> </xsl:text><xsl:apply-templates/><xsl:text> </xsl:text>
  </xsl:template>

  <xsl:template match="color"/>
</xsl:stylesheet>
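The variable is a convenience rather than a requirement. As a minimal sketch of an alternative (not the approach this column uses), XSLT's current() function can refer to the shirt element being processed from inside the predicate, where "." would mean the color element being tested:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="shirt">
    <!-- current() is the shirt node this template is processing, so
         current()/@colorCode plays the role of $shirtColorCode. -->
    <xsl:value-of select="/shirts/colors/color[@cid = current()/@colorCode]"/>
    <xsl:text> </xsl:text><xsl:apply-templates/><xsl:text> </xsl:text>
  </xsl:template>

  <!-- Keep the color names themselves out of the result. -->
  <xsl:template match="color"/>
</xsl:stylesheet>
```

This version still searches the color elements with an XPath predicate for each shirt, so it does nothing for speed; it only shortens the template.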
This produces the desired output, but the XPath expression may make the processor search through the color elements again for every shirt element, so if you have a lot of shirt elements whose colors need to be looked up, creating the result tree could go slowly. Declaring and using keys can make it go much faster, because an XSLT processor that sees that you've declared a key usually sets up an index in memory to speed these lookups.
The next stylesheet does the same thing as the previous one by using the xsl:key instruction to declare the nodes and values used for the color name lookups and the key() function to actually perform the lookups.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:key name="colorNumKey" match="color" use="@cid"/>

  <xsl:template match="colors"/>

  <xsl:template match="shirt">
    <xsl:value-of select="key('colorNumKey',@colorCode)"/>
    <xsl:text> </xsl:text><xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
The xsl:key element has three attributes:
- The name attribute holds the name of the lookup key. The key() function uses this name to identify what kind of lookup it's doing.
- The match attribute holds a match pattern identifying the collection of nodes where the lookups will take place. In the example, the color elements are this collection. The fact that they are enclosed by a colors element gives the source document a little more structure, but it's not necessary for the key lookups to work.
- The use attribute specifies the part or parts of the match attribute's collection of nodes that will be used to find the appropriate node -- in other words, it specifies the index of the lookup. In the example, this index is the cid attribute of the color elements, because a lookup will pass along a color ID string to look up the corresponding color name.
[Figure: Using an xsl:key element and key() function.]
The diagram above shows the four steps that take place for one particular lookup:
- The xsl:value-of element for the shirt template has a key() function that says "pass the colorCode attribute value to the colorNumKey key to get this value".
- For the oxford button-down shirt element, this value is "c4".
- The colorNumKey key sends the XSLT processor to look for this value in the cid attributes of the color elements.
- It finds it and returns the matching color element, whose value the xsl:value-of element adds to the result tree.
If these color IDs and names were in a table, you could think of the table as the "colorNumKey" lookup table, the nodes named by the match attribute as the rows of the table, and the value or values named by the use attribute as the index field (or fields) of the table.
These color elements would fit nicely into a table, but the beauty of doing this with XSLT (and XML) is that the elements named by your match attribute can have structures that are much more complex than any relational database table row. You have the full power of XML available, and the ability to use an XPath expression in the use attribute lets you identify any part of that structure you want to use as the lookup key.
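As a rough sketch of what that can look like, assume a hypothetical variation of the shirts document in which each color element stores its ID and name in child elements called id and name (these element names are made up for illustration and are not part of the sample document). The use attribute can then point at the id child, and the lookup can step down to the name child of whatever node the key returns:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Hypothetical input shape (an assumption for this sketch):
       <color><id>c4</id><name>blue</name></color>
       The key indexes each color element by the value of its id child. -->
  <xsl:key name="colorNumKey" match="color" use="id"/>

  <xsl:template match="colors"/>

  <xsl:template match="shirt">
    <!-- key() returns the whole color element; /name picks one child. -->
    <xsl:value-of select="key('colorNumKey', @colorCode)/name"/>
    <xsl:text> </xsl:text><xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
```

The shirt template barely changes from the earlier one; only the use expression and the final location step know about the richer structure.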
The key() function performs the actual lookup. It takes a value, searches through the keys for one whose use value is equal to the one it's looking for, and returns the element or elements that have that key value. The example's template rule for the shirt elements calls this function to insert the color name before each shirt element's contents. The two arguments it passes to this function are the name of the key ("colorNumKey", the name of the lookup "table") and the value to use to look up the needed value: the shirt element's colorCode attribute value.
Because the key() function returns the node or nodes that the lookup found, you can use the function call as part of an XPath expression to pull an attribute value, subelement, or other subnode out of the returned node. For example, if the color elements had a PMSnum attribute, and you wanted to insert this attribute value instead of the color elements' actual content, you could use a value of "key('colorNumKey',@colorCode)/@PMSnum" for the xsl:value-of element's select attribute. Because the entire color node was used in the example above, its character data contents (the part between the color start- and end-tags) got added to the result tree.
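Here is a minimal sketch of that PMSnum variation. It assumes the color elements in the source document have been given a PMSnum attribute, which the sample document shown earlier does not have:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Same key as before: color elements indexed by their cid attribute. -->
  <xsl:key name="colorNumKey" match="color" use="@cid"/>

  <xsl:template match="colors"/>

  <xsl:template match="shirt">
    <!-- Assumes a hypothetical PMSnum attribute on each color element.
         The extra location step selects that attribute of the node
         returned by key() instead of the node's text content. -->
    <xsl:value-of select="key('colorNumKey', @colorCode)/@PMSnum"/>
    <xsl:text> </xsl:text><xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
```

If a color element has no PMSnum attribute, the expression selects nothing and xsl:value-of adds nothing to the result tree for that shirt.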
Let's experiment with this color lookup table a little more. The following template demonstrates several things you can do with declared keys in XSLT using the same shirts source document as the last example.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:key name="colorNumKey" match="color" use="@cid"/>
  <xsl:key name="colorKey" match="color" use="."/>

  <xsl:variable name="testVar">c4</xsl:variable>
  <xsl:variable name="keyName">colorKey</xsl:variable>

  <xsl:template match="colors">
  Looking up the color name with the color ID:
    c3's color: <xsl:value-of select="key('colorNumKey','c3')"/>
    c4's color: <xsl:value-of select="key('colorNumKey',$testVar)"/>
    c8's color: <xsl:value-of select="key('colorNumKey','c8')"/>
    c7's colors: <xsl:for-each select="key('colorNumKey','c7')">
      <xsl:value-of select="."/><xsl:text> </xsl:text>
    </xsl:for-each>
  Looking up the color ID with the color name:
    blue's cid: <xsl:value-of select="key('colorKey','blue')/@cid"/>
    black's cid: <xsl:value-of select="key($keyName,'black')/@cid"/>
    gray's cid: <xsl:value-of select="key('colorKey','gray')/@cid"/>
  </xsl:template>

  <!-- Don't bother outputting shirt contents for this example. -->
  <xsl:template match="shirt"/>
</xsl:stylesheet>
Before discussing what it does, let's look at the result it creates.
  Looking up the color name with the color ID:
    c3's color: red
    c4's color: blue
    c8's color:
    c7's colors: orange green
  Looking up the color ID with the color name:
    blue's cid: c4
    black's cid: c2
    gray's cid:
The first three xsl:value-of instructions use the same "colorNumKey" key that the previous example did. The first xsl:value-of instruction passes the literal string "c3" as the index value to look up, and the result shows that "c3" is the key for the color "red". The second shows how a variable can be used for this argument to the key() function: an xsl:variable instruction near the beginning of the stylesheet declares a testVar variable with a value of "c4", and when the XSLT processor uses this variable to look up a color name, the result shows that this finds the color "blue".
The third xsl:value-of instruction in the stylesheet passes the string "c8" to use for the lookup, and there is no color element with a cid attribute value of "c8", so nothing shows up in the result tree after "c8's color:".
The next part of the template looks up the value "c7". The document has two color elements with a cid value of "c7", so the template uses an xsl:for-each instruction instead of an xsl:value-of one to add both of them to the result tree. (If it had used xsl:value-of, only the first would have appeared in the result.) A key() function can return multiple nodes, and this one does, so the xsl:for-each instruction iterates through the "c7" nodes, printing the value and a space (using an xsl:text element for the latter) for each.
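Because key() returns an ordinary node-set, the usual node-set tools apply to it. As a small sketch using the same shirts document, the stylesheet below counts the nodes that a lookup returns and uses position() and last() inside the xsl:for-each to put commas between the values rather than a space after each one. The wording of the output text is only illustrative.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:key name="colorNumKey" match="color" use="@cid"/>

  <xsl:template match="colors">
    <xsl:text>c7 matches </xsl:text>
    <!-- count() works on the node-set that key() returns. -->
    <xsl:value-of select="count(key('colorNumKey','c7'))"/>
    <xsl:text> colors: </xsl:text>
    <xsl:for-each select="key('colorNumKey','c7')">
      <xsl:value-of select="."/>
      <!-- Add a separator after every node except the last one. -->
      <xsl:if test="position() != last()">, </xsl:if>
    </xsl:for-each>
  </xsl:template>

  <!-- Don't bother outputting shirt contents for this example. -->
  <xsl:template match="shirt"/>
</xsl:stylesheet>
```

With the sample document, which has two color elements with a cid of "c7", this should add "c7 matches 2 colors: orange, green" to the result.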
The beginning of this stylesheet declares two keys: "colorNumKey" is the same one we saw in the previous stylesheet, and "colorKey" is the one used by the remaining xsl:value-of instructions in this new stylesheet. Its use attribute names the color elements' contents (".") as the lookup index, and each of the three xsl:value-of elements passes this key a color name to look up the node instead of passing a string to match against the color elements' cid values. The entire color node still gets returned, and these three xsl:value-of elements each pull the cid attribute value out of this node by adding a slash and "@cid" to make a second location step for the XPath expression in each xsl:value-of element's select attribute.
So, instead of passing a color ID value to get a color name, these last three lookups are each passing a color name to get a color ID. They're looking up the same type of node in the same set of nodes using a different part of those nodes as the lookup index. Getting back to the table analogy, it's like looking up rows in the same table that we used before but using a different column as the key field.
The first of these last three lookups passes the string "blue", and the XSLT processor adds "c4" as the corresponding color ID to the result tree. The second passes the string "black", but unlike any of the lookups before, this one identifies the key name by using a variable instead of a hardcoded string: $keyName, which was set to "colorKey" near the beginning of the stylesheet. This causes no problems, and the "c2" color ID corresponding to "black" gets added to the result tree.
The last key() function call tries to look up the color name "gray", and there is none in the key. The function returns nothing, and nothing gets added after the text node "gray's cid" in the result tree.
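When an empty result should not pass silently, you can test what key() returns before using it, because an empty node-set counts as false in an XPath boolean test. The following is a minimal sketch, not part of the original example: it assumes a shirt element might carry a colorCode that matches no color element, and the "(unknown color)" placeholder text is an arbitrary choice for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:key name="colorNumKey" match="color" use="@cid"/>

  <xsl:template match="colors"/>

  <xsl:template match="shirt">
    <!-- Hold the lookup result so it is only written once. -->
    <xsl:variable name="colorNode" select="key('colorNumKey', @colorCode)"/>
    <xsl:choose>
      <!-- A non-empty node-set is true: the lookup found something. -->
      <xsl:when test="$colorNode">
        <xsl:value-of select="$colorNode"/>
      </xsl:when>
      <!-- An empty node-set is false: substitute placeholder text. -->
      <xsl:otherwise>
        <xsl:text>(unknown color)</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
    <xsl:text> </xsl:text><xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
```

For the three shirts in the sample document every lookup succeeds, so the xsl:otherwise branch would only come into play for a shirt whose colorCode has no matching cid.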
The lookup keys don't have to be in the same document as the elements that trigger the lookup. If the example document's colors element had been in a separate document in a separate file, you could still declare its contents as a key and use it for looking up the shirt colors in this document. This ability to look something up in an external data source lets you develop some very powerful document processing systems. In the next "Transforming XML" column, we'll see how to read in multiple documents and, among other things, use one for lookups like these. (If you're in a real hurry to find out how, see my book XSLT Quickly, from which these columns are excerpted.)