Overriding Concerns
November 26, 2003
Q: How do I merge two XML source trees into one?
I've tried so many things that it's driving me crazy: I want to merge or join two XML files. Which two files are to be merged are specified in a third file (call it merge.xml). This third file looks like this:
<?xml version="1.0"?>
<merge>
  <appxml>testapp.xml</appxml>
  <userxml>user.xml</userxml>
</merge>
And here are the two files to which merge.xml refers. First, testapp.xml:
<app name="testapp" lifetime="900">
  <mainmenu>
    <menu id="1" caption="test"/>
    <menu id="2" caption="another test"/>
  </mainmenu>
  <forms>
    <testform autosize="1"/>
    <testform2 autosize="0"/>
  </forms>
</app>
And here is user.xml:
<app lifetime="100">
  <mainmenu>
    <menu id="2" caption="my test"/>
    <menu id="3" caption="my menu"/>
  </mainmenu>
  <forms>
    <testform2 autosize="1"/>
  </forms>
</app>
The result must be:
<app name="testapp" lifetime="100">
  <mainmenu>
    <menu id="1" caption="test"/>
    <menu id="2" caption="my test"/>
    <menu id="3" caption="my menu"/>
  </mainmenu>
  <forms>
    <testform autosize="1"/>
    <testform2 autosize="1"/>
  </forms>
</app>
I'm using merge.xml as the source tree for the transformation. Basically I use the string-values of the appxml and userxml elements as input to the document() function, like this:
<xsl:template match="merge">
  <xsl:variable name="app_xml" select="string(appxml)" />
  <xsl:variable name="user_xml" select="string(userxml)" />
  <xsl:call-template name="domerge">
    <xsl:with-param name="app_nodes" select="document($app_xml)" />
    <xsl:with-param name="user_nodes" select="document($user_xml)" />
  </xsl:call-template>
</xsl:template>
As you can see, I want to process the nodes with a named template called domerge. But what the heck should domerge contain?
A: Although you didn't say so in your question, the basic problem is how to let the contents of user.xml override, in the result tree, those of testapp.xml. That is, testapp.xml establishes rules for how some application is to behave, and user.xml permits some or all of those rules to be overridden.
This is one of my favorite uses for simple XML. You've got an additional twist -- the third file, which specifies which files to use -- but the basic approach is the same.
To start with, here's a basic domerge named template:
<xsl:template name="domerge">
  <xsl:param name="app_nodes" />
  <xsl:param name="user_nodes" />
  <app>
    <mainmenu>
      <xsl:copy-of select="$app_nodes//menu" />
      <xsl:copy-of select="$user_nodes//menu" />
    </mainmenu>
    <forms>
      <xsl:copy-of select="$app_nodes//forms/*" />
      <xsl:copy-of select="$user_nodes//forms/*" />
    </forms>
  </app>
</xsl:template>
This doesn't do everything you need, but it gets you part of the way there. It begins by declaring the two parameters app_nodes and user_nodes, whose values you're supplying in the xsl:call-template element you've already constructed. Then it establishes the basic structure of the result tree -- an app root element, with mainmenu and forms child elements. Within each of those two children, it instantiates copies of the corresponding portions of testapp.xml and user.xml. The result tree from the stylesheet looks like this, so far:
<app>
  <mainmenu>
    <menu id="1" caption="test"/>
    <menu id="2" caption="another test"/>
    <menu id="2" caption="my test"/>
    <menu id="3" caption="my menu"/>
  </mainmenu>
  <forms>
    <testform autosize="1"/>
    <testform2 autosize="0"/>
    <testform2 autosize="1"/>
  </forms>
</app>
There are two problems with this result tree. First, it doesn't yet include any attributes for the app element. Second, the elements from testapp.xml which are overridden by user.xml -- the menu element with caption="another test" and the testform2 element with autosize="0" -- shouldn't be appearing at all.
Let's start with those attributes for the app element, name and lifetime. (Your sample code doesn't indicate that name can be overridden in user.xml, but I assume it can.) What you need to do is build each attribute using the one in testapp.xml unless the same attribute appears in user.xml. Here's one approach; the new code is the pair of xsl:attribute elements:
<app>
  <xsl:attribute name="name">
    <xsl:choose>
      <xsl:when test="$user_nodes/app/@name"><xsl:value-of select="$user_nodes/app/@name"/></xsl:when>
      <xsl:otherwise><xsl:value-of select="$app_nodes/app/@name"/></xsl:otherwise>
    </xsl:choose>
  </xsl:attribute>
  <xsl:attribute name="lifetime">
    <xsl:choose>
      <xsl:when test="$user_nodes/app/@lifetime"><xsl:value-of select="$user_nodes/app/@lifetime"/></xsl:when>
      <xsl:otherwise><xsl:value-of select="$app_nodes/app/@lifetime"/></xsl:otherwise>
    </xsl:choose>
  </xsl:attribute>
  [etc. as above]
</app>
What I've added here is a pair of xsl:attribute elements, which instantiate the name and lifetime attributes in the result tree and then assign their values (using an xsl:choose block for each attribute) depending on whether or not those attributes have been assigned values in user.xml. As desired, the start tag of the result tree's app element now looks like this:
<app name="testapp" lifetime="100">
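As a possible shortcut (an alternative sketch, not the approach used in the rest of this answer), XSLT 1.0 specifies that adding an attribute to an element replaces any existing attribute of the same name. Assuming you want every attribute of the app element to be overridable, and not just name and lifetime, you could copy the testapp.xml attributes first and the user.xml attributes second:

```xml
<app>
  <!-- Defaults: every attribute of app in testapp.xml. -->
  <xsl:copy-of select="$app_nodes/app/@*" />
  <!-- Overrides: an attribute copied here replaces one of the same name
       copied above. Both copies must come before any child elements
       are added to app. -->
  <xsl:copy-of select="$user_nodes/app/@*" />
  <!-- mainmenu and forms as in the basic template -->
</app>
```

The trade-off is that this version also passes through any attribute which appears only in user.xml, which may or may not be what you want.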
Fixing the other problem with the named template so far -- that the elements from testapp.xml which were overridden by user.xml are still showing up in the result tree -- will be a little trickier. The problem is that the plain-old xsl:copy-of elements are too indiscriminate. For processing the menu elements, you can do something like this:
<app>
  [etc. as above]
  <mainmenu>
    <xsl:for-each select="$app_nodes//menu">
      <xsl:choose>
        <xsl:when test="$user_nodes//menu/@id[.=current()/@id]"/>
        <xsl:otherwise><xsl:copy-of select="." /></xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
    <xsl:copy-of select="$user_nodes//menu" />
  </mainmenu>
  [etc. as above]
</app>
Here the xsl:copy-of for all the testapp.xml menu elements has been replaced by an xsl:for-each which examines each of those menu elements in turn. If the current menu element's id attribute is matched by one in the user.xml tree, the template does nothing; otherwise, it instantiates in the result (via a simplified xsl:copy-of) a copy of the current menu element (again, from the testapp.xml file). Note also that the xsl:copy-of for the menu elements in user.xml hasn't been changed at all; all of those menu elements go straight into the result.
Handling the form elements -- whether overridden by user.xml or not -- is similar to the solution for the menu elements. (There aren't really any form elements as such in either testapp.xml or user.xml; instead, there are testform, testform2, etc. elements. I'm referring to them as form elements just as a sort of collective shorthand.) But menu elements were "matched" (or not) between the two input files by way of their id attributes' values; form elements are matched by their element names. For example, testapp.xml has testform and testform2 elements; user.xml, only a testform2. So the structure for processing these elements is similar to that for processing the menu elements, but with a different test attribute in the xsl:when element:
<app>
  [etc. as above]
  <forms>
    <xsl:for-each select="$app_nodes//forms/*">
      <xsl:choose>
        <xsl:when test="$user_nodes//forms/*[name()=name(current())]"/>
        <xsl:otherwise><xsl:copy-of select="." /></xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
    <xsl:copy-of select="$user_nodes//forms/*" />
  </forms>
</app>
As before, the assumption is that all form elements from user.xml get transcribed straight to the result tree; it's only the testapp.xml form elements which need to be tested for inclusion.
One additional note about the xsl:when test attributes for the menu and form elements: they both use the current() function to refer to the node in testapp.xml currently being processed by their respective xsl:for-each loops. Inside an XPath predicate, this is often necessary, because the context node at such a point is often different from the current node. For example, the context node in either of these two predicates is a candidate element (or attribute) from the user.xml file, not the respective element from testapp.xml -- and it's the latter which needs to be tested before copying to the result tree.
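To make the distinction concrete, here is a small sketch of the menu loop written without current(). It captures the testapp.xml id in a variable before the predicate is evaluated. The variable name app_id is my own invention for illustration:

```xml
<xsl:for-each select="$app_nodes//menu">
  <!-- Outside any predicate, the context node is the testapp.xml menu,
       so @id here is that menu's id. -->
  <xsl:variable name="app_id" select="@id" />
  <!-- Inside the predicate, @id belongs to a user.xml menu. Writing
       [@id = @id] instead would compare each user.xml id with itself. -->
  <xsl:if test="not($user_nodes//menu[@id = $app_id])">
    <xsl:copy-of select="." />
  </xsl:if>
</xsl:for-each>
```

This should behave the same as the version using current(); which one reads better is largely a matter of taste.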
For reference, here's the final domerge named template:
<xsl:template name="domerge">
  <xsl:param name="app_nodes" />
  <xsl:param name="user_nodes" />
  <app>
    <xsl:attribute name="name">
      <xsl:choose>
        <xsl:when test="$user_nodes/app/@name"><xsl:value-of select="$user_nodes/app/@name"/></xsl:when>
        <xsl:otherwise><xsl:value-of select="$app_nodes/app/@name"/></xsl:otherwise>
      </xsl:choose>
    </xsl:attribute>
    <xsl:attribute name="lifetime">
      <xsl:choose>
        <xsl:when test="$user_nodes/app/@lifetime"><xsl:value-of select="$user_nodes/app/@lifetime"/></xsl:when>
        <xsl:otherwise><xsl:value-of select="$app_nodes/app/@lifetime"/></xsl:otherwise>
      </xsl:choose>
    </xsl:attribute>
    <mainmenu>
      <xsl:for-each select="$app_nodes//menu">
        <xsl:choose>
          <xsl:when test="$user_nodes//menu/@id[.=current()/@id]"/>
          <xsl:otherwise><xsl:copy-of select="." /></xsl:otherwise>
        </xsl:choose>
      </xsl:for-each>
      <xsl:copy-of select="$user_nodes//menu" />
    </mainmenu>
    <forms>
      <xsl:for-each select="$app_nodes//forms/*">
        <xsl:choose>
          <xsl:when test="$user_nodes//forms/*[name()=name(current())]"/>
          <xsl:otherwise><xsl:copy-of select="." /></xsl:otherwise>
        </xsl:choose>
      </xsl:for-each>
      <xsl:copy-of select="$user_nodes//forms/*" />
    </forms>
  </app>
</xsl:template>
The result tree matches your desired output.