Transforming XML Schemas
January 15, 2003
A W3C XML Schema (WXS) document contains valuable information that can be used throughout a system or application, but the complexity that WXS allows can make this difficult in practice. XSLT, however, can concisely and efficiently manipulate WXS documents in order to perform a number of tasks, including creating HTML input forms, generating query interfaces, documenting data structures and interfaces, and controlling a variety of user interface elements.
As an example, this article describes an XSLT document which creates an XHTML form based on the WXS definition of a complex element. For brevity and clarity, the article omits several WXS and XHTML form aspects, including attribute definitions, keys, imported/included schemas, and qualified name issues. How these additional features are implemented can depend greatly on your use of WXS and on your application. However, building a stylesheet that handles every possible WXS feature can be quite an effort and may often be unnecessary.
Much of the information needed to build an XHTML form (occurrence constraints, data types, special restrictions, and enumerations) is already contained in a WXS document. Missing bits such as label text and write restrictions can be added in WXS's <annotation> element.
The stylesheet will perform four distinct tasks:
- Find the definition of the target complex element that we want.
- Build a form element for the target element.
- Find the definitions of the target element's valid children.
- Build an input element for each of the simple child elements.
In order to do this, the stylesheet will apply different template rules to similar WXS elements depending on the task at hand. To make this possible, the stylesheet will use a separate mode for each task. The modes are default, targetElementForm, findChildNodes, and childNodeInput. Using a separate mode for each task will also help when modifying, expanding, or deciphering this stylesheet.
Seek and Transform I
When looking someone up in the phonebook, unique names are very helpful: "Smith, Julian Forsythe" is better than "Smith, J". Similarly, we need to name the target element uniquely. This can be tricky because an element name is not necessarily unique; multiple locally defined elements or attributes can share the same name. So for inspiration, we can look to NUNs (Normalized Universal Names), which are path statements used to uniquely identify components of a schema. NUNs made their first appearance in the Schema Formal Description Working Draft. The W3C recently issued a formal draft of NUNs under the new name Schema Component Designators (SCD).
For this exercise we will use a simplified form of NUN based on element names to identify the target element and pass it into the stylesheet as a parameter. So, for example, the NUN for the definition of the global <actionItem> element would be "element::actionItem", and the NUN for an element <employee>, local to the global <workGroup> element, would be "element::workGroup/element::employee".
This article will preface the description of a set of template rules with a model that is based on UML Activity Diagrams. In these diagrams, template rules are shown as states; modes are shown as composite states; <apply-templates> actions are shown as arrows; branching by template rule match patterns is shown as hollow circles; and other logical branches and the XSLT entry point are shown as solid circles.
The first template to fire, match='xs:schema', will extract the first element name from the NUN string. The first element name in the NUN is always that of a global element, so its definition will be a child of the <xs:schema> element.
<xsl:param name="targetNUN"/>

<xsl:template match="xs:schema">
  <xsl:variable name="NUNModified" select="concat($targetNUN,'/')"/>
  <xsl:variable name="NUNToken"
      select="substring-after(substring-before($NUNModified,'/'),'element::')"/>
  <xsl:apply-templates select="xs:element[@name=$NUNToken]">
    <xsl:with-param name="NUNRemain" select="substring-after($NUNModified,'/')"/>
  </xsl:apply-templates>
</xsl:template>
The other templates in the mode will then recursively parse each element name in the NUN, drilling down through the nested local element definitions until all names from the NUN have been parsed and the target element has been found.
<xsl:template match="xs:element[@name]">
  <xsl:param name="NUNRemain"/>
  <xsl:choose>
    <!-- Nothing left to parse: this element is the target. -->
    <xsl:when test="string-length($NUNRemain)=0">
      <xsl:apply-templates select="." mode="targetElementForm"/>
    </xsl:when>
    <!-- Otherwise peel off the next name and search the content model for it.
         The parameter is named NUNRemain so that it matches the parameter
         declared by the content model template below. -->
    <xsl:otherwise>
      <xsl:apply-templates select="xs:complexType/*">
        <xsl:with-param name="NUNRemain" select="substring-after($NUNRemain,'/')"/>
        <xsl:with-param name="NUNToken"
            select="substring-after(substring-before($NUNRemain,'/'),'element::')"/>
      </xsl:apply-templates>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

<xsl:template match="xs:sequence|xs:choice|xs:all">
  <xsl:param name="NUNRemain"/>
  <xsl:param name="NUNToken"/>
  <xsl:apply-templates select="xs:sequence|xs:choice|xs:all|xs:element[@name=$NUNToken]">
    <xsl:with-param name="NUNRemain" select="$NUNRemain"/>
    <xsl:with-param name="NUNToken" select="$NUNToken"/>
  </xsl:apply-templates>
</xsl:template>

<xsl:template match="*"/>
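To make the lookup concrete, here is a small, hypothetical schema. Only the element names actionItem, workGroup, and employee come from the NUN examples in this article; the child elements and their types are assumptions made for illustration. The comments mark the definition that each NUN should lead the default-mode templates to.

```xml
<?xml version="1.0"?>
<!-- Hypothetical schema: only the names actionItem, workGroup, and employee
     are taken from the article; everything else is assumed. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- NUN: element::actionItem -->
  <xs:element name="actionItem">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="done" type="xs:boolean" default="false"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- NUN: element::workGroup -->
  <xs:element name="workGroup">
    <xs:complexType>
      <xs:sequence>
        <!-- NUN: element::workGroup/element::employee -->
        <xs:element name="employee" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Passing element::workGroup/element::employee as the targetNUN parameter should walk from the global workGroup definition, through its xs:sequence, to the local employee definition, at which point no names remain and the targetElementForm mode takes over.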
Powers of Annotation
Now that the stylesheet has found the target element and has begun creating a form, it needs an informative title. WXS's <annotation> element lets us attach human-friendly information, and almost any kind of application-specific information we want, to components in the schema. A detailed description of the data contained in a simple element can be placed into <documentation>. For our form, we can also use the <appinfo> element to indicate whether an element is read-only and how it should be labeled.
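As a sketch of what such annotations might look like, the hypothetical schema below labels an element and marks one of its children as read-only. The frm:label and frm:readonly element names follow the ones this stylesheet looks for; the namespace URI bound to the frm prefix, the element names, and the documentation text are assumptions.

```xml
<?xml version="1.0"?>
<!-- Hypothetical example: the frm namespace URI is an assumption;
     use whatever URI your stylesheet binds to the frm prefix. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:frm="urn:example:form-hints">

  <xs:element name="actionItem">
    <xs:annotation>
      <!-- Human-readable description, copied into the form heading -->
      <xs:documentation>A task assigned to a member of a work group.</xs:documentation>
      <xs:appinfo>
        <!-- Text used as the form title -->
        <frm:label>Action Item</frm:label>
      </xs:appinfo>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string">
          <xs:annotation>
            <xs:appinfo>
              <frm:label>Title</frm:label>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>
        <xs:element name="createdOn" type="xs:date">
          <xs:annotation>
            <xs:appinfo>
              <frm:label>Created On</frm:label>
              <!-- Presence of this element tells the stylesheet to skip the input -->
              <frm:readonly/>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Because <appinfo> accepts content from any namespace, hints like these stay valid WXS and are simply ignored by tools that do not recognize them.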
The targetElementForm mode contains a single template rule, which builds the form container. It then enters the findChildNodes mode by selecting the anonymous or named complex type definitions for the element.
<xsl:template match="xs:element[@name]" mode="targetElementForm">
  <b>New <xsl:value-of select="xs:annotation/xs:appinfo/frm:label"/></b><br/>
  <xsl:copy-of select="xs:annotation/xs:documentation/node()"/>
  <form>
    <input type="hidden" name="%%elementNUN"
        value="{/xs:schema/@targetNamespace}#{$targetNUN}"/>
    <table>
      <!-- Anonymous type children plus the named global type, if any -->
      <xsl:apply-templates
          select="*|/xs:schema/xs:complexType[@name=current()/@type]"
          mode="findChildNodes"/>
      <tr><td colspan="2">
        <input type="submit" value="Save Changes"/>
      </td></tr>
    </table>
  </form>
</xsl:template>

<!-- Catch-all for this mode; the mode name must match targetElementForm -->
<xsl:template match="*" mode="targetElementForm"/>
WXS's <annotation> element will also be useful later on when building XHTML input elements for the target element's children.
Seek and Transform II
The next step is to find the definitions for the target element's child elements. Fortunately, with a bit of recursion, XSLT can easily handle the complex content models that WXS makes possible.
The templates of the findChildNodes mode will recursively walk through the content model of the target element and apply the template rules to locate the definitions of its child nodes. These few templates can handle named types, sequences, group references, and type extensions. Though not shown in these examples, a stylesheet could use these templates to track occurrence constraints and determine whether a child element is required or optional.
The first template matches content model elements that the stylesheet will just step through.
<xsl:template match="xs:complexType|xs:complexContent|xs:sequence|xs:all|xs:group[@name]"
    mode="findChildNodes">
  <xsl:apply-templates select="*" mode="findChildNodes"/>
</xsl:template>
References to global group and element definitions are resolved by the following two templates. Included and imported schemas are not shown in this example. However, one method for handling referenced WXS documents is to build a node set of the referenced WXS documents' <xs:schema> elements with the help of XSLT's document() function. You can then use your favorite XSLT 1.0 processor's node-set() extension function (.NET, MSXML, Xalan, Saxon, EXSLT) to assign the node set to a global variable, which you can use in place of /xs:schema in the following templates.
<xsl:template match="xs:group[@ref]" mode="findChildNodes">
  <xsl:apply-templates select="/xs:schema/xs:group[@name=current()/@ref]"
      mode="findChildNodes"/>
</xsl:template>

<xsl:template match="xs:element[@ref]" mode="findChildNodes">
  <xsl:apply-templates select="/xs:schema/xs:element[@name=current()/@ref]"
      mode="findChildNodes"/>
</xsl:template>
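For included and imported schemas, one possible arrangement is sketched below. It is not part of the original stylesheet. It assumes that every xs:include and xs:import carries a schemaLocation the processor can resolve, it follows only direct references (not includes of includes), and, like the rest of this article, it ignores prefixes in @ref. The variable name schemas is illustrative.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Root elements of the main schema plus each directly included or
       imported schema. Given a node set, document() loads one document per
       node and resolves each location relative to the node that holds it. -->
  <xsl:variable name="schemas"
      select="/xs:schema
              | document(/xs:schema/xs:include/@schemaLocation
                         | /xs:schema/xs:import/@schemaLocation)/xs:schema"/>

  <!-- Same lookups as before, with $schemas standing in for /xs:schema -->
  <xsl:template match="xs:group[@ref]" mode="findChildNodes">
    <xsl:apply-templates select="$schemas/xs:group[@name=current()/@ref]"
        mode="findChildNodes"/>
  </xsl:template>

  <xsl:template match="xs:element[@ref]" mode="findChildNodes">
    <xsl:apply-templates select="$schemas/xs:element[@name=current()/@ref]"
        mode="findChildNodes"/>
  </xsl:template>

</xsl:stylesheet>
```

In this particular sketch the select expression already yields a node set, so no extension function is involved; a node-set() extension becomes necessary only if the variable is built as a result tree fragment, for example by copying the schemas inside the body of xsl:variable.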
When an <extension> is encountered, the stylesheet will process the extension's base and local content models in sequence.
<xsl:template match="xs:extension" mode="findChildNodes">
  <xsl:apply-templates select="/xs:schema/xs:complexType[@name=current()/@base]"
      mode="findChildNodes"/>
  <xsl:apply-templates select="*" mode="findChildNodes"/>
</xsl:template>
Once the definition of the child element is found, the template determines whether it is a simple type and exempt from any application-specific restrictions. If this is true, it builds the table row, and then applies the templates from the childNodeInput mode to construct the input element.
<xsl:template match="xs:element[@name]" mode="findChildNodes">
  <xsl:if test="not(xs:complexType|
                    /xs:schema/xs:complexType[@name=current()/@type]|
                    xs:annotation/xs:appinfo/frm:readonly)">
    <tr>
      <td>
        <xsl:choose>
          <xsl:when test="xs:annotation/xs:appinfo/frm:label">
            <xsl:value-of select="xs:annotation/xs:appinfo/frm:label"/>
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="@name"/>
          </xsl:otherwise>
        </xsl:choose>
      </td>
      <td>
        <xsl:apply-templates select="." mode="childNodeInput">
          <xsl:with-param name="nodeName" select="@name"/>
        </xsl:apply-templates>
      </td>
    </tr>
  </xsl:if>
</xsl:template>

<xsl:template match="*" mode="findChildNodes"/>
Interpreting Simple Type Definitions
The templates of the childNodeInput mode will walk the schema to find the base type of each simple element and then output the appropriate XHTML form element for the type. If the element's type is a restriction of a base type, the templates will further modify the XHTML form element with XHTML or custom attributes.
The first two template rules are designed to match elements typed with named (schema-defined) or anonymous simple types, respectively. The priority attribute of the first template is set to zero so that it has a lower priority than the template rules matching native WXS types.
<xsl:template match="xs:element[@type]" mode="childNodeInput" priority="0">
  <xsl:param name="nodeName"/>
  <xsl:param name="nodeValue" select="@default"/>
  <xsl:apply-templates
      select="/xs:schema/xs:simpleType[@name=current()/@type]/xs:restriction"
      mode="childNodeInput">
    <xsl:with-param name="nodeName" select="$nodeName"/>
    <xsl:with-param name="nodeValue" select="$nodeValue"/>
  </xsl:apply-templates>
</xsl:template>

<xsl:template match="xs:element[xs:simpleType]" mode="childNodeInput">
  <xsl:param name="nodeName"/>
  <xsl:param name="nodeValue" select="@default"/>
  <xsl:apply-templates select="xs:simpleType/xs:restriction" mode="childNodeInput">
    <xsl:with-param name="nodeName" select="$nodeName"/>
    <xsl:with-param name="nodeValue" select="$nodeValue"/>
  </xsl:apply-templates>
</xsl:template>
The following template rules are a sample of the templates that match the native WXS types.
<xsl:template match="xs:element[@type='xs:string']|xs:restriction[@base='xs:string']"
    mode="childNodeInput">
  <xsl:param name="nodeName"/>
  <xsl:param name="nodeValue" select="@default"/>
  <xsl:choose>
    <!-- Length-limited strings get a single-line text box. -->
    <xsl:when test="xs:maxLength">
      <input type="text" name="{$nodeName}" value="{$nodeValue}">
        <xsl:apply-templates select="*" mode="childNodeInput"/>
      </input>
    </xsl:when>
    <!-- Other strings get a textarea (XHTML element names are lowercase). -->
    <xsl:otherwise>
      <textarea name="{$nodeName}">
        <xsl:apply-templates select="*" mode="childNodeInput"/>
        <xsl:value-of select="$nodeValue"/>
      </textarea>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

<xsl:template match="xs:element[@type='xs:boolean']|xs:restriction[@base='xs:boolean']"
    mode="childNodeInput">
  <xsl:param name="nodeName"/>
  <xsl:param name="nodeValue" select="@default"/>
  <!-- input is an empty element in XHTML, so the label text follows it. -->
  <input type="radio" name="{$nodeName}" value="true">
    <xsl:if test="$nodeValue='true'">
      <xsl:attribute name="checked">checked</xsl:attribute>
    </xsl:if>
  </input> Yes
  <input type="radio" name="{$nodeName}" value="false">
    <xsl:if test="$nodeValue='false'">
      <xsl:attribute name="checked">checked</xsl:attribute>
    </xsl:if>
  </input> No
</xsl:template>
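Other built-in types could be handled along the same lines. As a minimal sketch, and not part of the original stylesheet, the template below covers xs:integer with a text box. The validationRegExp attribute is a custom attribute intended for a client-side script, and the regular expression is an assumption that approximates the lexical form of xs:integer (an optional sign followed by digits). Like the templates above, it compares @type and @base as plain strings, so it depends on the schema using the xs prefix.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical addition: a text box for integer-typed elements -->
  <xsl:template match="xs:element[@type='xs:integer']|xs:restriction[@base='xs:integer']"
      mode="childNodeInput">
    <xsl:param name="nodeName"/>
    <xsl:param name="nodeValue" select="@default"/>
    <!-- validationRegExp is a custom attribute, not part of XHTML -->
    <input type="text" name="{$nodeName}" value="{$nodeValue}"
        validationRegExp="^[+-]?[0-9]+$">
      <!-- Facet templates may add further attributes or replace this one -->
      <xsl:apply-templates select="*" mode="childNodeInput"/>
    </input>
  </xsl:template>

</xsl:stylesheet>
```

The template is wrapped in a stylesheet element only so that the namespace prefixes are declared; in practice it would be merged into the main stylesheet alongside the other childNodeInput rules.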
Some restriction facets map directly to XHTML input element attributes. For example, <xs:maxLength value="10"/> maps to the maxlength="10" attribute of an <input> element. Other restriction facets, such as <xs:pattern> and <xs:maxInclusive>, have no XHTML counterpart. One approach to ensuring conformant input from the user is to add custom attributes to the XHTML <input> element and use a client-side script to validate the user's input against those attributes.
<xsl:template match="xs:maxLength" mode="childNodeInput">
  <!-- XHTML attribute names are lowercase: maxlength, not maxLength -->
  <xsl:attribute name="maxlength">
    <xsl:value-of select="@value"/>
  </xsl:attribute>
</xsl:template>

<xsl:template match="xs:pattern" mode="childNodeInput">
  <!-- Custom attribute for a client-side validation script -->
  <xsl:attribute name="validationRegExp">
    <xsl:value-of select="@value"/>
  </xsl:attribute>
</xsl:template>
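Range facets could be carried to the client in the same way. The sketch below is an assumption rather than part of the original stylesheet: the custom attribute names validationMin and validationMax are hypothetical, and a client-side script would have to be written to read them.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical: expose the lower bound as a custom attribute -->
  <xsl:template match="xs:minInclusive" mode="childNodeInput">
    <xsl:attribute name="validationMin">
      <xsl:value-of select="@value"/>
    </xsl:attribute>
  </xsl:template>

  <!-- Hypothetical: expose the upper bound as a custom attribute -->
  <xsl:template match="xs:maxInclusive" mode="childNodeInput">
    <xsl:attribute name="validationMax">
      <xsl:value-of select="@value"/>
    </xsl:attribute>
  </xsl:template>

</xsl:stylesheet>
```

These rules only take effect when the facets are processed inside an open <input> element, before any child content is written, because xsl:attribute adds the attribute to the element currently being constructed.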
This stylesheet will map restrictions that have enumeration facets, regardless of the base type, to a <select> XHTML element. WXS annotation elements can be useful here, when we need to associate human-readable text with an enumeration facet:
<xs:enumeration value="AK">
  <xs:annotation>
    <xs:documentation>Alaska</xs:documentation>
  </xs:annotation>
</xs:enumeration>

The stylesheet turns this facet into the following option:

<option value="AK">Alaska</option>
The template rule that matches each enumeration facet checks for annotation information as well as whether the enumeration is the default value.
<xsl:template match="xs:restriction[xs:enumeration]" mode="childNodeInput" priority="1">
  <xsl:param name="nodeName"/>
  <xsl:param name="nodeValue"/>
  <select name="{$nodeName}">
    <xsl:apply-templates select="*" mode="childNodeInput">
      <xsl:with-param name="nodeValue" select="$nodeValue"/>
    </xsl:apply-templates>
  </select>
</xsl:template>

<xsl:template match="xs:enumeration" mode="childNodeInput">
  <xsl:param name="nodeValue"/>
  <option value="{@value}">
    <!-- Preselect the default value; no padding inside the attribute value -->
    <xsl:if test="@value=$nodeValue">
      <xsl:attribute name="selected">selected</xsl:attribute>
    </xsl:if>
    <!-- Prefer the documentation text as the visible label -->
    <xsl:choose>
      <xsl:when test="xs:annotation/xs:documentation">
        <xsl:value-of select="xs:annotation/xs:documentation"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="@value"/>
      </xsl:otherwise>
    </xsl:choose>
  </option>
</xsl:template>

<xsl:template match="*" mode="childNodeInput"/>
Conclusion
Using WXS as the common resource for data typing in your application can have big payoffs. By allowing components and interfaces to automatically reflect changes to an application's data model, you can greatly increase the reusability and flexibility of a system. XSLT is a useful, largely platform-independent, and highly portable tool for making this possible.