Making Links, Breaking Entities
February 27, 2002
Q: Making a link
I'm trying to transform this XML:
<link value="business.html" anchor="#top">Businesses - Click Here</link>
into the obvious HTML:
<a href="business.html#top">Click Here</a>
I can get the anchor wrapped around the "Click Here" text well enough and create the href attribute with empty double quotes, using the xsl:attribute-set element. But I don't know how to transform the XML attributes to create the URL from the source XML without simply hard-coding it over and over again.
A: As you probably suspect, you're pretty close to the answer already. Here's one solution, using the xsl:attribute element:
<xsl:template match="link">
<a><xsl:attribute name="href">
<!-- @anchor already begins with "#", so the two values are simply joined -->
<xsl:value-of select="@value"/><xsl:value-of select="@anchor"/>
</xsl:attribute>Click Here</a>
</xsl:template>
And here's another, using attribute value templates (AVTs):
<xsl:template match="link">
<a href="{@value}{@anchor}">Click Here</a>
</xsl:template>
(A reminder: an AVT, coded as an XPath expression enclosed in "curly braces" -- { and } characters -- takes content directly from the source tree, particularly attribute values, and plugs it directly into the result without the somewhat clunkier intermediate step of using xsl:attribute, etc.)
Of the two approaches, I prefer the latter. For one thing, it's more concise (though arguably more cryptic); it also avoids potential problems in some XSLT processors with whitespace -- particularly newlines -- embedded within an attribute value. (You could also avoid these problems by removing the newlines from the xsl:attribute element above, at the expense of readability.)
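For context, here is a minimal sketch of a complete style sheet built around the AVT version. The xsl:output setting is my own assumption rather than anything in the question, and the href is written without a literal # on the assumption that the anchor attribute already carries its own, as it does in the sample XML above.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: HTML output, so no XML declaration is written -->
  <xsl:output method="html"/>

  <!-- Each link element becomes an a element; the AVT joins the two
       source attributes into a single href value -->
  <xsl:template match="link">
    <a href="{@value}{@anchor}">Click Here</a>
  </xsl:template>

</xsl:stylesheet>
```

Run against the link element from the question, this should yield <a href="business.html#top">Click Here</a>.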
I'm not sure what led you to believe you needed the xsl:attribute-set element. That's useful when you need to establish a group of attributes which will be used repeatedly throughout your XSLT style sheet:
<xsl:attribute-set name="xlink_stuff">
<xsl:attribute name="xlink:type">simple</xsl:attribute>
<xsl:attribute name="xlink:title">A simple link</xsl:attribute>
</xsl:attribute-set>
Then, when you need to "clone" this group of attributes for a given element in the result tree, just use the use-attribute-sets attribute to the xsl:element element:
<xsl:element name="a" use-attribute-sets="xlink_stuff">
<xsl:attribute name="href">mylink.html</xsl:attribute>
</xsl:element>
This creates an a element with the href attribute as indicated, but also with the two attributes (xlink:type and xlink:title) whose values are hard-wired by way of the xsl:attribute-set element. As you found, it's not easy to create an xsl:attribute-set element with content which varies from one portion of the result tree to another. (It's the wrong tool for that job: xsl:attribute-set is a top-level XSLT element, designed for attributes that are the same wherever the set is used. Among other things, this means that its contents can see only top-level variables and parameters, not anything you've computed inside the template handling -- in this case -- your link element.)
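The two techniques combine readily. The following is a sketch only, assuming the usual XLink namespace URI for the xlink prefix (the question never declares one) and an anchor attribute that already includes its leading #. On a literal result element such as a, the attribute is spelled xsl:use-attribute-sets, with the prefix.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xlink="http://www.w3.org/1999/xlink">

  <!-- The attributes that never change live in the attribute set -->
  <xsl:attribute-set name="xlink_stuff">
    <xsl:attribute name="xlink:type">simple</xsl:attribute>
    <xsl:attribute name="xlink:title">A simple link</xsl:attribute>
  </xsl:attribute-set>

  <!-- The value that differs for each link comes from an AVT -->
  <xsl:template match="link">
    <a xsl:use-attribute-sets="xlink_stuff"
       href="{@value}{@anchor}">Click Here</a>
  </xsl:template>

</xsl:stylesheet>
```

The fixed attributes stay in one place, and only the per-link href is computed from the source tree.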
As an aside, the code you posted on the XML.com forum used all-uppercase HTML element names. I strongly advise getting used to all-lowercase instead (which is why I changed it in the code samples above); this will help, in a small way, to shepherd you into the brave new world of XHTML.
Q: Declaring entities with XML Schema?
I was trying to find a way to declare entities in a W3C [XML] Schema (&amp;, &lt;, but also &ecirc;, &euml;, etc.) so they can be implemented in the final XML document, and then transformed with an XSLT style sheet to the correct HTML equivalents. Now of course I can declare one or multiple elements in my XSD, but there must be another way. (John here showed me, but with DTDs, and I don't want to use them.)
A: Welcome to an ugly truth about the W3C's XML Schema language, freely (or otherwise) conceded by supporters as well as detractors. There is currently no way to declare entities with XML Schema.
The first time most people encounter this reality, it comes as something of a shock. Isn't XML Schema supposed to be "DTDs on steroids"? Well, yes. XML Schema can do loads of things which DTDs cannot. But DTDs can do one thing which XML Schema can't even touch, namely, declare entities.
The reasons for this are pretty straightforward if you think about entities and entity references in the right way. A general or character entity reference of the kind you're asking about -- like &amp; for the ampersand, or &copy; for the copyright symbol, or &ora; for the string "O'Reilly and Associates" -- is simply a convenience. It's a way of representing in an XML document something which otherwise has special meaning to an XML parser (like the ampersand), or is unavailable on an input device (e.g., a keyboard), or in the chosen character encoding for the document (like the copyright symbol), or is simply too verbose and difficult to maintain in its fully expanded form. Thus, an entity reference works something like a word-processor macro -- a hot key, if you will, which has effects (that is, behaves) in certain predefined ways. And as with a word-processor macro, the entity reference's behavior exists on a conceptual plane outside the scope of the document's contents.
Note that I'm talking here about entity references, not the text (in this case) to which they refer. This substitution text is very much a part of the document's content, but it "exists" -- and hence can be subject to a schema's structural constraints, or even a DTD's content model -- only after the substitution has taken place. The entity reference exists in the document prior to parsing and substitution; following parsing and substitution, the reference has, as it were, simply evaporated.
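To see the evaporation concretely, consider this small sketch. The memo element is a made-up name for illustration, and the entity is declared in an internal DTD subset.

```xml
<?xml version="1.0"?>
<!DOCTYPE memo [
  <!-- The declaration: an entity name and its substitution text -->
  <!ENTITY ora "O'Reilly and Associates">
]>
<memo>Published by &ora;.</memo>
```

By the time a parser hands this document to an application (an XSLT processor or a schema validator, say), the content of memo is the plain string "Published by O'Reilly and Associates." with no trace of &ora; left in it.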
Take a look at the XML Recommendation, Sections 3 (Logical Structures) and 4 (Physical Structures). Notice anything unusual? All the familiar pieces of content models and attributes -- those things which both DTDs and XML Schema address -- fall into the category of logical structures. The physical-structures section is devoted entirely to entity and notation declarations: that is, syntactic or lexical constructs which in themselves do not comprise logical chunks of an XML document and are hence invisible to XML Schema processing.
By the way, an XSD is itself an XML document, of course, so there's nothing preventing you from using entities within the Schema itself. (This is a little perverse, requiring the Schema to use a DTD to declare those entities.) You just can't use XML Schema to declare entities for use in other documents. (Appendix C, "Using Entities," of the XML Schema "Part 0: Primer" Recommendation, describes what seems to me a twisted approximation of entity declaration using XML Schema. Feel free to refer to this Frankenstein's monster of a "solution" if you're determined to go the Schema route but still want to declare "entities" -- really, in this case, just elements with fixed content.)
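For the curious, the elements-with-fixed-content idea looks roughly like this. It is a sketch in the spirit of that appendix, not a copy of it; the city element, the sample text, and the absence of a target namespace are all my own assumptions.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Stand-in for an "eacute" entity: an element whose content is
       fixed to the character itself -->
  <xsd:element name="eacute" type="xsd:token" fixed="&#xE9;"/>

  <!-- A hypothetical element with mixed content, so that text and
       eacute elements can be interleaved -->
  <xsd:element name="city">
    <xsd:complexType mixed="true">
      <xsd:sequence>
        <xsd:element ref="eacute" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <!-- Instance usage: <city>Montr<eacute/>al</city> -->

</xsd:schema>
```

A schema-aware processor can supply the fixed value for the empty eacute element, but a plain XML parser or XSLT processor sees only an empty element, so the style sheet would still need a template that turns eacute into the character.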
It may be small consolation, but consider this: as bad as your problem (wanting to construct character entities) seems, the poor folks who want to use more exotic kinds of entities (as with notations) are really cast adrift by XML Schema. It will be tedious for you to include the literal substitution text in your Schema-validated documents, instead of entity references. But at least you can get that substitution text into the final document somehow. There's no counterpart in XML Schema at all for referring to non-XML content like notations.