Hidden Whitespace, Hidden Meaning
January 30, 2002
Q: Too many newlines?
Every time I update an XML file via an in-house content management system, the processed XML file contains masses of empty lines between the tags. Each time I edit and save the file the gaps get bigger.
Here is the code that I think might contain the problem (XSL):
<xsl:output method="xml" indent="no" encoding="iso-8859-1" />
<xsl:template match="/blocks">
  <dl>
    <dt>
      <font size="2"><xsl:value-of select="welcometitle"/></font>
    </dt>
    <dt>
      <font size="2"><xsl:value-of select="blocktitle"/></font>
    </dt>
    <dt>
      <font size="2">
        <xsl:value-of select="acttitle"/></font>
    </dt>
    <dt>
      <font size="2"><xsl:value-of select="resourcetitle"/></font>
    </dt>
  </dl>
</xsl:template>
A: On first reading, I thought the answer was pretty straightforward -- that the extra
newlines in your output are there because your XSLT style sheet says to include them.
According to this theory, placing newlines in your style sheet's xsl:template
elements causes them to be passed, unchanged, to the result tree.
Unfortunately for the theory, that's not the case. Section 3.4 of the XSLT 1.0 Recommendation addresses how an XSLT processor is to treat whitespace-only text nodes in the style sheet. As this section says, a text node in the style sheet is preserved in the result tree only if at least one of the following conditions is true:
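- the text node's parent element is in the set of whitespace-preserving element names (in a style sheet, that set contains only xsl:text; for source documents it is controlled by the xsl:preserve-space and xsl:strip-space elements); or
- the text node contains at least one non-whitespace character, that is, something other than a space, tab, carriage return, or newline; or
- some ancestor of the text node has an xml:space="preserve" attribute, and no closer ancestor has an xml:space="default" attribute.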
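To make those three rules concrete, here is a small sketch of a style sheet. The p and pre result elements are purely illustrative and have nothing to do with the question's own code; the comments mark which whitespace-only text nodes a conforming XSLT 1.0 processor should strip and which it should keep.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- The line breaks and indentation around and inside <p> are
         whitespace-only text nodes with no preserving ancestor,
         so they are stripped from the style sheet. -->
    <p>
      <xsl:text>one</xsl:text>
    </p>

    <!-- Whitespace whose parent is xsl:text is always kept:
         this emits exactly one newline. -->
    <xsl:text>&#10;</xsl:text>

    <!-- xml:space="preserve" keeps the whitespace-only text nodes
         inside <pre>, line breaks and indentation included.
         Because <pre> is a literal result element, the xml:space
         attribute itself is also copied to the result. -->
    <pre xml:space="preserve">
      <xsl:text>two</xsl:text>
    </pre>
  </xsl:template>

</xsl:stylesheet>
```

If you genuinely want a newline in the result, the xsl:text form is usually the more predictable of the two, since it says exactly which characters to emit.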
Since none of those conditions are true for the whitespace-only text nodes which you've supplied in your code fragment, none of that extra whitespace should be passed to the result tree. You didn't say which XSLT processor you're using. I tested the style sheet against the following source tree:
<blocks>
  <welcometitle>Welcome Title</welcometitle>
  <blocktitle>Block Title</blocktitle>
  <acttitle>Act Title</acttitle>
  <resourcetitle>Resource Title</resourcetitle>
</blocks>
Both the Saxon and XT XSLT 1.0 processors indeed strip all the extraneous whitespace from the style sheet's xsl:template element. From one perspective -- the fact that your result tree is XHTML -- the question is academic, since the effect as browsed is identical, whether the newlines are present or not. From a strict perspective, though, something indeed seems off-kilter with the result you're experiencing. My only advice would be to try a different XSLT processor (assuming that's an option).
One other note: you may believe that, by specifying indent="no" in your xsl:output element, you've instructed the XSLT processor to suppress all extra newlines and whitespace. Not so! First, the indent attribute is used (when its value is "yes") to direct the processor to supply extra white space in the result, for "pretty-printing" or similar purposes. The default value is "no," so all you've done here is to affirm the default.
But there's another, maybe more important consideration as well. The XSLT spec sorts its features into two categories: required and optional. For an XSLT processor to be considered compliant with the standard, it must support all required features... and may support whatever optional ones it likes. As it happens, the indent attribute to the xsl:output element is an optional feature. So even if you do want to use the indent attribute, you must be sure to use a processor which supports it.
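As a minimal sketch of the opposite setting, the style sheet below asks for indented output. It assumes the blocks source document shown earlier; the titles and title result element names are invented for the illustration. Because honoring the request is optional, the exact layout of the result isn't guaranteed and may differ from one processor to the next.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- indent="yes" permits, but does not oblige, the processor to
       add whitespace when it serializes the result tree. -->
  <xsl:output method="xml" indent="yes" encoding="iso-8859-1"/>

  <xsl:template match="/blocks">
    <titles>
      <!-- One <title> (hypothetical name) per child of <blocks>. -->
      <xsl:for-each select="*">
        <title><xsl:value-of select="."/></title>
      </xsl:for-each>
    </titles>
  </xsl:template>

</xsl:stylesheet>
```

Running the same style sheet with indent="no" (or with the attribute left out altogether) is a quick way to find out whether a given processor pays any attention to the setting.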
[Thanks to Jason Diamond for his input on this question!]
Q: What does this XML mean?
What are "qualifiers" which I see in some of our XML documents and how can we relate it to an Oracle table column? For example, I've seen this:
<OPERAMT qualifier="UNIT" type="T">
<OPERAMT qualifier="EXTENDED" type="T">
Even that type="T" doesn't make sense....
A: Actually, your question isn't about XML as such. There are no "qualifiers" in XML documents, no OPERAMTs either. XML is just a general-purpose set of rules for defining special-purpose markup languages.
So no, your question is really about a specific XML vocabulary. I don't usually tackle vocabulary-specific questions in this space, but I did do a little research on this one. Apparently the XML vocabulary you're working with is (a variant of?) ebXML, short for Electronic Business using eXtensible Markup Language, which is used for exchanging e-business-related XML-based messages such as sales orders. There's almost too much information about ebXML in general (consult the master site to get a sense of what I mean), but I did locate one on-line resource (101KB PDF; requires Adobe Acrobat Reader) which might be helpful for the immediate question. Once in that document, do a search on OPERAMT to learn the functions of that element and its two attributes.
So there you'll have the answer, sort of, to your question. But I'm afraid it won't be the answer you're looking for, which is how to map it to an Oracle table's columns. To answer that, you've first got to know which database table (which will depend on your specific application, of course). And how to get the data from one form (XML or Oracle) to the other has little to do with what the qualifier or type attributes "mean."
Oracle is a member of the industry consortium supporting ebXML, so you might want to start by consulting the Oracle Web site. For more general information about relating XML documents to database tables, check Ron Bourret's excellent "XML and databases" site.
Q: E-mail links in XML format?
I have been building a document to handle support calls from various companies. I have created the document that allows the call to be routed to the IT rep within the respective companies. I now wish to create a link for the rep to e-mail the call to the parent company's IT department. How would I create this link in an XML format?
A: On the face of it, this question and the one preceding it are unrelated. But they both stem from a similar misunderstanding -- that data in an XML document "means" anything outside of the context of an application designed to process it. Let me answer the question simply first, and then come back to look at some deeper implications.
Establishing an e-mail link in XML assumes that you're using some kind of software, like a Web browser or e-mail reader, which is smart enough to recognize an e-mail address as an e-mail address. Given that condition, you can choose to put the e-mail address in either an element or an attribute value. For instance (respectively):
<parent_email dept="IT">it@example.com</parent_email>
or:
<parent_co dept="IT" email="it@example.com" />
As you can see, there's no real magic here, no technical wizardry; both of those examples contain an e-mail address in an XML format. The important thing is this: There's no such thing as an "e-mail link" in an XML document. Until and unless you identify a target application which "knows" e-mail, what are to our eyes obviously e-mail addresses remain plain old dumb data. The meaning of a bit of XML code is not inherent in the code; it comes about only by way of human or software interpretation of the code.
Assume your target application is a Web browser. Now you can process the first code fragment above with an XSLT style sheet to do something like the following:
<xsl:template match="parent_email">
E-mail the <a href="mailto:{.}"><xsl:value-of select="@dept"/> Department</a>
</xsl:template>
Or, for processing the second code fragment:
<xsl:template match="parent_co">
E-mail the <a href="mailto:{@email}"><xsl:value-of select="@dept"/> Department</a>
</xsl:template>
This creates in the XHTML result tree, for each occurrence of a source-tree element whose name matches the value of the xsl:template element's match attribute, an a element with an href attribute. For the sample above, the corresponding portion of the result tree will look like this:
E-mail the <a href="mailto:it@example.com">IT Department</a>
Which, of course, displays (and otherwise behaves) just fine in a Web browser.
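For readers who want to try this end to end, here is a sketch of a complete style sheet built around the first template. The support_call root element, the html/body wrapper, and the choice of the html output method are all assumptions made for the example; only parent_email and its dept attribute come from the fragments above.

```xml
<!-- Hypothetical source document:
     <support_call>
       <parent_email dept="IT">it@example.com</parent_email>
     </support_call>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Wrap the link in a minimal page (assumed structure). -->
  <xsl:template match="/support_call">
    <html>
      <body>
        <p><xsl:apply-templates select="parent_email"/></p>
      </body>
    </html>
  </xsl:template>

  <!-- {.} is an attribute value template: it inserts the string
       value of the current element, here the e-mail address. -->
  <xsl:template match="parent_email">
    <xsl:text>E-mail the </xsl:text>
    <a href="mailto:{.}"><xsl:value-of select="@dept"/> Department</a>
  </xsl:template>

</xsl:stylesheet>
```

Note that it's the style sheet, not the source document, that supplies the mailto: prefix. The XML itself still holds nothing but the address.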
Again, though, the point is that to put an e-mail address (or a lobster bisque recipe, or a Valentine's-Day love letter, or anything else) in an "XML format" doesn't do anything on its own. The data will lie there, inert, until a fellow human being -- or a software application -- comes along and recognizes it.