XML Namespaces Don't Need URIs
April 13, 2005
The decision to identify XML namespaces with URIs was an architectural mistake that has caused much suffering for XML users and needless complexity for XML tools. Removing namespace URIs altogether and simply using namespace prefixes to identify namespaces would make it easier for people as well as software to read, write, and process XML.
Background
In XML 1.0, element and attribute names were treated as atomic tokens with no interior structure.
Namespaces in XML introduced the concept of element and attribute names existing in namespaces. Namespaces are identified by URIs and bound to namespace prefixes. It is also possible to bind a default namespace to the empty prefix. This namespace will then apply to all elements that have no prefix.
For example, the XSLT elements exist in the http://www.w3.org/1999/XSL/Transform namespace, which is traditionally bound to the xsl namespace prefix:
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:variable name="foo" select="1"/>
  ...
</xsl:transform>
A namespace-aware XML processor will internally resolve these element names into tuples containing the namespace URI, the namespace prefix, and the local name:
{ http://www.w3.org/1999/XSL/Transform, xsl, transform }
{ http://www.w3.org/1999/XSL/Transform, xsl, variable }
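To see this resolution in practice, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The helper name split_name is made up for this illustration. Note that ElementTree keeps only the namespace URI and the local name, in the form {uri}local, and does not expose the prefix on parsed elements, which is one way of treating the prefix as irrelevant.

```python
import xml.etree.ElementTree as ET

XSLT_NS = "http://www.w3.org/1999/XSL/Transform"

doc = (
    '<xsl:transform xmlns:xsl="' + XSLT_NS + '" version="1.0">'
    '<xsl:variable name="foo" select="1"/>'
    '</xsl:transform>'
)


def split_name(tag):
    """Split ElementTree's '{uri}local' tag form into (uri, local).

    Hypothetical helper for this sketch; tags with no namespace
    come back as (None, tag).
    """
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return None, tag


root = ET.fromstring(doc)
# One (namespace URI, local name) pair per element, in document order.
names = [split_name(el.tag) for el in root.iter()]
for uri, local in names:
    print(uri, local)
```

Both elements come back paired with the full namespace URI, even though the document only ever spelled out the xsl prefix on them.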
The particular namespace prefix used is supposed to be irrelevant, but in practice people agree on common namespace prefixes for clarity, as it would be very confusing if everyone used different ones. Here are some familiar namespace prefixes from the W3C:
Prefix  URI
xml     http://www.w3.org/XML/1998/namespace
xsl     http://www.w3.org/1999/XSL/Transform
fo      http://www.w3.org/1999/XSL/Format
xsd     http://www.w3.org/2001/XMLSchema
html    http://www.w3.org/1999/xhtml
svg     http://www.w3.org/2000/svg
Looking at this list one might wonder why it should be necessary to specify the namespace URI at all, considering that these namespaces already have a standard prefix that is far more concise and easy to remember. Using URIs to identify namespaces is a problematic approach with many usability flaws, all of which would be solved if namespaces were identified by the namespace prefix instead.
What Is Wrong with Namespace URIs?
Namespace URIs Have Terrible Syntax
As seen in the table above, namespace URIs tend to be long and cryptic, with lots of punctuation and case-sensitive text. In this instance the W3C has compounded the problem by adding dates to ensure that the namespace URIs are unique, as if it were likely that the W3C would create another "XSL/Transform" or "xhtml" namespace in the future.
While namespace URIs may be guaranteed to be unique, they are also guaranteed to be impossible to remember. Quick, without checking, can you remember if the namespace URI for W3C XML Schema ends with "xmlschema", "XML/Schema", or "XMLSchema"? Was the namespace URI for SVG allocated in 1999, 2000, or 2001?
The opaque nature of these namespace URIs is inconvenient for users, who must begin each new XML document with a ritual of carefully copying and pasting all of the namespace declarations from the last document that they were working on. If a namespace URI is typed even slightly wrong, the XML document will lose its intended meaning and software will fail to process it.
HTTP URIs Are a Poor Choice for Namespaces
HTTP URIs are often used as namespace URIs. However, most software treats HTTP URIs as resource locators, not identifiers. For example, the requirement to type namespace URIs exactly as they appear is at odds with the standard practice for HTTP URIs, which usually have many equivalent forms:
http://w3.org/1999/XSL/Transform
http://www.w3c.org/1999/XSL/Transform
http://www.w3.org/1999/XSL//Transform
http://www.w3.org:80/1999/XSL/Transform
http://www.w3.org/1999/XSL/Transform
All of these HTTP URIs will return the same web page if entered into a browser, but only the last one is the correct namespace URI for XSLT. This clashes with user expectations, to put it mildly. The one potential advantage of using HTTP URIs would be that they could act as links to useful resources, but in practice most people don't bother doing this. This lack of interest is most strikingly observed with the XSLT and XSL-FO namespaces, which point to brief documents saying "Someday a schema for XSL Transforms will live here" and "This is another XSL namespace" respectively.
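The mismatch is easy to reproduce. The following sketch, again assuming Python's standard xml.etree.ElementTree module, binds the xsl prefix to each of the five URIs listed above and checks whether the root element ends up with the XSLT name. The function is_xslt_root is a hypothetical helper written for this example. Namespace names are matched as exact strings here, with no HTTP-style normalization.

```python
import xml.etree.ElementTree as ET

XSLT_NS = "http://www.w3.org/1999/XSL/Transform"

# The five "equivalent" HTTP URIs from the list above.
variants = [
    "http://w3.org/1999/XSL/Transform",
    "http://www.w3c.org/1999/XSL/Transform",
    "http://www.w3.org/1999/XSL//Transform",
    "http://www.w3.org:80/1999/XSL/Transform",
    "http://www.w3.org/1999/XSL/Transform",
]


def is_xslt_root(uri):
    """Parse a one-element document and compare its root to the XSLT name."""
    root = ET.fromstring('<xsl:transform xmlns:xsl="%s" version="1.0"/>' % uri)
    return root.tag == "{%s}transform" % XSLT_NS


results = [is_xslt_root(uri) for uri in variants]
for uri, ok in zip(variants, results):
    print(uri, "matches" if ok else "does not match")
```

Only the last variant matches. The other four parse without complaint, but they land in four different namespaces that no XSLT processor would recognize.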
There was an effort to develop RDDL (Resource Directory Description Language) expressly for creating documents to sit at the end of HTTP namespace URIs and direct XML tools to associated resources such as style sheets, schemas, and documentation. It is not used by any tools on the Web and with good reason: there are better ways to associate resources with individual XML documents.
Aside: Why were URIs chosen over better alternatives?
It is not difficult to construct a better syntax than HTTP URIs for unique identifiers. A good existing example is the syntax used to identify Java packages:
org.w3.xsl.transform
Look at the difference. The identifier is all lowercase to make it easier to remember, the redundant http://www. that wastes the first 11 characters of so many namespace URIs is gone, as are all the slashes.
Given that Java predated the XML Namespaces specification, one can only assume that URIs were chosen to identify namespaces for reasons other than syntactical convenience, such as their intended use in the RDF/XML syntax.
Namespace URIs Don't Help People Read or Write XML
Namespace URIs give people the ability to write XML documents with arbitrary prefixes like <foo:schema> or <superluminal:transform>, but people don't do that, sensibly enough, as it would be confusing and serve no purpose. Since people are already identifying namespaces using sensible namespace prefixes, having to write the namespace URIs as well is just a hindrance.
Namespace URIs don't help people to read XML documents either. They add an unnecessary level of indirection that makes XML documents harder to interpret, as looking at an element name is no longer enough to tell you exactly what that element is. When you read an XML document beginning with <html>, or <svg>, or <xsl:transform>, or <xsd:schema>, should it really be necessary to carefully check that the namespace prefix is bound to the correct namespace URI?
Since namespace URIs don't help people to read or write XML documents, why should XML tools complain if they are omitted? Namespace URIs do not fit in with the goals of XML, which has been designed to be produced and/or consumed by people as well as software.
Could Namespaces Work Without URIs?
If namespace URIs were removed and namespaces were identified solely by namespace prefixes instead, namespaces would still make sense and existing XML specifications would only require minor alterations.
XSLT Without Namespace URIs
XSLT is one of the few XML languages that actually relies on namespaces for disambiguation, specifically to distinguish XSLT elements that are processed specially from other elements, which are output verbatim. XSLT also has the requirement to perform namespace rewriting in order to be able to output elements that are in the XSLT namespace without actively processing them, similar to quoting or escaping in other programming languages.
However, XSLT has no need for namespace URIs. An XSLT processor could instead treat any element with an xsl prefix as being in the XSLT namespace and process it accordingly. Elements with a different prefix or no prefix would be output verbatim in the usual manner, and namespace prefix rewriting would also take place as normal using the existing XSLT aliasing mechanism:
<xsl:namespace-alias stylesheet-prefix="axsl" result-prefix="xsl"/>
Removing all of the namespace URIs from an XSLT transform will make it easier to read and write but will not affect the way it is processed, so why require namespace URIs for XSLT?
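As a rough illustration of how little machinery prefix-based dispatch would need, here is a sketch in Python. It is not how any existing XSLT processor works. It uses the standard xml.parsers.expat module with namespace processing left off, so element names arrive as raw prefix:local strings and undeclared prefixes are accepted. The function classify_by_prefix and the sample stylesheet are invented for this example.

```python
import xml.parsers.expat


def classify_by_prefix(xml_text, xslt_prefix="xsl"):
    """Label each start tag as an XSLT instruction or a literal result element.

    Hypothetical helper: the decision is made on the prefix alone,
    with no namespace URI involved.
    """
    seen = []

    def start_element(name, attrs):
        prefix, sep, _local = name.partition(":")
        is_instruction = sep == ":" and prefix == xslt_prefix
        seen.append((name, "instruction" if is_instruction else "literal"))

    # No namespace_separator argument, so expat reports raw qualified names.
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.Parse(xml_text, True)
    return seen


# A stylesheet with no namespace declarations at all.
stylesheet = (
    '<xsl:transform version="1.0">'
    '<xsl:template match="/">'
    '<html><body><xsl:value-of select="title"/></body></html>'
    '</xsl:template>'
    '</xsl:transform>'
)

kinds = classify_by_prefix(stylesheet)
for name, kind in kinds:
    print(name, kind)
```

The three xsl-prefixed elements are classified as instructions and the html and body elements as literal output, which is the same split a namespace-aware processor would arrive at by way of the URI.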
XHTML Without Namespace URIs
XHTML documents rarely use namespace prefixes, as many web browsers are not XML-aware and do not expect to see them. In any case, a root element of <html> should be sufficient to identify an XHTML document; there is no pressing need to add the namespace URI as well. Current W3C practice encourages XHTML documents to accumulate the namespace URIs for XHTML, SVG, MathML, XForms, XML Schema, XML Events, and who knows what else. All of these have simple prefixes that are sufficient to identify the namespace in question, so there is no reason to place this burden on users. XHTML does not need namespace URIs.
RDF/XML Without Namespace URIs
RDF/XML, the XML syntax for RDF that seems to have been the driving force for the adoption of namespace URIs, does not need namespace URIs. Or to be more accurate, it would be trivial to define a method of binding URIs to namespace prefixes specifically for RDF/XML, without forcing it to be a standard that applied to all XML documents. Given that RDF/XML is not an ideal syntax for representing RDF anyway (numerous superior alternatives exist), it is unfortunate that it has imposed such a clumsy namespace mechanism on the wider XML community.
The Default Namespace
There are some occasions, such as modular XHTML, where people may wish to write elements without namespace prefixes that are nonetheless in a namespace. This could be done with an attribute like xmlns; let's call it xml:ns, just for fun:
<blockquote xml:ns="html"> ... </blockquote>
An explicit namespace prefix is probably a better choice though, as it makes each element stand alone, with a fixed meaning that cannot be changed at the whim of its ancestors.
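The point about ancestors can be demonstrated with the existing xmlns attribute (not the proposed xml:ns, which no parser understands). In this sketch, which assumes Python's standard xml.etree.ElementTree module and an invented sample document, two identical unprefixed title elements end up with different names purely because of the default namespace declared on their parents.

```python
import xml.etree.ElementTree as ET

# The same unprefixed <title/> appears under two different default namespaces.
doc = (
    '<root>'
    '<a xmlns="http://www.w3.org/1999/xhtml"><title/></a>'
    '<b xmlns="http://www.w3.org/2000/svg"><title/></b>'
    '</root>'
)

root = ET.fromstring(doc)
# Collect the resolved names of every element whose local name is "title".
titles = [el.tag for el in root.iter() if el.tag.endswith("}title")]
for tag in titles:
    print(tag)
```

The first title resolves to the XHTML namespace and the second to the SVG namespace, although the two start tags are character-for-character identical in the source.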
QNames in Text Content
One of the uglier architectural warts that namespaces have introduced to XML is the use of qualified names in text content:
<foo:message status="foo:severe" ...
The problem, of course, is that according to the current specification of XML Namespaces, namespace prefixes are supposed to be irrelevant and may be changed without altering the meaning of the document. Unless it uses namespace prefixes in text content, in which case the namespace prefixes become very significant indeed. Why not just drop the URIs, admit that the namespace prefixes are significant, and end the whole pointless charade?
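Here is a small sketch of how this goes wrong in practice, assuming Python's standard xml.etree.ElementTree module. The document and the urn:example:foo namespace name are made up for the illustration. ElementTree takes the specification at its word: it does not keep the original prefix, and on output it invents one of its own (ns0 in current CPython releases). The qualified name inside the attribute value is just a string to the parser, so it is left behind.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a QName appears both as the element name
# and inside an attribute value.
doc = '<foo:message xmlns:foo="urn:example:foo" status="foo:severe"/>'

root = ET.fromstring(doc)

# The element name is resolved through the namespace declaration...
print(root.tag)            # {urn:example:foo}message
# ...but the attribute value is opaque text and keeps its prefix.
print(root.get("status"))  # foo:severe

# Round-trip the document; the serializer chooses its own prefix.
out = ET.tostring(root, encoding="unicode")
print(out)
```

After the round trip the element carries a generated prefix, while the status attribute still says foo:severe, a prefix that is no longer declared anywhere in the document.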
Conclusion
For XML, it may already be too late to remove namespace URIs. While the XML specification itself does not depend on them, enough implementations already do so that it may be impractical to effect a change. However, there are still steps that can be taken when designing XML vocabularies to minimize the problems that namespace URIs cause:
- Carefully consider whether namespaces are really necessary. Many XML vocabularies don't need them, so don't feel compelled to use them without good reason.
- If namespaces are necessary, choose namespace URIs that are concise and easy to remember. It helps if they are all lowercase and don't include unnecessary information.
- Try to get by with only one namespace if you can. There is not much to gain by multiplying namespaces unnecessarily except trouble and complexity. If you must use more than one namespace, at least ensure that the namespace URIs follow a consistent pattern.
- Agree on standard namespace prefixes for your XML vocabularies; they will help people to read and write your XML without confusion. If you find yourself using the default namespace rather than the prefixes, consider whether you actually need a namespace at all.
Following these steps will help to keep namespace URIs under control in your XML documents.