XML Namespaces 1.1
April 10, 2002
Namespaces have probably generated more debate and confusion than any other W3C Recommendation. The first publication of a Working Draft for Namespaces 1.1 has caused a great surge of discussion on XML-DEV, and it seems the next iteration of the specification won't be any less controversial.
The New Draft
A look through the color-coded changes in the new Namespaces in XML 1.1 Working Draft quickly shows that the document makes minimal changes to the original. The accompanying requirements document shows that this is entirely intentional. Barring the addition of some minor errata, Namespaces 1.1 will incorporate only a single extra feature: the ability to undeclare a namespace prefix.
Putting the changes in context, the requirements also explain that the new specification is to be progressed to Recommendation alongside XML 1.1. Richard Tobin clarified the relationship between the two specifications in an XML-DEV posting.
It's intended that Namespaces 1.1 will be strongly tied to XML 1.1, so that it will only apply to XML 1.1 documents. XML 1.0-only parsers will reject documents labeled 1.1 anyway, and Namespaces-1.1-aware processors will apply Namespaces 1.0 to 1.0 documents and Namespaces 1.1 to 1.1 documents.
As Ronald Bourret's XML Namespaces FAQ explains, while it's possible to undeclare the default namespace, it's not possible to undeclare a namespace associated with a prefix. According to the Namespaces 1.1 requirements document, this is a significant problem that is affecting several other W3C specifications, including XQuery, SOAP, and XInclude.
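To make the asymmetry concrete, here is a minimal sketch in Python. It assumes the standard library's expat-based ElementTree parser, which implements Namespaces 1.0 and does not process XML 1.1; the two sample documents and the helper name is_accepted are invented for illustration. The second document uses an empty value on a prefixed declaration, which is the form Namespaces 1.1 uses to undeclare a prefix.

```python
import xml.etree.ElementTree as ET

# Undeclaring the default namespace: legal under Namespaces 1.0.
DEFAULT_UNDECLARED = '<doc xmlns="urn:example:a"><child xmlns=""/></doc>'

# Undeclaring a prefix: not legal under Namespaces 1.0.
PREFIX_UNDECLARED = '<doc xmlns:p="urn:example:a"><child xmlns:p=""/></doc>'


def is_accepted(xml_text):
    """Return True if the namespace-aware parser accepts the document."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print("default undeclared:", is_accepted(DEFAULT_UNDECLARED))
    print("prefix undeclared: ", is_accepted(PREFIX_UNDECLARED))
```

A Namespaces 1.0 processor should accept the first document and reject the second. A Namespaces 1.1 processor would accept the second form too, but only in a document labeled as XML 1.1.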
The precise problem is that, according to the XML Infoset, each element has a number of 'namespace information items', one for each of the namespace declarations currently in scope for that element. An in-scope namespace is any namespace declared on the element itself or on one of its ancestors. This is problematic because each element ends up carrying around additional 'baggage' in the form of namespace declarations that it doesn't need.
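The 'baggage' is easy to see with a few lines of code. The following is a rough sketch, not an Infoset implementation: it uses Python's standard ElementTree pull events to accumulate the declarations in scope for each element. The function name in_scope_namespaces and the sample document are assumptions made for the example, and the implicit xml prefix is left out.

```python
import io
import xml.etree.ElementTree as ET


def in_scope_namespaces(xml_text):
    """Return one (tag, {prefix: uri}) pair per element, in document order."""
    result = []
    frames = []    # one dict of declarations per currently open element
    pending = {}   # declarations reported since the previous start tag
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start", "end")):
        if event == "start-ns":
            # Reported just before the start event of the declaring element.
            prefix, uri = payload
            pending[prefix] = uri
        elif event == "start":
            frames.append(pending)
            pending = {}
            scope = {}
            for frame in frames:   # outermost first, so inner declarations win
                scope.update(frame)
            result.append((payload.tag, scope))
        else:  # "end": the element's own declarations go out of scope
            frames.pop()
    return result


SAMPLE = (
    '<doc xmlns:a="urn:example:a" xmlns:b="urn:example:b">'
    "<a:item><leaf/></a:item>"
    "</doc>"
)

if __name__ == "__main__":
    for tag, scope in in_scope_namespaces(SAMPLE):
        print(tag, scope)
```

In the sample, the leaf element uses neither prefix, yet both declarations are in scope for it, so it carries a namespace information item for each.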
In the case of large documents, e.g. as might be generated by an XQuery on an XML database, these properties could add a large amount of overhead to an Infoset, particularly for deeply hierarchical data. The problem also surfaces in XInclude, which allows fragments of other documents to be included into a single master. These included elements may have a smaller set of in-scope namespaces in the original document than in their new context. Accurately serializing this data is problematic because this Infoset cannot be round-tripped to XML without introducing additional properties in the reparsed Infoset that weren't part of the original.
So this new ability, while perhaps a logical extension of the current facility, is largely driven by a knock-on effect of how the Infoset has been designed. The ability to undeclare a namespace doesn't really furnish the document author with any additional capabilities: she can already associate a prefix with a new URI, if desired. A possible unfortunate side-effect of the change may be that round-tripping an XML document might introduce lexical changes -- additional attributes to undeclare namespaces -- just to preserve the data model.
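As a sketch of where those extra attributes would come from, consider a serializer that knows the scope an element had in its original document and the scope it inherits in its new context. The helper names below (xmlns_name, fixup_attributes) are hypothetical, and the empty-valued prefixed declarations it emits would only be legal under Namespaces 1.1.

```python
def xmlns_name(prefix):
    """Attribute name that declares `prefix` ('' means the default namespace)."""
    return "xmlns:" + prefix if prefix else "xmlns"


def fixup_attributes(original_scope, inherited_scope):
    """Return the declarations needed to restore an element's original scope.

    Both arguments map prefixes to namespace URIs. An empty attribute value
    in the result undeclares a binding the element never had.
    """
    attrs = {}
    # Re-declare anything the new context lacks or binds differently.
    for prefix, uri in original_scope.items():
        if inherited_scope.get(prefix) != uri:
            attrs[xmlns_name(prefix)] = uri
    # Undeclare anything the new context adds.
    for prefix in inherited_scope:
        if prefix not in original_scope:
            attrs[xmlns_name(prefix)] = ""
    return attrs


if __name__ == "__main__":
    original = {"x": "urn:example:x"}
    inherited = {"m": "urn:example:master", "x": "urn:example:other"}
    print(fixup_attributes(original, inherited))
```

Here the included element would gain an undeclaration for the m prefix and a fresh declaration for x, neither of which appeared in the source text.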
Trying to understand why the Infoset was designed this way, I asked XML-DEV why an element needs to have namespace information items, particularly as the Infoset also associates a namespace and prefix with each element information item.
Lifting Stones
Several people offered reasons, including Richard Tobin (co-editor of the Infoset specification), who explained that namespace information items aren't required by the parser; they're required to properly interpret QNames (namespace-qualified names) in element and attribute content. Mike Kay also illustrated how problems could occur when moving elements between Infosets if this information isn't available. Kay also acknowledged that allowing QNames in element and attribute values may have been a mistake, but noted that it's one that's too late to change:
...I hear lots of people saying we shouldn't have allowed this, and I'm inclined to agree. But we do allow it, and it works today, and people are taking advantage of the fact that it works today. And I have yet to see a proposal that cleans up the namespace model without breaking applications that work today according to the current specs.
Regular XML-Deviant readers will recall that Kendall Clark summarized a recent debate on this topic in "The Value of Names in Attributes".
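The dependency Tobin and Kay describe can be sketched in a few lines. A parser sees a value such as xs:string as plain text; only the application knows it is a QName, and it needs the element's in-scope namespaces to expand it. The function resolve_qname below is a hypothetical helper, and its treatment of unprefixed names (falling back to the default namespace) is an assumption, since specifications differ on that point.

```python
def resolve_qname(value, scope):
    """Expand a QName found in content into a (namespace URI, local name) pair.

    `scope` maps prefixes to URIs for the element that holds the value;
    the empty string stands for the default namespace.
    """
    prefix, colon, local = value.partition(":")
    if not colon:
        # Assumption: unprefixed names take the default namespace, if any.
        return (scope.get(""), value)
    if prefix not in scope:
        raise LookupError("prefix %r is not in scope" % prefix)
    return (scope[prefix], local)


# Scope of the element in its source document.
SOURCE_SCOPE = {"xs": "http://www.w3.org/2001/XMLSchema"}

# Scope after the element is moved somewhere that lacks the xs binding.
TARGET_SCOPE = {}

if __name__ == "__main__":
    print(resolve_qname("xs:string", SOURCE_SCOPE))
    try:
        resolve_qname("xs:string", TARGET_SCOPE)
    except LookupError as error:
        print("after the move:", error)
```

Once the element is moved without its namespace information, the value can no longer be interpreted, even though no element or attribute name in the markup uses the prefix.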
Understanding why the Infoset requires this information to be available, we could perhaps boldly restate the requirement to produce XML Namespaces 1.1 as: "tidy up the knock-on effects of allowing QNames in element content and attribute values". Looking at it this way, I can't help wondering whether we are going to see other ripple effects as the impact of other design decisions is felt.
Scoping Questions
Mike Kay went a step further and suggested that the ultimate cause of these issues is not recognizing the impact of Namespaces at the architectural level.
The whole problem is caused by doing namespaces as a cheap and cheerful bolt-on to XML without recognizing that it was an architectural change to the foundations. Perhaps it is too late to fix now, but one reason I want to see namespaces move into the core is that I'm fed up with attempts to patch them with sticky-plaster.
This echoes the comments of many of the XML-DEV members during this discussion. A particular irritant has been the stated requirement that the revisions "must be prepared quickly". Given the limited scope of the changes, Elliotte Harold urged the XML Working Group to "take the time to do it right or don't do it at all". Commenting on W3C process in general, Harold also noted that "there's been far too much emphasis on pushing something out of the door instead of pushing the right thing out of the door."
Michael Brennan was of a similar mind, pitching the argument that changes should not only be carefully metered but must also justify themselves by bringing some well-defined benefits.
XML standards define a contract with the entire industry. The impact of rapid, incremental changes to core standards would be disastrous. It would quickly become unmanageable for tool vendors and lead to a proliferation of interoperability problems. Changes to core standards must be done judiciously. And I agree with those who argue that such changes should not be minimalist and should not be done under arbitrary constraints that the changes be "done quickly". If you are going to make change, make it worthwhile to the industry to absorb the impact of that change by taking the time to address the problems in current specs.
"The tiniest tip of the namespaces iceberg", was how Mike Champion chose to describe the proposed revisions. Champion cataloged a list of issues that affect the alternate namespace models present in several different W3C specifications, noting that
[t]here's no way to sort this out without breaking lots of code and making people unhappy, but unless XML disentangles itself from this mess, its forward progress will be significantly hindered.
One proposal floated during the discussion was to combine the XML and Namespaces specifications into a single core document. Opinions were divided on whether this was a good suggestion. Some see XML and Namespaces as the real foundation. Others wanted the continued freedom to ignore Namespaces or perhaps adopt alternatives. Tim Bray's Skunkworks XML 2.0 was cited as an example of how this combination might be achieved, although this raised some additional concerns, as Bray's document also removes DTDs from the core specification. This led to a subsequent discussion on how DTDs could be updated to better support XML Namespaces. The entire thread is too lengthy to summarize here, but it's interesting to note that the ISO DSDL project has a section devoted to "Namespace-aware processing with DTD syntax" which may yet see this work happen outside of the W3C.
In the short term, and despite loud comments from the community, the scope of changes for XML 1.1 and XML Namespaces 1.1 won't be altered beyond their currently stated requirements, as Richard Tobin made absolutely clear. Whether a more substantial reworking to further rationalize these and other specifications will happen is anyone's guess.