XQuery Questioned
January 2, 2002
Examining the discussion following the publication of several new Working Drafts, the XML-Deviant discovers that the plans of the XQuery Working Group are not meeting developer expectations.
Reading Material
XML developers haven't been short on reading material over the holiday season with the publication of a sheaf of new W3C Working Drafts. The XQuery and XSL Working Groups were the most prolific. Updated versions of the XQuery use cases, the XQuery draft itself, its data model, and functions and operators have all been released. The latter two specifications are also applicable to XPath 2.0, whose initial draft has just been published.
Since XPath 2.0 plays a central role in both efforts, the XQuery and XSL Working Groups are collaborating on its production. The XPath 2.0 draft evidently marks a milestone for the XSL Working Group, as they have also published the first draft of XSLT 2.0. XPath 2.0 is now a subset of XQuery, as the introduction to that draft explains:
XQuery Version 1.0 contains XPath Version 2.0 as a subset. Any expression that is syntactically valid and executes successfully in both XPath 2.0 and XQuery 1.0 will return the same result in both languages. Since these languages are so closely related, their grammars and language descriptions are generated from a common source to ensure consistency, and the editors of these specifications work together closely.
The XPath 2.0 specification also notes that the two specifications share "much of the same expression syntax and semantics, and much of the text found in the two Working Drafts is identical".
Dependencies
The strong dependencies and substantive overlaps between these specifications were the topic of the last debate on XQuery, when the community discussed whether XQuery was reinventing the wheel. The new drafts prompted similar comments on XML-DEV recently. Paul Tchistopolskii once more suggested a refactoring, particularly for XPath, while Christian Nentwich said that the tangle of dependencies makes the content inaccessible.
...how is anyone supposed to teach this stuff to people? It was hard enough to teach XPath 1.0 to non-programmers, but this almost looks like a programming language now.
I find it quite unbelievable how badly the W3C is managing the xquery, xslt and xpath evolution. If someone handed you a piece of code that separates its concerns so badly, you would send them back to refactor it.
For an illustration of Nentwich's comments one need only consider the relationships that XPath 1.0 and 2.0 have with some other specifications: as a subset of XQuery, a basis for XPointer (plus extensions), used in XSLT (plus extensions), and a stream-oriented subset within W3C XML Schema (XSD). Not to mention that XPath 2.0 is itself dependent on XSD. It should be unsurprising, then, that the nagging suspicion that there is a simple XPath core, yet to be clearly delineated, is hard to shake.
Addressing some of these criticisms, Michael Rys contrasted XPath with XQuery and XQuery with XSLT. Michael Kay also offered a summary of the differences between XQuery and XPath 2.0, arguing that these were the minimum necessary:
To sum it up rather briefly, I think XQuery 1.0 is essentially XPath 2.0 plus
1. element and attribute constructors
2. function definitions
3. strong typing
Of course (1) is available in a different form in XSLT 1.0, and (2) is available in XSLT 2.0, so you could say that apart from syntax, XQuery is XSLT plus strong typing minus template rules. Those might seem small differences, but I happen to agree with those who believe that the addition of strong typing and the absence of template rules are both very important when it comes to optimizing a query to execute against a large XML database with pre-defined indexes. So I think we're getting closer to the point where the differences between XSLT and XQuery are the minimum differences required by their different areas of application.
Kay also noted that while he hoped there could be greater convergence at the syntactic level between XSLT and XQuery, this was unlikely on both technical and social grounds; people have "strong feelings" about syntax.
Lack of Updates
The central theme of the recent debate has been the features (or lack thereof) that XQuery 1.0 will provide. In the first version, XQuery will only support querying of a repository; there will be no functionality for inserting, updating, or deleting data. A number of XML-DEV members said that this falls short of the minimum functionality required of a query language. While XQuery will provide a replacement for XPath-based querying, it doesn't include the other operations required for basic manipulation of database data.
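To make the gap concrete, here is a minimal sketch of the same split between querying and updating. It uses Python's standard ElementTree module and its limited XPath support, not XQuery, and the sample document and variable names are hypothetical. The path expressions can only select nodes; every insert, update, and delete has to go through the host language's own API, which is roughly the position a query-only XQuery 1.0 would leave database users in.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, not taken from the XQuery use cases.
doc = ET.fromstring(
    "<library>"
    "<book year='2001'><title>XPath Basics</title></book>"
    "<book year='1999'><title>XML in Brief</title></book>"
    "</library>"
)

# Query: a path expression selects nodes and leaves the document unchanged.
titles_2001 = [b.findtext("title") for b in doc.findall("./book[@year='2001']")]

# Insert, update, delete: performed through the host API, because the
# path language itself has no syntax for changing data.
new_book = ET.SubElement(doc, "book", {"year": "2002"})      # insert
ET.SubElement(new_book, "title").text = "XQuery Drafts"
doc.find("./book[@year='1999']").set("year", "2000")         # update
doc.remove(doc.find("./book[@year='2001']"))                 # delete

remaining_years = [b.get("year") for b in doc.findall("book")]
```

Each vendor that adds its own update syntax is, in effect, choosing a different replacement for the second half of this sketch.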
During the discussion, Michael Rys noted that the Working Group does plan to add the additional features, but these will not be part of XQuery 1.0 to avoid delaying the initial release. This was later confirmed by Jonathan Robie who explained that members of the Working Group had been working on a proposal for update syntax and semantics, and that versions had already been prototyped by some vendors. This proposal was presented by Robie at XML 2001, and an implementation is included as part of the Microsoft XQuery demonstrator. Robie continued by asserting that while he personally saw updates as a high priority, stability was an important factor determining release schedules:
Whatever we release in XQuery 1.0, it has to be as solid as possible. That means leaving out features that we don't think we have given adequate review or for which there are not enough implementations. If you look carefully at the entire set of XQuery 1.0 specifications, you will see that there is a *lot* of technical content, and we have to make sure that it is well-specified and consistent across the specifications. Coordination between XQuery and XPath also takes time.
For me personally, updates are an extremely high priority. I am concerned about the likelihood that several similar implementations may hit the market before there is a standard for updates. But I am also very concerned that XQuery 1.0 be released relatively soon.
So would you still want to see updates in XQuery 1.0 if it meant releasing XQuery six months later?
Robie's point about the appearance of non-standard update extensions was echoed elsewhere. Soumitra Sengupta observed that it could cause "a lot of confusion and headache for users". Experience suggests that this could become a very real problem; indeed, even the appearance of partially standardized functionality can cause headaches. Might early implementations of variations of an update proposal cause problems in the XML database community similar to those that Microsoft XSL caused for XSLT users?
Ensuring stability does suggest the avoidance of premature standardization, yet the best review always comes from implementation. There are a growing number of XQuery implementations: Robie posted a list of twelve later in the discussion, two of which already support updates in some form. As there seems to be a lot of desire to see XQuery implemented, this energy could be harnessed to ensure a thorough review of all features, including updates, to the benefit of all.
While Robie preferred an interim release if it would take six months to specify the full set of features, not everyone thought that this was too long to wait. Some, including Dare Obasanjo, wondered why there is such a rush.
Why is the quick release of XQuery so important? Are there really that many businesses that stand to lose that much money if they have to use XPath for a few more months instead of jumping to ... XQuery?
I keep hearing about how important it is that XQuery is released quickly but have yet to see the justification for rushing out the spec with significant functionality missing.
Jonathan Robie responded that XQuery as a whole had hardly been rushed. He also indicated that W3C resourcing and pressure from vendors were factors influencing an early release.
The XQuery WG was chartered in September of 1999. I would not necessarily say that 2 1/2 years is "quick" in Internet time, though it does, of course, take time to develop things right. Since companies and universities are funding the manpower to develop the specification, we do need to be able to finish specifications in a time frame considered reasonable by the people who are volunteering to pay the people to do the work.
...I think that a number of database companies think they are likely to lose money if they do not have XQuery in their upcoming product cycle.
A cynical view might suggest that a six-month interim release also fits nicely into a product release schedule.
However, Robie did suggest that attempts to lobby the W3C to include updates in XQuery 1.0 may be well received.
I would say that keeping pressure on the W3C to have updates in XQuery is the right way to go. I think that the W3C is likely to be responsive, and there are certainly vendors on the WG that want to make this happen.
Developers are invited to direct their comments to the www-xml-query-comments@w3.org mailing list. Significant lobbying from the community seems to be the only way to ensure that XQuery is delivered with anything like the data manipulation features available in SQL.