An Introduction to the XML:DB API

January 9, 2002

In my last article, Introduction to dbXML, I provided an example that used the XML:DB API to access the dbXML server. This time around we'll take a more detailed look at the XML:DB API in order to get a better feel for what the API is about and how it can help you build applications for native XML databases (NXD).

The Proliferation of Native XML Databases

Currently, there are about 20 different native XML databases on the market. Among them are commercial products such as Tamino, X-Hive, and Excelon, as well as open source NXDs such as dbXML (now renamed Apache Xindice), eXist, and Ozone/XML. While this selection is a nice thing to see in an emerging market, it makes developing applications quite a bit more difficult. Each NXD defines its own API, which prevents the development of software that will work with more than one NXD without coding for each specific server. If you've worked with relational databases, then you've likely worked with ODBC or JDBC to abstract away from proprietary relational database APIs. The goal of the XML:DB API is to bring similar functionality to native XML databases.

Status

The XML:DB API project was started a little over a year ago by the XML:DB Initiative and is currently still evolving. Most of the core framework is stable, and it has already been implemented by dbXML/Xindice and eXist. There's also a reference implementation in Java available, and there are several other implementations in progress, including some for commercial databases. The latest information on implementations can be found on the XML:DB API project site.

Basic Concepts

While the XML:DB API is simple to use once you're familiar with it, there are some introductory terms and concepts that we need to discuss first.

Drivers

Each database that supports the XML:DB API must provide a database-specific driver that encapsulates all the database access logic. Drivers are implementations of the Database interface and are managed by the DatabaseManager. If you're familiar with JDBC, ODBC, or SAX, the driver concept will already be familiar; it doesn't differ much in the XML:DB API.
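
To make the driver idea concrete, here is a minimal sketch of the registration pattern in plain Java. The names SketchDriver and SketchDriverManager are hypothetical stand-ins invented for this illustration; they are not the real Database and DatabaseManager interfaces, and the URI parsing simply assumes the xmldb:<driver-name>:<path> shape used by the prefixes later in this article.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a driver; NOT the real XML:DB Database interface.
interface SketchDriver {
    String getName();                    // the driver name that appears in the URI
    String openCollection(String path);  // returns a description instead of a real collection
}

// Hypothetical stand-in for a driver manager that dispatches on the URI.
class SketchDriverManager {
    private static final Map<String, SketchDriver> DRIVERS = new LinkedHashMap<>();

    static void register(SketchDriver driver) {
        DRIVERS.put(driver.getName(), driver);
    }

    // Assumes URIs of the form xmldb:<driver-name>:<path>.
    static String getCollection(String uri) {
        String[] parts = uri.split(":", 3);
        if (parts.length != 3 || !parts[0].equals("xmldb")) {
            throw new IllegalArgumentException("Not an xmldb URI: " + uri);
        }
        SketchDriver driver = DRIVERS.get(parts[1]);
        if (driver == null) {
            throw new IllegalStateException("No driver registered for: " + parts[1]);
        }
        return driver.openCollection(parts[2]);
    }
}

class DriverRegistrySketch {
    public static void main(String[] args) {
        // Register one driver, then let the manager pick it based on the URI.
        SketchDriverManager.register(new SketchDriver() {
            public String getName() { return "ref"; }
            public String openCollection(String path) { return "ref collection at " + path; }
        });
        System.out.println(SketchDriverManager.getCollection("xmldb:ref:///addresses"));
    }
}
```

The point of the pattern is that application code only ever talks to the manager; swapping databases means registering a different driver and changing the URI.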

Collections

In native XML databases collections are the containers in which XML documents are stored. Compared to a relational database, a collection is roughly equivalent to a table. The XML:DB API makes extensive use of the collection concept and assumes that any implementing database has at least one collection where documents are stored. Collections are represented in the API by the Collection interface.

Services

The XML:DB API is designed to be very flexible and extensible, and it achieves this through services. In fact, you can't do a whole lot of useful work with the API without services. The most widely used example of a service is the XPathQueryService. As its name implies, this service enables execution of XPath queries against the database. The API specification defines several services, including XPathQueryService, XUpdateQueryService, and CollectionManagementService. In the future, services will be added for W3C XQuery and other specifications as needed. The service mechanism is also open for vendors to add custom services, as long as it is clear that those services are not portable.
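
The following sketch shows the general shape of a name-and-version service lookup in plain Java. SketchService and SketchCollection are hypothetical types made up for this illustration, not the real XML:DB interfaces, and returning null for an unknown service is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a service; NOT the real XML:DB Service interface.
interface SketchService {
    String getName();
    String getVersion();
}

// Hypothetical stand-in for a collection that hands out services by name and version.
class SketchCollection {
    private final Map<String, SketchService> services = new HashMap<>();

    void addService(SketchService service) {
        services.put(service.getName() + "/" + service.getVersion(), service);
    }

    // Returns null when no matching service is registered (an assumption of this sketch).
    SketchService getService(String name, String version) {
        return services.get(name + "/" + version);
    }
}

class ServiceLookupSketch {
    // Test for a service before using it, so optional or vendor services fail gracefully.
    static String describe(SketchCollection col, String name, String version) {
        SketchService service = col.getService(name, version);
        return (service == null)
            ? name + " " + version + " is not available"
            : "using " + service.getName() + " " + service.getVersion();
    }

    public static void main(String[] args) {
        SketchCollection col = new SketchCollection();
        col.addService(new SketchService() {
            public String getName() { return "XPathQueryService"; }
            public String getVersion() { return "1.0"; }
        });
        System.out.println(describe(col, "XPathQueryService", "1.0"));
        System.out.println(describe(col, "XUpdateQueryService", "1.0"));
    }
}
```

Looking services up by name keeps the core interfaces small: a driver that lacks a given service can simply decline the request, and the application decides what to do next.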

Resource Abstraction

Since there are several common ways of working with XML data, the XML:DB API defines an abstraction for the content stored in the database. This abstraction is encapsulated in the generic Resource interface. By specializing Resource it's possible to support other types of data beyond XML, for example, binary data. For XML the XMLResource specialization is provided and allows you to easily access and update the underlying XML data as either textual XML, a W3C DOM, or a SAX event stream.
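
To show what "text, DOM, or SAX" means in practice, the sketch below takes one XML string and looks at it all three ways using only the JAXP classes that ship with the JDK. It does not use XMLResource itself; the class name XmlViewsSketch and the sample address document are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class XmlViewsSketch {
    // DOM view: parse the text into a tree and read the root element name.
    static String rootElementName(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML as DOM", e);
        }
    }

    // SAX view: stream the same text through a handler and count start-element events.
    static int countElements(String xml) {
        final int[] count = {0};
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes attrs) {
                        count[0]++;
                    }
                });
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML as SAX", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        // Hypothetical sample document.
        String xml = "<address id=\"1\"><name>Jane</name><city>Phoenix</city></address>";
        System.out.println(xml);                   // text view
        System.out.println(rootElementName(xml));  // DOM view
        System.out.println(countElements(xml));    // SAX view
    }
}
```

An API that offers all three views lets the application choose whichever one suits the task without converting between them by hand.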

API Core Levels

Because the XML:DB API is designed to be modular, it's necessary to group features together to form baselines for application developers to work against. These baselines are called core levels, and there are currently two defined in the API specification. Core Level 0 is the base API that all drivers must implement. It includes the basic interfaces for collections, resources, and services. Core Level 1 extends Core Level 0 to include the XPathQueryService. The idea is that an application will require a particular Core Level of driver support. If it wants to use API elements beyond that level, it should test for them before using them.
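
One way an application might check a driver's core level is sketched below. It assumes the driver reports its level as a numeric string such as "0" or "1", and it treats a higher level as satisfying a lower requirement; both are assumptions of this sketch, and CoreLevelCheck is a hypothetical helper, not part of the API.

```java
class CoreLevelCheck {
    // Returns true when the reported level parses as an integer >= the required level.
    // Assumes levels are numeric strings; anything else is treated as unsupported.
    static boolean supportsCoreLevel(String reported, int required) {
        if (reported == null) {
            return false;
        }
        try {
            return Integer.parseInt(reported.trim()) >= required;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(supportsCoreLevel("1", 1));
        System.out.println(supportsCoreLevel("0", 1));
    }
}
```

A numeric comparison like this would keep working if a later core level were introduced, whereas an exact string match accepts only the one level it names.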

Putting it all Together

Let's examine an example program that illustrates how everything fits together. Since this is an introductory article we need to keep things very simple. Our example won't do anything useful, but it will put all the concepts to work. If you read my article on dbXML, this program is a more generic version of the example included there. You should refer to that article for more information on setting up a dbXML repository to work with this program.


import org.xmldb.api.base.*;
import org.xmldb.api.modules.*;
import org.xmldb.api.*;

public class Query {
   public static void main(String[] args) throws Exception {
      Collection col = null;
      try {
         String driver = null;
         String prefix = null;
         if ( ( args.length == 1 ) && args[0].equals("dbxml") ) {
            driver = "org.dbxml.client.xmldb.DatabaseImpl";
            prefix = "xmldb:dbxml:///db/";
         }
         else {
            driver = "org.xmldb.api.reference.DatabaseImpl";
            prefix = "xmldb:ref:///";
         }

         Class c = Class.forName(driver);

         Database database = (Database) c.newInstance();
         if ( ! database.getConformanceLevel().equals("1") ) {
            System.out.println("This program requires a Core Level 1 XML:DB " +
               "API driver");
            System.exit(1);
         }

         DatabaseManager.registerDatabase(database);
         col =
            DatabaseManager.getCollection(prefix + "addresses");

         String xpath = "/address[@id = 1]";
         XPathQueryService service =
            (XPathQueryService) col.getService("XPathQueryService", "1.0");
         ResourceSet resultSet = service.query(xpath);
         ResourceIterator results = resultSet.getIterator();

         while (results.hasMoreResources()) {
            Resource res = results.nextResource();
            System.out.println((String) res.getContent());
         }
      }
      catch (XMLDBException e) {
         System.err.println("XML:DB Exception occurred " + e.errorCode + " " +
            e.getMessage());
      }
      finally {
         if (col != null) {
            col.close();
         }
      }
   }
}

Let's look at the various parts and pieces. For this particular program we've hard-coded driver configurations for dbXML and the XML:DB API reference implementation. Two pieces of information are required here: the name of the driver implementation class and the driver-specific URI prefix. In a real program you would probably want to read these values from a configuration file. In this section of code we can also see a check to ensure that the driver supports Core Level 1. The way the check is coded will work with current drivers, but this is an area where there are likely to be changes to the API in the future.
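
As a rough illustration of the configuration-file approach, the sketch below reads the driver class name and URI prefix with java.util.Properties. The property keys xmldb.driver and xmldb.prefix and the class name DriverConfigSketch are hypothetical choices for this example; the fallback values are the reference implementation settings used in this article.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class DriverConfigSketch {
    // Defaults taken from the reference implementation settings in this article.
    static final String DEFAULT_DRIVER = "org.xmldb.api.reference.DatabaseImpl";
    static final String DEFAULT_PREFIX = "xmldb:ref:///";

    // Loads key=value pairs; in a real program the Reader would wrap a file.
    static Properties load(Reader in) {
        Properties props = new Properties();
        try {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // "xmldb.driver" and "xmldb.prefix" are hypothetical key names.
    static String driver(Properties props) {
        return props.getProperty("xmldb.driver", DEFAULT_DRIVER);
    }

    static String prefix(Properties props) {
        return props.getProperty("xmldb.prefix", DEFAULT_PREFIX);
    }

    public static void main(String[] args) {
        String config =
            "xmldb.driver=org.dbxml.client.xmldb.DatabaseImpl\n" +
            "xmldb.prefix=xmldb:dbxml:///db/\n";
        Properties props = load(new StringReader(config));
        System.out.println(driver(props));
        System.out.println(prefix(props));
    }
}
```

With the values externalized this way, pointing the program at a different database becomes a configuration change rather than a code change.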


   String driver = null;
   String prefix = null;
   if ( ( args.length == 1 ) && args[0].equals("dbxml") ) {
      driver = "org.dbxml.client.xmldb.DatabaseImpl";
      prefix = "xmldb:dbxml:///db/";
   }
   else {
      driver = "org.xmldb.api.reference.DatabaseImpl";
      prefix = "xmldb:ref:///";
   }

   Class c = Class.forName(driver);

   Database database = (Database) c.newInstance();
   if ( ! database.getConformanceLevel().equals("1") ) {
      System.out.println("This program requires a Core Level 1 XML:DB " +
         "API driver");
      System.exit(1);
   }
   DatabaseManager.registerDatabase(database);

Now that the driver is configured for our chosen database, we need to request a Collection instance. The parameter to getCollection() is the fully qualified URI as defined by the specific driver. In this particular case the collection we want is named addresses, so we simply append that name to the driver-specific prefix we defined earlier.


   col =
      DatabaseManager.getCollection(prefix + "addresses");

Now we want to run a query against the retrieved collection so we need to get an XPathQueryService implementation from that collection. Other API services are accessed in a similar manner.


   XPathQueryService service =
      (XPathQueryService) col.getService("XPathQueryService", "1.0");

Each service defines a custom interface for the operations that it performs. The XPathQueryService, for example, defines a query method that returns a ResourceSet containing the query results.


   ResourceSet resultSet = service.query(xpath);

We just want to print out the results, so we iterate through the result set using a ResourceIterator. Each query result is encapsulated as an XMLResource, and the simplest way to print the XML content is to just retrieve it as text. For this we call getContent(). However, if we wanted to get the XML as a DOM tree we could call getContentAsDOM(), or we could set up a SAX ContentHandler implementation and call getContentAsSAX() to retrieve it as a SAX event stream. This is one of the nicest features of the API.


   ResourceIterator results = resultSet.getIterator();

   while (results.hasMoreResources()) {
      Resource res = results.nextResource();
      System.out.println((String) res.getContent());
   }

What's left is housekeeping to make sure we close our Collection instances. For some drivers this is critical, while for others it doesn't matter. It's always good practice to ensure that any resources being used by the server are released.


   if (col != null) {
      col.close();
   }

Learning More

• The XML:DB Initiative
• The XML:DB API Project
• The dbXML Project
• The eXist Project
• W3C XPath

Obviously there is much more to the XML:DB API than what's illustrated in this simple example and short article, but you should now have a better idea of what the API is and how it is used. If you want to find out more, take a look at the XML:DB API site and the dbXML developers guide. The eXist documentation also contains some information about developing with the API.

While there is still a lot of work to do on the XML:DB API, what is available today is already usable and provides a solid framework to build on. In fact, projects like Apache Xindice are using the XML:DB API as the primary Java API for accessing the server. Participating in API development is open to anyone who's interested; feel free to join the project mailing list and contribute to the development of the XML:DB API.