Examining WSDL

May 15, 2002

Unlike today's Web, web services can be viewed as a set of programs interacting across a network with no explicit human interaction involved during the transaction. In order for programs to exchange data, it's necessary to define strictly the communications protocol, the data transfer syntax, and the location of the endpoint. For building large, complex systems, such service definitions must be done in a rigorous manner: ideally, in a machine-readable language with well-defined semantics, as opposed to parochial and imprecise natural languages.

It is possible to write service definitions in English; XML-RPC and the various weblogging interfaces are notable examples. But XML-RPC is a very simple system, by design, with a relatively small set of features; it's ill-suited to the task of building large-scale or enterprise applications. For example, you can't use XML-RPC to send an arbitrary XML document from one system to another without converting it to a base64-encoded string.

Almost all distributed systems have a language for describing interfaces. These languages were often C- or Pascal-like, and often named similarly: "IDL" in DCE and CORBA, "MIDL" in Microsoft's COM and DCOM. The idea is that once the interface is rigorously defined, tools can parse the IDL and generate code stubs, thus automating some of the grungier parts of distributed programming.

The web services distributed programming model has an IDL, too; and as you can probably guess, it's the Web Services Description Language, WSDL. It's pronounced by spelling out the letters or saying ``whizz-dell,'' which nearly rhymes with ``diesel.''

WSDL derives from two earlier efforts by a number of companies; the current de facto standard is a W3C Note submitted by IBM and Microsoft. There's a web services description working group, which is creating the next version of the note for eventual delivery as a W3C standard. So far the group has published a requirements document and some usage scenarios. One reason to like the requirements document is that it renames some of WSDL's more confusing terms.

I find WSDL to be a frustrating mixture of verbosity -- most messages are essentially described three times -- and curious, supposedly helpful defaults, such as omitting the name of a message in an operation. I'll use the now much-discussed Google WSDL to point some of these out.

But, first, let's look at the state of web services programming and IDLs. In the classic IDL world, the definitions were processed by an IDL compiler to generate stubs for clients, which look like local function calls, and dispatch routines for the server that invoke the developer's code. When new applications were developed, the interfaces were designed from scratch, and all the benefits of ``contract'' programming were possible, including clean and regular definition of function semantics.

But back in the real world, these distributed systems usually had to interact with existing systems. Often the project involved ``remoting'' an existing application by putting an RPC interface on an existing service. In cases like this, the IDL files could resemble compiler torture-tests, as network-oriented interface languages were coerced into supporting legacy applications.

I mention this because it's about the stage at which WSDL and web services are today. The most widespread tools take an existing Java class or COM object and then generate a WSDL definition. This is backwards and half-assed. It's backwards because -- as we should have learned with earlier infrastructures -- the right thing to do is write the interface first. It's half-assed because while everybody's generating interfaces, nobody is capable of automatically consuming them.

So how did we get here? I can think of two ways. First, the vendors recognize that folks aren't going to throw out their existing code. Just like they put an HTTP front-end on legacy applications when the Web became popular, developers are now going to want to put a web services front-end on their existing code. The reason why we don't yet have good client development -- i.e., WSDL parsing -- is that it requires being able to turn arbitrary XML Schema definitions into useful stubs, which is hard for a couple of reasons.

First, it's not clear how to fit web services programming into existing client frameworks. If you're using SOAP RPC, then all of the classic problems of IDL-based computing come back: memory management, transient network errors, etc. If you're using SOAP to send XML documents, then new issues such as DOM and SAX support must be dealt with.

Second, WSDL prefers to use XML Schema to define the data to be transferred, and understanding XML Schema requires a significant amount of effort. As the experiences of the ``soapbuilders'' group (a mailing list of SOAP toolkit providers, working on achieving interoperability across their implementations) have shown, it can require a great deal of work just to be able to properly handle XML Schema's primitive types.

I think there's a third reason, but one that nobody will admit in public. No vendor wants to spend the enormous effort involved in developing client-side WSDL toolkits when Microsoft can practically wipe them off the desktop by providing one of their own. Yes, I realize that this ignores peer-to-peer and servers talking to subservers, but I still stand by the statement.

GoogleSearch.wsdl

It's time to examine parts of GoogleSearch.wsdl, which is part of the Google developer's kit. A WSDL description is a set of:

Type definitions, contained in the types element, which describe the data being exchanged. They can be written in any description language, although -- and I swear I'm not making this up -- "WSDL prefers" XML Schema.
Message definitions, appearing as multiple message elements. As we'll see, message definitions are where we get the first hints that WSDL exceeds the 80/20 rule of flexibility and complexity.
Operation definitions, appearing within a binding element, which confusingly defines something called a ``port.''
A service definition, contained in the service element. This defines the endpoint (URL) where the server can be found, and -- by referring to the binding, err, port, -- specifies how to communicate with it.

The Google WSDL file defines a SOAP RPC interface, which means it follows the encoding rules found in Section 5 of the SOAP 1.1 specification. I'll avoid the SOAP vs. REST discussions now, other than to mention that RPC is a familiar programming model to many developers. Conceptually, the GoogleSearchResult element resembles the following fragment of a C/C++ object:


                        
                        
bool documentFiltering;
char* searchComments;
int estimatedTotalResultsCount;
bool estimateIsExact;
ResultElementArray resultElements[];
int _numresultElements;

Note that my hypothetical Schema to C mapping required the addition of a new element to keep track of the size of the array.

More interesting is the way the interacting specs require Google to define the ResultElementArray datatype. According to the SOAP RPC encoding rules, arrays are written by generating each element inside a container. XML Schema requires the container to be declared as its own type. SOAP 1.1 requires arrays to have a defaultable attribute that declares the type and size; SOAP 1.2 rightly divides this into two separate attributes. I say ``rightly'' because XML Schema doesn't have a way to let you default an attribute value in the SOAP 1.1 style. Because of this, WSDL provides its own arrayType attribute that does provide a default.

Taking all of this together, the fairly straightforward ResultElementArray array requires the following contortions:


<xsd:complexType name="ResultElementArray">
  <xsd:complexContent>
    <xsd:restriction base="soapenc:Array">
      <xsd:attribute ref="soapenc:arrayType"
                     wsdl:arrayType="typens:ResultElement[]"/>
    </xsd:restriction>
  </xsd:complexContent>
</xsd:complexType>
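
To make the unpacking concrete, here is a minimal Python sketch (my illustration, not part of the Google kit) of what a WSDL consumer has to do just to learn the item type of this array. It uses only the standard library's ElementTree. The wrapping schema element, the binding of the typens prefix to urn:GoogleSearch, and the function name array_item_types are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"

# The complexType from the article, wrapped in a schema element so that
# the prefixes resolve. The wrapper and prefix bindings are assumptions.
SCHEMA_FRAGMENT = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
            xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
            xmlns:typens="urn:GoogleSearch"
            targetNamespace="urn:GoogleSearch">
  <xsd:complexType name="ResultElementArray">
    <xsd:complexContent>
      <xsd:restriction base="soapenc:Array">
        <xsd:attribute ref="soapenc:arrayType"
                       wsdl:arrayType="typens:ResultElement[]"/>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
"""


def array_item_types(schema_xml):
    """Map each SOAP-array complexType name to the item type it declares."""
    root = ET.fromstring(schema_xml)
    found = {}
    for ctype in root.iter(f"{{{XSD_NS}}}complexType"):
        for attr in ctype.iter(f"{{{XSD_NS}}}attribute"):
            # The item type hides in the WSDL-namespaced arrayType attribute.
            declared = attr.get(f"{{{WSDL_NS}}}arrayType")
            if declared and declared.endswith("[]"):
                found[ctype.get("name")] = declared[:-2]
    return found


print(array_item_types(SCHEMA_FRAGMENT))
```

Three namespaces and four levels of nesting have to be walked to recover one fact: the array holds ResultElement items.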

Given all of this complexity, we shouldn't be surprised that Google apparently missed the text that said the type should have been named ArrayOfResultElement.

It's also hard not to look at that fragment and despair. All that complication, just to say "we're sending an array." Unfortunately, since WSDL is caught between two other specs, there seems little else that could be done. The WSDL authors couldn't change SOAP, since they were defining a use for it, and one can only imagine the howls if they tried to modify XML Schema.

Let's now look at some message definitions. The following two definitions define a request message and its response. Because they are labeled as ``opname'' and ``opnameResponse,'' WSDL will let us default those names later on.


<message name="doGetCachedPage">
  <part name="key" type="xsd:string"/>
  <part name="url" type="xsd:string"/>
</message>

<message name="doGetCachedPageResponse">
  <part name="return" type="xsd:base64Binary"/>
</message>
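
A consumer would read these definitions roughly as follows. This is a hedged Python sketch using only the standard library; the enclosing definitions element and the helper names message_parts and signature are assumptions for illustration, and pairing a request with its opnameResponse message is simply the naming convention visible in the two messages above.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"

# The two messages from the article, wrapped in a definitions element
# (the wrapper is an assumption so the fragment parses on its own).
WSDL_FRAGMENT = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <message name="doGetCachedPage">
    <part name="key" type="xsd:string"/>
    <part name="url" type="xsd:string"/>
  </message>
  <message name="doGetCachedPageResponse">
    <part name="return" type="xsd:base64Binary"/>
  </message>
</definitions>
"""


def message_parts(wsdl_xml):
    """Return {message name: [(part name, part type), ...]}."""
    root = ET.fromstring(wsdl_xml)
    messages = {}
    for message in root.findall(f"{{{WSDL_NS}}}message"):
        parts = [(part.get("name"), part.get("type"))
                 for part in message.findall(f"{{{WSDL_NS}}}part")]
        messages[message.get("name")] = parts
    return messages


def signature(op_name, messages):
    """Render a request/response message pair as a function-like signature."""
    args = ", ".join(f"{ptype} {pname}" for pname, ptype in messages[op_name])
    returns = ", ".join(ptype for _, ptype in messages[op_name + "Response"])
    return f"{returns} {op_name}({args})"


msgs = message_parts(WSDL_FRAGMENT)
print(signature("doGetCachedPage", msgs))
```

Seen this way, the pair of messages is just a verbose spelling of a two-argument function that returns a blob.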

In the list above, I said we have our first hint about WSDL's excessive flexibility. First, a message is intended to be an abstract definition -- that specifies nothing about the bytes on the wire. As the spec concedes, however, "in some cases, the abstract definition may match the concrete representation very closely or exactly." When sending XML, the representation is exact, and it should be possible to omit messages altogether. (A Google search for "optimize the common case" finds over 300,000 hits.)

The message element also shows too much flexibility. The individual message parts can be specified in-line, they can reference a type from the types section, they can have a mix of name and type declarations, and so on. Can anyone look at the doGoogleSearch message and the GoogleSearchResult datatype, and give a good, practical rationale for the style differences?

And why don't all message elements appear in their own container?

A WSDL operation is defined as a set of message exchanges. WSDL supports two-party communication, and four operation types are defined (single incoming, single outgoing, incoming request with response, outgoing request with reply), although only the obvious two are currently supported: client sends message, client sends message and server responds.
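
In WSDL 1.1 the four types are told apart by which of the input and output elements appear inside an operation, and in what order. A rough Python sketch of that classification follows; the operation names in the sample and the function name operation_kind are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"

# Hypothetical operations, one per WSDL 1.1 transmission primitive.
SAMPLE = """
<portType xmlns="http://schemas.xmlsoap.org/wsdl/" name="SamplePort">
  <operation name="logEvent"><input/></operation>
  <operation name="doLookup"><input/><output/></operation>
  <operation name="askClient"><output/><input/></operation>
  <operation name="announce"><output/></operation>
</portType>
"""

# The order of the child elements is what selects the operation type.
KINDS = {
    ("input",): "one-way",
    ("input", "output"): "request-response",
    ("output", "input"): "solicit-response",
    ("output",): "notification",
}


def operation_kind(operation):
    """Classify one operation element by its input/output children."""
    wanted = (f"{{{WSDL_NS}}}input", f"{{{WSDL_NS}}}output")
    order = tuple(child.tag.split("}")[1]
                  for child in operation if child.tag in wanted)
    return KINDS.get(order, "unknown")


port_type = ET.fromstring(SAMPLE)
for op in port_type.findall(f"{{{WSDL_NS}}}operation"):
    print(op.get("name"), "->", operation_kind(op))
```

In the terms used above, "single incoming" is one-way, "single outgoing" is notification, and the two request forms are request-response and solicit-response.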

Here is an abstract operation definition -- remember, we don't yet know how bytes appear on the wire -- that uses the earlier message formats:


<operation name="doGetCachedPage">
  <input message="typens:doGetCachedPage"/>
  <output message="typens:doGetCachedPageResponse"/>
</operation>

The input and output elements carry no name attribute here; WSDL defaults those names as described above.

Finally, we're ready to bring these abstract messages and datatypes down to earth. This is done in the binding element, which has its own set of operation definitions:


<binding name="GoogleSearchBinding" type="typens:GoogleSearchPort">
  <soap:binding
      style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>

  <operation name="doGetCachedPage">
    <soap:operation soapAction="urn:GoogleSearchAction"/>
    <input>
      <soap:body use="encoded"
                 namespace="urn:GoogleSearch"
                 encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
    </input>
    <output>
      <soap:body use="encoded"
                 namespace="urn:GoogleSearch"
                 encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
    </output>
  </operation>

  <!-- remaining operations omitted -->
</binding>

This is where the fairly clean WSDL elements fall apart into a set of nasty special-case elements. For that reason alone, I can appreciate why binding is its own element, but I still think the separation comes at the cost of too much redundancy and repetition. For example, notice the duplication of attributes in each soap:body element.
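
To give a feel for what this binding implies on the wire, here is a rough Python sketch that builds an rpc/encoded request for doGetCachedPage. It is an illustration under stated assumptions, not what any particular toolkit emits: the function name build_rpc_request, the placeholder key and URL, and the choice to put xsi:type on each part are mine. Nothing is sent over the network.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"
XSD = "http://www.w3.org/2001/XMLSchema"
XSI = "http://www.w3.org/2001/XMLSchema-instance"


def build_rpc_request(op_name, namespace, params):
    """Build a SOAP 1.1 rpc/encoded request envelope as a string.

    params maps each part name to an (xsd type, value) pair.
    """
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    ET.register_namespace("xsi", XSI)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    # xsd only appears inside attribute values, so declare it by hand.
    envelope.set("xmlns:xsd", XSD)
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # style="rpc": the body holds one wrapper element named after the
    # operation, in the namespace given by the soap:body element.
    call = ET.SubElement(body, f"{{{namespace}}}{op_name}")
    # use="encoded": announce the Section 5 encoding rules.
    call.set(f"{{{SOAP_ENV}}}encodingStyle", SOAP_ENC)
    for name, (xsd_type, value) in params.items():
        part = ET.SubElement(call, name)
        part.set(f"{{{XSI}}}type", f"xsd:{xsd_type}")
        part.text = value
    return ET.tostring(envelope, encoding="unicode")


request = build_rpc_request(
    "doGetCachedPage",
    "urn:GoogleSearch",
    {
        "key": ("string", "hypothetical-license-key"),  # placeholder value
        "url": ("string", "http://www.example.com/"),   # placeholder value
    },
)
print(request)
```

Each attribute in the binding maps to one decision in the sketch: style picks the wrapper element, namespace qualifies it, and use plus encodingStyle decide how the parts are written.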

The soap:body element is an example of excessive flexibility. First, we've already declared our intent to use SOAP RPC encoding, through the style attribute in the previous element. While SOAP defines document and RPC styles, WSDL doubles this to define ``document/encoded'', ``document/literal'', ``rpc/encoded'', and ``rpc/literal''.

Finally, the service element ties the abstract messages and their concrete realization together with an endpoint (in this case, a SOAP URL).
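
The service element isn't reproduced in this article, so the fragment in the following Python sketch is a stand-in: the service name, port name, and example.org address are hypothetical. It shows how a client might pull the endpoint URL and the binding reference out of a service definition.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
WSDL_SOAP_NS = "http://schemas.xmlsoap.org/wsdl/soap/"

# Hypothetical service definition; names and address are placeholders.
SERVICE_FRAGMENT = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
             xmlns:typens="urn:GoogleSearch">
  <service name="ExampleSearchService">
    <port name="ExampleSearchPort" binding="typens:GoogleSearchBinding">
      <soap:address location="http://soap.example.org/search"/>
    </port>
  </service>
</definitions>
"""


def endpoints(wsdl_xml):
    """Return (binding QName, endpoint URL) for every SOAP port found."""
    root = ET.fromstring(wsdl_xml)
    result = []
    for service in root.findall(f"{{{WSDL_NS}}}service"):
        for port in service.findall(f"{{{WSDL_NS}}}port"):
            address = port.find(f"{{{WSDL_SOAP_NS}}}address")
            if address is not None:
                result.append((port.get("binding"), address.get("location")))
    return result


print(endpoints(SERVICE_FRAGMENT))
```

The binding reference is what leads back to the SOAP details, and the address says where to post the request.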

So we now know what to send, and where to send it. Next month we'll write some code to do just that.