Examining WSDL
May 15, 2002
Unlike today's Web, web services can be viewed as a set of programs interacting across a network with no explicit human interaction involved during the transaction. In order for programs to exchange data, it's necessary to define strictly the communications protocol, the data transfer syntax, and the location of the endpoint. For building large, complex systems, such service definitions must be done in a rigorous manner: ideally, a machine-readable language with well-defined semantics, as opposed to parochial and imprecise natural languages.
It is possible to define service definitions in English; XML-RPC and the various weblogging interfaces are a notable example. But XML-RPC is a very simple system, by design, with a relatively small set of features; it's ill-suited to the task of building large-scale or enterprise applications. For example, you can't use XML-RPC to send an arbitrary XML document from one system to another without converting it to a base64-encoded string.
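To make the base64 point concrete, here is a minimal sketch using Python's standard `xmlrpc.client` module. The payload and the `store_document` method name are made up for illustration; the only thing to notice is that the XML document travels as an opaque `<base64>` value rather than as XML.

```python
import xmlrpc.client

# A small XML document we would like to ship as-is (hypothetical payload).
document = b"<order id='42'><item>widget</item></order>"

# Wrapping the bytes in Binary makes the marshaller emit a <base64> value.
# "store_document" is a made-up method name used only for illustration.
request = xmlrpc.client.dumps(
    (xmlrpc.client.Binary(document),), methodname="store_document"
)
print(request)

# The receiver has to undo the encoding to get the original XML back.
params, method = xmlrpc.client.loads(request)
recovered = params[0].data
```

Both ends have to agree, outside the protocol, that the base64 blob is really an XML document.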
Almost all distributed systems have a language for describing interfaces. They were often C- or Pascal-like, and often named similarly: "IDL" in DCE and CORBA, "MIDL" in Microsoft's COM and DCOM. The idea is that after rigorously defining the interface language, tools could be used to parse the IDL and generate code stubs, thus automating some of the grungier parts of distributed programming.
The web services distributed programming model has an IDL, too; and as you can probably guess, it's the Web Services Description Language, WSDL. It's pronounced by spelling out the letters or saying ``whizz-dell,'' which nearly rhymes with ``diesel.''
WSDL derives from two earlier efforts by a number of companies; the current de facto standard is a W3C Note submitted by IBM and Microsoft. There's a web services description working group, which is creating the next version of the note for eventual delivery as a W3C standard. So far the group has published a requirements document and some usage scenarios. One reason to like the requirements document is that it renames some of WSDL's more confusing terms.
I find WSDL to be a frustrating mixture of verbosity -- most messages are essentially described three times -- and curious supposedly helpful defaults, such as omitting the name of a message in an operation. I'll use the now much-discussed Google WSDL to point some of these out.
But, first, let's look at the state of web services programming and IDLs. In the classic IDL world, the definitions were processed by an IDL compiler to generate stubs for clients, which look like local function calls, and dispatch routines for the server that invoke the developer's code. When new applications were developed, the interfaces were designed from scratch, and all the benefits of ``contract'' programming were possible, including clean and regular definition of function semantics.
But back in the real world, these distributed systems usually had to interact with existing systems. Often the project involved ``remoting'' an existing application by putting an RPC interface on an existing service. In cases like this, the IDL files could resemble compiler torture-tests, as network-oriented interface languages were coerced into supporting legacy applications.
I mention this because it's about the stage at which WSDL and web services are today. The most widespread tools take an existing Java class or COM object and then generate a WSDL definition. This is backwards and half-assed. It's backwards because -- as we should have learned with earlier infrastructures -- the right thing to do is write the interface first. It's half-assed because while everybody's generating interfaces, nobody is capable of automatically consuming them.
So how did we get here? I can think of two ways. First, the vendors recognize that folks aren't going to throw out their existing code. Just like they put an HTTP front-end on legacy applications when the Web became popular, developers are now going to want to put a web services front-end on their existing code. The reason why we don't yet have good client development -- i.e., WSDL parsing -- is that it requires being able to turn arbitrary XML Schema definitions into useful stubs, which is hard for a couple of reasons.
First, it's not clear how to fit web services programming into existing client frameworks. If you're using SOAP RPC, then all of the classic problems of IDL-based computing come back: memory management, transient network errors, etc. If using SOAP to send XML documents, then new issues such as DOM and SAX support must be dealt with.
Second, WSDL prefers to use XML Schema to define the data to be transferred, and understanding XML Schema requires a significant amount of effort. As the experiences of the ``soapbuilders'' group (a mailing list of SOAP toolkit providers, working on achieving interoperability across their implementations) have shown, it can require a great deal of work just to be able to properly handle XML Schema's primitive types.
I think there's a third reason, but one that nobody will admit in public. No vendor wants to spend the enormous effort involved in developing client-side WSDL toolkits when Microsoft can practically wipe them off the desktop by providing one of their own. Yes, I realize that this ignores peer-to-peer and servers talking to subservers, but I still stand by the statement.
GoogleSearch.wsdl
It's time to examine parts of GoogleSearch.wsdl, which is part of the Google developer's kit. A WSDL description is a set of:
- Type definitions, contained in the types element, used to describe the data being exchanged, and can be any description language, although -- and I swear I'm not making this up -- "WSDL prefers" XML Schema.
- Message definitions, appearing as multiple message elements. As we'll see, message definitions are where we get the first hints that WSDL exceeds the 80/20 rule of flexibility and complexity.
- Operation definitions, appearing within a binding element, which confusingly defines something called a ``port.''
- A service definition, contained in the service element. This defines the endpoint (URL) where the server can be found, and -- by referring to the binding, err, port, -- specifies how to communicate with it.
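As a rough illustration of that layout, the sketch below counts the top-level sections of a WSDL file with Python's standard `xml.etree.ElementTree`. The skeleton document is a made-up stand-in, not the real GoogleSearch.wsdl; it also includes a `portType` element, which is where WSDL 1.1 keeps the abstract operations that a binding refers to.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"

# A stripped-down, hypothetical skeleton; every name here is illustrative.
SKELETON = """\
<definitions name="Example" targetNamespace="urn:Example"
             xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:tns="urn:Example">
  <types/>
  <message name="doThing"/>
  <message name="doThingResponse"/>
  <portType name="ExamplePort"/>
  <binding name="ExampleBinding" type="tns:ExamplePort"/>
  <service name="ExampleService"/>
</definitions>
"""

def section_counts(wsdl_text):
    """Count the top-level WSDL elements by local name, in document order."""
    root = ET.fromstring(wsdl_text)
    counts = {}
    for child in root:
        # ElementTree spells qualified names as "{namespace}local".
        namespace, _, local = child.tag[1:].partition("}")
        if namespace == WSDL_NS:
            counts[local] = counts.get(local, 0) + 1
    return counts

counts = section_counts(SKELETON)
print(counts)
```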
The Google WSDL file defines a SOAP RPC interface, which means it follows the encoding rules found in Section 5 of the SOAP 1.1 specification. I'll avoid the SOAP vs. REST discussions now, other than to mention that RPC is a familiar programming model to many developers. Conceptually, the GoogleSearchResult element resembles the following fragment of a C/C++ object:
bool documentFiltering;
char* searchComments;
int estimatedTotalResultsCount;
bool estimateIsExact;
ResultElementArray resultElements[];
int _numresultElements;
Note that my hypothetical Schema to C mapping required the addition of a new element to keep track of the size of the array.
More interesting is the way the interacting specs require Google to define the ResultElementArray datatype. According to the SOAP RPC encoding rules, arrays are written by generating each element inside a container. XML Schema requires the container to be declared as its own type. SOAP 1.1 requires arrays to have a defaultable attribute that declares the type and size; SOAP 1.2 rightly divides this into two separate attributes. I say ``rightly'' because XML Schema doesn't have a way to let you default an attribute value in the SOAP 1.1 style. Because of this, WSDL provides its own arrayType attribute that does provide a default.
Taking all of this together, the fairly straightforward ResultElementArray array requires the following contortions:
<xsd:complexType name="ResultElementArray">
  <xsd:complexContent>
    <xsd:restriction base="soapenc:Array">
      <xsd:attribute ref="soapenc:arrayType" wsdl:arrayType="typens:ResultElement[]"/>
    </xsd:restriction>
  </xsd:complexContent>
</xsd:complexType>
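For a sense of what that type declaration buys on the wire, here is a small sketch that builds a SOAP 1.1 Section 5 style array container with Python's `xml.etree.ElementTree`. The `item` and `title` element names and the sample values are my own simplifications, and `urn:GoogleSearch` is assumed to be the namespace behind the `typens` prefix; a real ResultElement carries more fields.

```python
import xml.etree.ElementTree as ET

SOAPENC = "http://schemas.xmlsoap.org/soap/encoding/"
ET.register_namespace("soapenc", SOAPENC)

def make_item(title):
    """Build one simplified array member (hypothetical single-field struct)."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    return item

def encode_array(name, item_type, items):
    """Wrap already-built item elements in a SOAP-encoded array container."""
    container = ET.Element(name)
    # Declared by hand: ElementTree cannot see prefixes used inside
    # attribute values, so it would not declare "typens" on its own.
    container.set("xmlns:typens", "urn:GoogleSearch")
    # SOAP 1.1 packs the member type and the size into a single attribute.
    container.set("{%s}arrayType" % SOAPENC, "%s[%d]" % (item_type, len(items)))
    container.extend(items)
    return container

array = encode_array(
    "resultElements",
    "typens:ResultElement",
    [make_item("first hit"), make_item("second hit")],
)
wire = ET.tostring(array, encoding="unicode")
print(wire)
```

The single `arrayType` attribute carrying both type and size is the part that SOAP 1.2 splits in two.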
Given all of this complexity, we shouldn't be surprised that Google apparently missed the text that said the element should have been named ArrayOfResultElement.
It's also hard not to look at that fragment and despair. All that complication, just to say "we're sending an array." Unfortunately, since WSDL is caught between two other specs, there seems little else that could be done. The WSDL authors couldn't change SOAP, since they were defining a use for it, and one can only imagine the howls if they tried to modify XML Schema.
Let's now look at some message definitions. The following two definitions define a request message and its response. Because they are labeled as ``opname'' and ``opnameResponse,'' WSDL will let us default those names later on.
<message name="doGetCachedPage">
  <part name="key" type="xsd:string"/>
  <part name="url" type="xsd:string"/>
</message>
<message name="doGetCachedPageResponse">
  <part name="return" type="xsd:base64Binary"/>
</message>
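One way to read those two definitions is as a function signature: two string parameters in, one base64 value out. The sketch below pulls that out with Python's `xml.etree.ElementTree`; the enclosing `definitions` element and its namespace declarations are added here only so the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

WSDL = "{http://schemas.xmlsoap.org/wsdl/}"

# The two message definitions, wrapped so they form a well-formed document.
MESSAGES = """\
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <message name="doGetCachedPage">
    <part name="key" type="xsd:string"/>
    <part name="url" type="xsd:string"/>
  </message>
  <message name="doGetCachedPageResponse">
    <part name="return" type="xsd:base64Binary"/>
  </message>
</definitions>
"""

def message_parts(wsdl_text):
    """Map each message name to its ordered list of (part name, type) pairs."""
    root = ET.fromstring(wsdl_text)
    return {
        message.get("name"): [
            (part.get("name"), part.get("type"))
            for part in message.findall(WSDL + "part")
        ]
        for message in root.findall(WSDL + "message")
    }

parts = message_parts(MESSAGES)
for name, fields in parts.items():
    print(name, fields)
```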
In the list above, I said we have our first hint about WSDL's excessive flexibility. First, a message is intended to be an abstract definition -- one that specifies nothing about the bytes on the wire. As the spec concedes, however, "in some cases, the abstract definition may match the concrete representation very closely or exactly." When sending XML the representation is exact, and it should be possible to omit message elements altogether. (A Google search for "optimize the common case" finds over 300,000 hits.)
The message element also shows too much flexibility. The individual message parts can be specified in-line, they can reference a type from the types section, they can have a mix of name and type declarations, and so on. Can anyone look at the doGoogleSearch message and the GoogleSearchResult datatype, and give a good, practical rationale for the style differences?
And why don't all message elements appear in their own container?
A WSDL operation is defined as a set of message exchanges. WSDL supports two-party communication, and four operation types are defined (single incoming, single outgoing, incoming request with response, outgoing request with reply), although only the obvious two are currently supported: client sends message, client sends message and server responds.
Here is an abstract operation definition -- remember, we don't yet know how bytes appear on the wire -- that uses the earlier message formats:
<operation name="doGetCachedPage">
  <input message="typens:doGetCachedPage"/>
  <output message="typens:doGetCachedPageResponse"/>
</operation>
The messages have a name; WSDL defaults them as described above.
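The four operation types fall out of which of `input` and `output` are present, and in what order. Here is a minimal sketch of that classification, using the names WSDL 1.1 gives the four patterns. The enclosing `portType` named GoogleSearchPort is inferred from the binding's `type` attribute rather than quoted from the file, and the wrapper `definitions` element is added so the fragment parses.

```python
import xml.etree.ElementTree as ET

WSDL = "{http://schemas.xmlsoap.org/wsdl/}"

PORT_TYPE = """\
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:typens="urn:GoogleSearch">
  <portType name="GoogleSearchPort">
    <operation name="doGetCachedPage">
      <input message="typens:doGetCachedPage"/>
      <output message="typens:doGetCachedPageResponse"/>
    </operation>
  </portType>
</definitions>
"""

PATTERNS = {
    ("input",): "one-way",
    ("input", "output"): "request-response",
    ("output", "input"): "solicit-response",
    ("output",): "notification",
}

def describe_operations(wsdl_text):
    """Return (operation name, exchange pattern, referenced messages) tuples."""
    root = ET.fromstring(wsdl_text)
    described = []
    for operation in root.findall(WSDL + "portType/" + WSDL + "operation"):
        order = []
        messages = {}
        for child in operation:
            kind = child.tag.replace(WSDL, "")
            if kind in ("input", "output"):
                order.append(kind)
                # Drop the prefix of the QName; only the local name is kept.
                messages[kind] = child.get("message").split(":")[-1]
        pattern = PATTERNS.get(tuple(order), "unknown")
        described.append((operation.get("name"), pattern, messages))
    return described

operations = describe_operations(PORT_TYPE)
print(operations)
```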
Finally, we're ready to bring these abstract messages and datatypes down to earth. This is done in the binding element, which has its own set of operation definitions:
<binding name="GoogleSearchBinding" type="typens:GoogleSearchPort">
  <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
  <operation name="doGetCachedPage">
    <soap:operation soapAction="urn:GoogleSearchAction"/>
    <input>
      <soap:body use="encoded" namespace="urn:GoogleSearch" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
    </input>
    <output>
      <soap:body use="encoded" namespace="urn:GoogleSearch" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
    </output>
  </operation>
This is where the fairly clean WSDL elements fall apart into a set of nasty special-case elements. For that reason alone, I can appreciate why binding is its own element, but I still think the separation comes at the cost of too much redundancy and repetition. For example, notice the duplication of attributes in each soap:body element.
The soap:body element is an example of excessive flexibility. First, we've already declared our intent to use SOAP RPC encoding, through the style attribute in the previous element. While SOAP defines document and RPC styles, WSDL doubles this to define ``document/encoded'', ``document/literal'', ``rpc/encoded'', and ``rpc/literal''.
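To see which of the four combinations a binding actually picks, you have to read two attributes from two different elements. A minimal sketch, assuming the Google binding shown earlier with a closing tag and a wrapper `definitions` element added so it parses; it ignores the per-operation `style` override that `soap:operation` may carry.

```python
import xml.etree.ElementTree as ET

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

BINDING = """\
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
             xmlns:typens="urn:GoogleSearch">
  <binding name="GoogleSearchBinding" type="typens:GoogleSearchPort">
    <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
    <operation name="doGetCachedPage">
      <soap:operation soapAction="urn:GoogleSearchAction"/>
      <input>
        <soap:body use="encoded" namespace="urn:GoogleSearch"
                   encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
      </input>
      <output>
        <soap:body use="encoded" namespace="urn:GoogleSearch"
                   encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
      </output>
    </operation>
  </binding>
</definitions>
"""

def style_use(wsdl_text):
    """Map (operation, direction) to its "style/use" combination."""
    root = ET.fromstring(wsdl_text)
    combos = {}
    for binding in root.findall("wsdl:binding", NS):
        # WSDL 1.1 treats a missing style attribute as "document".
        style = binding.find("soap:binding", NS).get("style", "document")
        for operation in binding.findall("wsdl:operation", NS):
            for direction in ("input", "output"):
                body = operation.find("wsdl:%s/soap:body" % direction, NS)
                if body is not None:
                    key = (operation.get("name"), direction)
                    combos[key] = "%s/%s" % (style, body.get("use"))
    return combos

combos = style_use(BINDING)
print(combos)
```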
Finally, the service element ties the abstract messages and their concrete realization together with an endpoint (in this case, a SOAP URL).
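Finding the endpoint is then a matter of walking from the service to its port and reading the `soap:address`. The sketch below shows the idea; the service name, the port name, and the `http://example.com/search` location are placeholders of mine, not values taken from the Google file.

```python
import xml.etree.ElementTree as ET

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

# Hypothetical service section; names and URL are illustrative only.
SERVICE = """\
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
             xmlns:typens="urn:GoogleSearch">
  <service name="ExampleSearchService">
    <port name="ExampleSearchPort" binding="typens:GoogleSearchBinding">
      <soap:address location="http://example.com/search"/>
    </port>
  </service>
</definitions>
"""

def endpoints(wsdl_text):
    """List (service name, binding local name, endpoint URL) for each port."""
    root = ET.fromstring(wsdl_text)
    found = []
    for service in root.findall("wsdl:service", NS):
        for port in service.findall("wsdl:port", NS):
            address = port.find("soap:address", NS)
            if address is None:
                continue  # not a SOAP port; nothing to report in this sketch
            binding = port.get("binding").split(":")[-1]
            found.append((service.get("name"), binding, address.get("location")))
    return found

found = endpoints(SERVICE)
print(found)
```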
So we now know what to send, and where to send it. Next month we'll write some code to do just that.