Functional Programming and XML
February 14, 2001
(A French translation of this article is available).
As is all too common in the programming world, much of the XML community has identified itself and all its works with object oriented programming (OOP). While I'm a fan of OOP, it's clear to me that even the best OOP-for-XML techniques aren't a panacea, and, moreover, there is an awful lot of ad hoc "objectification" which tends merely to make our lives more difficult and our programs less clear.
This short-sightedness has two negative consequences: it tends to limit the techniques and tools we use, and it tends to restrict our understanding. For example, although the Document Object Model (DOM) satisfies few and inspires fewer, its status as a standard tends to inhibit (though, fortunately, not to suppress) exploration into alternative models and practices. The debate tends to revolve around "fixing" DOM, which is cast in terms of getting a better object model. While a better object model would be nice, it's important to keep in mind that XML is neither historically nor inherently object-oriented. Thinking otherwise may lead you to perceive or anticipate a better fit than you actually get.
One cure for intellectual myopia is to go exploring. In this article, I provide a beginner's travel guide to the interesting and instructive land of functional programming (FP) and XML.
Why Functional Programming?
If you are heavily into XML, and use a wide range of XML-related technologies, it's highly likely that you are already doing some FP. XSLT is more or less the transformation language of DSSSL, in an XML syntax, which is a proper subset of DSSSL which, itself, is a purely functional subset of the Scheme programming language (plus a large library). Thus not only are there are historical connections between FP and XML, but critical present day XML tools involve FP.
This connection is a bit more than accidental. XML is generally declarative, as are functional programming languages. XML is a metalanguage -- a language for defining languages. ML, one of the big time functional languages, is so named as an abbreviation for "Meta-Language". In general, FP languages excel at language definition and implementation. It's natural to think of XML documents as trees, and functional languages tend to have lots of nifty facilities for representing and manipulating trees. Finally, to wrap up the vagaries, the goal of working at the very high level of structural mark-up, where we specify documents in terms of their logical features rather than particular rendering procedures, is similar in spirit to the ideals of FP, where we strive to specify computations in mathematical terms rather than machine- or recipe-oriented terms.
One need not take up with a fancy-pants FP language in order to benefit from FP tropes,
techniques, and thinking. For example, the popular scripting languages Python (see
the
functional.py
modules of the Xoltar Toolkit) and Perl (see
this wonderful article among the many
on this and other topics by M-J. Dominus) both have
several FP-inspired features and can be used to experiment with FP techniques. However,
it's
worth investigating languages and systems designed from the ground up to support functional
programming, if only since those most expert in FP tend to use them for their own
XML
work.
(Note: I recommend http://lambda.weblogs.com/ for folks interested in a reasonably soft-core FP resource.)
General Orientation
As the name implies, functional programming is "function-oriented" programming (though C doesn't really count). FP languages allow you to manipulate functions in arbitrary ways: building them, combining, them, passing them around, storing them in data structures, and so on. And it's key that FP functions are (at least, ideally) side-effect free. FP's declarative nature makes it much easier to deal with things like concurrency, data structure transformations, and program verification, validation, and analysis. Functions are to FP what objects are to OOP.
While there are many ways to classify functional languages according to technical features and formal properties, such divisions tend to be more useful to the FP experienced, degenerating into mere buzzwords for the novice. I have a rough and ready way of dividing FP languages which seems to capture how they feel to the casual FP dilettante:
- Lispy languages, especially Scheme (and, somewhat atypically, Rebol and Dylan).
- Type-obsessed languages, such as ML, OCmal, Clean, and Haskell.
- Prolog-derived languages, such as Erlang, Mercury, and Oz.
In spite of my hope that the XML.com readership contains no members of the FP community -- if just for the sake of my ambitions for "gatekeeper" status -- I recognize this to be an unrealistic fantasy. So, let me forestall vicious discussion-group comments by pointing out that Mercury is very type obsessed, that many of these languages are multi-paradigm, and that Common Lisp has plenty of nice functional features. As for the rest of you, I'm a tour guide, not a guru.
While I could generate specific criteria for these categories (e.g., Lispy languages tend to be based on the untyped lambda calculus, whereas type-obsessed languages on typed variants), what I really use to make this division is my impression of the "look and feel" of the languages. So while Rebol doesn't use classic s-expression syntax (and thus doesn't tend to look like a Lisp and, indeed, tends to block or make difficult a bunch of classic Lispy moves) I find myself, when using it, thinking much the way I would were I using a Scheme system. Thus, if you master one language in a category, the rest are unlikely to throw you off your stride.
The other Big Thing about FP is that it tends to be the subject of quite a bit of academic research, to the point that FP, in general, tends to be dismissed as ivory tower nose blowing. In fact, there are several commercially supported implementations of various FP languages, many of the free compilers and environments are very good quality, and FP is used in many high powered production environments. At the other end of things, there are a fair number of friendly FP learning systems, ideal for a bit of dorking around without a large investment of time, energy, or installation havoc.
But the academic nature of FP is also an asset. While preparing this article, I read a slew of papers -- some of which described systems (e.g., XMLambda) not yet fully realized or released. However, the ideas and their presentation stretched my perspective on XML processing in refreshing ways. As I stated earlier, it's not just the lack of tools, but the constraints on our understanding, that we need to cure.
The rest of this article highlights some of the interesting features of three FP systems for processing XML: XMLambda (an XML-specific FP language), HaXML (XML facilities for Haskell), and the new XML support in Erlang.
XMLambda
XMLambda is one of a number of special purpose languages, (or, rather, since there isn't a publicly available implementation, language design) for dealing with XML to come out of the FP community (other examples include XDuce and various query languages). In particular, XMLambda is a language optimized for composing "dynamic" XML documents, i.e., documents where some portion is computed rather than entered. The neat thing about XMLambda is that it's basically the result of asking: "What would a type-obsessed functional language, where an XML document simply was a program, look like?" This means that the minimally XML savvy person can already write XMLambda programs, such as
<html> <head> <title>Hello world in XMLambda</title> </head> <body> <h1>Hello world</h2> </body> </html>
It's important to recognize that that this is a fairly complex program, not just a string literal, in XMLambda. Each element is a program construct with an associated type. Furthermore, validation of the XML is accomplished by type-checking. Validating the XML is just a matter of compiling the program.
Compare this, for a moment, with various embedded scripting language template systems, such as JSP or ASP. In most of these systems, the surrounding XML is just text, element tags and content alike -- semantically speaking, you could replace it with a series of write statements (or one big one or some concatenations). More importantly, the write statements wouldn't have to respect, for example, tag boundaries. It'd be perfectly legitimate to embed something like
out.writeln("<title" + ">Hello world<" + "</title>" );
In the degenerate case of a pure XML document, the XMLambda compiler just is a validating XML parser, and we have plenty of those. However, since XMLambda's validation is actually part of a more general type checking system, even documents ("templates") with embedded computations can be "validated":
myTitle :: string myTitle = "Hello world in XMLambda" <html> <head> <title><%= myTitle %></title> </head> <body> <h1>Hello world</h2> </body> </html> |
Variable myTitle is of type string and thus can go inside a title
element. |
myTitle :: title myTitle = <title>Hello world in XMLambda</title> <html> <head> <%= myTitle %> </head> <body> <h1>Hello world</h2> </body> </html> |
Now, myTitle is of type title element, and so can appear inside the
head element... |
myTitle :: title myTitle = <title>Hello world in XMLambda</title> <html> <head> <%= myTitle %> </head> <body> <h1><%= myTitle %></h2> </body> </html> |
... but not inside an h1 element. This would be caught by the
compiler. |
If I'd declared myTitle
to be a string in the third program, there still would
have been a compile time type error, since what gets assigned to myTitle
is an
element. To subvert the type checks, I would have to make the right hand side of
the assignment a string literal, as opposed to an element literal. That case would
be pretty
much equivalent to a standard ASP or JSP template (although JSP tag libraries make
this a
bit more complicated). In standard ASP/JSP templates you tend to be working at the
fairly
low level of string munging, which not only forgoes automatic validation, but also
forces
the programmer to keep three separate models in mind at once: the XML world of elements,
the
programming language's datatypes and constructs, and the string hacking interface
between
the two.
The standard solution is to use element node objects, or element generating procedures, to provide a unified, more abstract, view of the dynamic document. This can work, and you can even build in some sorts of validation (e.g., have nodes check the classes of their children, etc.). However, it's beastly awkward, mainly because (as the XMLambda authors point out) it forces you to too high a level of abstraction, not to mention a much more cumbersome notation. XMLambda lets you stay XML-centered while providing a consistent and fairly natural way to build complex, generated documents.
HaXML
Like XMLambda, the HaXML utilities treat a DTD as a series of type declarations in a functional language, thus conflating validation with type checking. In the case of HaXML, the language is Haskell, a general purpose FP language with many interesting features and implementations. While lacking the XML-centeredness of XMLambda, HaXML's mapping libraries allow one to smoothly work with XML data and documents from within a complete programming language with an economical and perspicuous notation.
One interesting claim I've seen in the literature is that the base Haskell type system -- while very expressive, especially in comparison with C, C++, or Java type systems -- may prove either insufficient or just unwieldy for XML documents. And it's this insufficiency that's one reason for the design of XML-specific languages with special type systems (such as XMLambda and XDuce). However, the Haskell type system is well-understood, well-documented, and well-implemented, all of which make it ideal for exploring more comprehensive type schemes for XML.
The HaXML utilities introduce another, distinctively FP, mechanism for manipulating XML: combinators. Combinators, or higher-order functions, are used to build complex functions out of simpler ones. They are "high-order" because they take functions as arguments and return a function as a result. Conceptually, they are much like Unix pipes in that they let you build up more complex computational sequences by flexibly arranging highly specialized tools. Indeed, the HaXML package defines the "Irish composition" combinator (represented by "o") which applies one function to the results of another (just the way a pipe "applies" one program to the results of another). However, there are many other combinators which do such useful things as append the results of one function to the other in various ways or use one function to select results from another. The nifty thing about combinators is that they're just functions defined and definable in the host language. Thus, as you find yourself using a certain pattern, it's easy to abstract it into a new combinator. Indeed, many of the predefined combinators in HaXML are such abstractions.
The processing model should be clear. First, HaXML defines a number of basic filtering
functions, including elm
which returns the input if it is an (arbitrary) XML
element, children
which, if the input is an element with children, returns
those children, and txt
which passes only plain text. Then, it adds a few
filter constructors for building specialized filters, e.g., tag
for building
filters for specific tags or mkElem
for transforming a string into an element.
Finally, HaXML defines a combinator library for stringing these basic filters -- and
any
generated ones, of course -- together in various ways. The result? A domain specific
language for parsing, filtering, transforming, and generating XML documents. Given
the
side-effect free nature of these functions, you can treat them as very strongly encapsulated
components that can be mixed and matched freely. In fact, the HaXML paper describes
various
algebraic laws for their combinators which you could use to transform complex expressions
manually into something more readable or could form the basis of a filter optimizer.
There's a very cumulative flavor to this approach: Each bit can be achieved via some more verbose technique that trades notation economy for a reduced number of constructs. But there's no need since the extra constructs are strictly defined in terms of the core set. The algebra let's you use derivations to figure out equivalences, rather than trying to keep a mental model of the processing flow.
Some folks might find the non-XML syntax of these combinators a welcome alternative to XSLT. Personally, I find the trend toward XMLizing everything -- especially of end-programmer languages -- to be rather a step backwards. After all, the designers of XML went out of their way to eliminate all the author support features of SGML.
These examples were taken from the first "Transforming XML" column. I don't promise that the semantics are exactly alike, for several reasons, including the need to pretty print the output of the combinators, but they should help give some of the flavor of the combinators.
Copying Elements to the Output
XSL | HaXML |
<xsl:template match="title"> <xsl:copy> <xsl:apply-templates/> </xsl:copy> </xsl:template> |
multi(tag "title") --"multi" is one of the recursive combinators --it collects a list of the results of applying --its argument to each node in the tree |
Deleting Elements
XSL | HaXML |
<xsl:template match="nickname"> </xsl:template> |
foldXml(keep `without` tag "nickname") --"foldXml" is another recursive combinator --it applies its argument to each node starting --from the bottom up |
Changing Element Names
XSL | HaXML |
<xsl:template match="article"> <html> <xsl:apply-templates/> </html> </xsl:template> |
foldXml(mkElem "html" [children] `when` tag "article") --This is the recursive version. It's harmless if there's only --a top level "article". It's probably better to just do: mkElem "html" [children] `when` tag "article" --Of course, there's the nicer: replaceTag "html" `o` tag "article" |
Aside from writing transformations directly with these combinators, they also make a good compilation target for other query, style, and transformation languages. (For example, see Xtract, a XML query language similar to XPath.)
ErXML
Erlang is generally considered to be the industrial poster child of FP. Born at Ericsson, designed to handle huge, fault-tolerant, highly concurrent applications, evolved under pressure of mission-critical, production telephony apps, Erlang is a worker's FP language (though, there is a good bit of interesting academic research done on and with it). Erlang is a very friendly language to get going in. Indeed, Erlang has a lot of the feel of Smalltalk (okay, "of Python") -- simple core language, dynamic typing, and extensive libraries.
Erlang has many of the syntactic and semantic characteristic features of FP: higher order and referentially transparent functions, list comprehensions, single assignment variables, pattern matching, and so on. It has a very clean and easy to use module system. Its outstanding feature, of course, is seamless support for concurrency. In fact, many Erlang advocates wax enthusiastic about "process oriented" design.
At the Sixth International Erlang/OTP User Conference, Ulf Wiger introduced what looks to become Erlang's canonical XML processing tool, XMerL (there is an earlier toy parser and a binding for the Edinburgh LT XML toolkit as well). Though still in early beta, XMerL is nevertheless immediately useful and has already been used to implement an XML datastore on top of Mnesia, Erlang's distributed, fault-tolerant database system.
Here are three reasons to start playing with XMerL:
- Erlang makes a good starting ground for playing with XML and FP. Those with Scheme or Lisp experience should find it comfortable enough, and it's "non-parenthesized" enough not to turn off others. Probably the trickiest bit is the necessity of using tail-recursion for loops, but, in Erlang, I haven't seen it to be as contorting in practical use as the standard Scheme examples.
- The XMerL set itself has several interesting features, such as a highly customizable parser (with hooks at various stages of the entire parsing process and within parsing events).
- The Erlang distribution comes with a number of applications suited for building large scale web sites, most notably Mnesia and Inets (an Apache-compatible HTTP server). Other Erlang-based projects of interest include Eddie (distributed web farm software), the various Bluetail products, and IDX-xmensia, an XML datastore. What starts as a prototype or toy today can easily turn into a robust production application in Erlang. Erlang is good that way.
XMerL is not as up to snuff with regard to standards compliance as the latest and greatest mainstream XML tools. But it, as part of a general Erlang solution, can fill many practical niches.
Uncharted Lands
There's a lot more going on in the FP world, and even in these particular examples, than I've touched on. Ideas from the functional programming world have always percolated into mainstream practice, but we seem to be reaching a point where many FP techniques and tools are poised for wholesale -- or at least retail -- acceptance. For example, James Clark's recently proposed Trex validating language for XML is (in his own words):
basically the type system of XDuce with an XML syntax and with a bunch of additional features (like support for attributes and namespaces) needed to make it a practical language for structure validation
While it's clear that as long as James Clark is around most of us will have the chance to benefit in some ways from FP innovations, there is still much to be gained from direct familiarity with FP techniques. Even if the tools get imported, there remains the issue of understanding them -- getting a good grip on these imported tools is often furthered by a more general familiarity with the practices and theories which produced them.
One serious lack in the FP world is large quantities of good and gentle documentation, especially of the sort designed to get programmers steeped in other paradigms (e.g, imperative, OOP) up to speed. Much of the jargon of FP is unfamiliar, being derived from advanced logical and mathematical terminology (indeed, there is a current thread on comp.lang.functional with the subject, "A Plea for Plain Functional English and Syntax"). While the situation is continually improving, it's important to recognize that deep and full mastery of FP is not a trivial endeavor even in the most optimal circumstances.
Fortunately, one doesn't need a deep and full mastery of FP to benefit from studying it. I've found that, at least for dealing with XML, a little bit of dabbling goes a long way.