Advanced XML Applications in Zope
February 23, 2000
Zope is an open source application server that allows you to develop web applications quickly. With it you can develop network services that interoperate via XML.
In this article we'll look at how to build a web application that reads and writes XML. This article further develops themes begun in "Creating XML Applications with Zope" and "Internet Scripting: Zope and XML-RPC."
Design Patterns for XML Applications
Zope publishes objects on the Web. We've seen how you can import XML into Zope, give it behavior, and publish it on the Web.
This model of using XML as Zope objects is very appealing because of its simplicity. You can explore the elements of your XML and call methods on them directly. However, in more complex XML applications, this sort of scheme may not work well. In real-world XML applications you may find the following:
- You need to adapt heterogeneous data into a coherent object model. For example, you may need to work with XML from different DTDs, or you may need to work with both XML and non-XML data.
- Your object model may not map well onto the DTD. For example, your DTD may describe documents as containing authors while your object model may see an author as the container of a document.
- You may wish to use different DTDs to represent the same type of objects. If different DTDs reveal different information about your objects, how can you decide which is the authoritative description of your object?
These problems and others lead us away from using XML directly as application objects.
XML as a View
XML is not necessarily valuable in and of itself; it is useful for providing interoperability between applications. Just because XML is on the wire doesn't mean it is an appropriate internal construct for the objects that provide the network services.
XML provides a view of an object, rather than defining an object directly. An XML application doesn't exist to process XML. Instead it provides value by offering network services that are exposed via XML.
In this article we'll work from a general design pattern for Zope/XML applications that presumes that XML is consumed and produced by application objects, but not stored internally. Of course, this pattern is not appropriate for all XML applications. For example, applications that edit XML documents or manage XML archives would have good reason to store XML internally. (For further information on XML and design patterns see XML Design Patterns.)
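To make the "XML as a view" idea concrete, here is a minimal sketch in present-day Python. It is not Zope code, and the Story class and its method names are hypothetical, invented only for illustration. The object keeps ordinary attributes; XML is read at the boundary and produced on demand.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Story:
    """Hypothetical application object: it stores attributes, not XML."""
    title: str
    link: str

    def as_xml(self) -> str:
        # Produce the XML view on demand from the object's current state.
        root = ET.Element("item")
        ET.SubElement(root, "title").text = self.title
        ET.SubElement(root, "link").text = self.link
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, text: str) -> "Story":
        # Consume XML at the boundary, then keep only plain attributes.
        root = ET.fromstring(text)
        return cls(title=root.findtext("title", ""),
                   link=root.findtext("link", ""))


story = Story("PySol 3.20", "http://wildsau.idv.uni-linz.ac.at/mfx/pysol.html")
round_trip = Story.from_xml(story.as_xml())
```

Because the XML is only a view, a second method could emit a different format from the same attributes without touching the stored state.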
Example Application: RSS Channel Manager
Let's build a simple XML application in Zope to demonstrate how it's done.
For our sample application, we'll construct a web content aggregator. It should be able to gather web content from diverse sources, manipulate and search the content, and serve it back up in XML.
For simplicity's sake we'll restrict ourselves to RSS, which is an XML DTD for the exchange of simple descriptions of web content, such as stories and news items.
The RSS DTD
RSS stands for Rich Site Summary. It is an XML DTD that describes pieces of content, commonly stories or news items, on a web portal such as Slashdot. Netscape uses RSS to present users with a personalized home page on my.netscape.com.
The RSS DTD is defined in terms of channels and items. A channel is a container for items and items are short summaries of web content. Here's a short RSS file.
<?xml version="1.0"?>
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
  "http://my.netscape.com/publish/formats/rss-0.91.dtd">
<rss version="0.91">
  <channel>
    <title>Python Dot Org</title>
    <link>http://www.python.org</link>
    <description>The Python language web site. Your source for all things Python!</description>
    <item>
      <title>PySol 3.20</title>
      <link>http://wildsau.idv.uni-linz.ac.at/mfx/pysol.html</link>
      <description>new version of Python Solitaire Games (using Tkinter); now supports 151 (!) distinct solitaire card game variants.</description>
    </item>
    <item>
      <title>Cryptography modules now available worldwide</title>
      <link>ftp://starship.python.net/pub/crew/amk/crypto/</link>
      <description>re-release of (historical) crypto modules; may now be downloaded world-wide due to relaxed US export control policies; however, please use mxCrypto or M2Crypto for new projects instead.</description>
    </item>
    <item>
      <title>Stackless Python 1.0 + Continuations 0.6</title>
      <link>http://www.tismer.com/research/stackless/</link>
      <description>a version of Python 1.5.2 that does not need space on the C stack, and first-class callable continuation objects for Python.</description>
    </item>
  </channel>
</rss>
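To see the channel/item structure from a program's point of view, here is a small sketch that reads a shortened version of the file above with Python's standard xml.etree.ElementTree module. This is only an illustration in present-day Python, not part of the RSS Channel Product; the RSS_TEXT sample is abbreviated and omits the DOCTYPE.

```python
import xml.etree.ElementTree as ET

# Abbreviated sample based on the RSS file shown above.
RSS_TEXT = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Python Dot Org</title>
    <link>http://www.python.org</link>
    <description>The Python language web site.</description>
    <item>
      <title>PySol 3.20</title>
      <link>http://wildsau.idv.uni-linz.ac.at/mfx/pysol.html</link>
    </item>
    <item>
      <title>Stackless Python 1.0 + Continuations 0.6</title>
      <link>http://www.tismer.com/research/stackless/</link>
    </item>
  </channel>
</rss>"""

root = ET.fromstring(RSS_TEXT)

# The channel is the container; its item children are the summaries.
channel = root.find("channel")
channel_title = channel.findtext("title")
item_titles = [item.findtext("title") for item in channel.findall("item")]
```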
Architecture
Our basic strategy will be to build Zope objects from remote RSS files. We can then manipulate and query the Zope objects directly, without having to reparse any XML. The Zope objects will leverage the Zope framework to provide facilities for persistence, "through-the-web" management, security, and searching. Finally, the Zope objects will have templates to allow them to represent themselves as HTML and RSS.
Implementation
We will implement these Zope objects in Python. Building Zope objects in Python is unfortunately still not well documented. For background on this subject see the following:
In general it will not be necessary to understand all the details of extending Zope in Python to understand the basic working of the application.
Channel and Item Classes
Our channel class will mimic, to a large degree, the RSS description of a channel. Channels will contain items. Channels and items will both have attributes that are determined by the RSS data.
Figure 1: Channel and Item Class Diagram. (For help in interpreting this diagram, see our simple UML class diagram guide.)
In our implementation, we'll skip many of the optional attributes of channels and items for the sake of simplicity.
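As a rough picture of that object model, here is a sketch using plain Python dataclasses. The real product's classes are Zope objects with persistence and management screens; the class layout and the add_item name below are assumptions for illustration, and only the title, link, and description attributes come from the RSS data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A short summary of one piece of web content."""
    title: str
    link: str
    description: str = ""


@dataclass
class Channel:
    """A container for items, with its own descriptive attributes."""
    title: str
    link: str
    description: str = ""
    items: List[Item] = field(default_factory=list)

    def add_item(self, title: str, link: str, description: str = "") -> Item:
        # Create an item instance inside this channel and return it.
        item = Item(title, link, description)
        self.items.append(item)
        return item


python_org = Channel("Python Dot Org", "http://www.python.org")
first = python_org.add_item("PySol 3.20",
                            "http://wildsau.idv.uni-linz.ac.at/mfx/pysol.html")
```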
Instead of simply destroying and rebuilding our channel and item objects each time we fetch an RSS description of the channel, we may want to allow our channel to intelligently update itself. This approach lets the channel archive old items, so our channels can hold vast numbers of items rather than the standard 15 items per channel. This extension provides a good example of why simply storing the RSS directly in Zope would be limiting: we would not be able to easily archive channel items.
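One way such an update could work is sketched below. This is a guess at a reasonable policy, not the product's actual logic: the merge_items function is hypothetical, items are reduced to (title, link) tuples, and we assume the link identifies an item.

```python
def merge_items(archived, fetched):
    """Return the freshly fetched items followed by older archived items.

    Items are (title, link) tuples. An archived item is dropped only when
    the new fetch contains an item with the same link (assumed unique).
    """
    fetched_links = {link for _, link in fetched}
    kept = [entry for entry in archived if entry[1] not in fetched_links]
    return list(fetched) + kept


archived = [("Old story", "http://example.com/1"),
            ("Story, first draft", "http://example.com/2")]
fetched = [("Story, revised", "http://example.com/2"),
           ("New story", "http://example.com/3")]
merged = merge_items(archived, fetched)
```

Here the old story survives even though it no longer appears in the remote file, which is exactly what a stored copy of the RSS could not offer.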
Working with Channels
Installing the RSS Channel Product
If you want to follow along as we exercise and explore RSS channels in Zope, you'll need to download and install the RSS Channel Product and XML Document version 1.0a5 or later.
First download the products. Place the "tarballs" (.tgz files) in your Zope directory. Then ungzip and untar the files. On Windows, WinZip should be able to handle this for you.
Now restart Zope, and you should be able to use RSS Channels.
Creating a Channel
To create a channel, choose RSS Channel from the product add list. Then specify SlashDot as the Id and http://www.slashdot.org/slashdot.rdf as the URL. Then click Add.
There should be a short pause during which Zope fetches the RSS and constructs the channel and its items.
If all went well, you should be returned to the Zope management screen. To examine your channel, click on the new channel object named SlashDot. Then click the View tab to see an HTML representation of the channel.
So how did the channel get built? The fundamental step in constructing a channel is parsing the RSS data.
Zope ships with Expat, a well-respected XML parser. We could use Expat directly to build the channel, but in our implementation we have chosen to use a higher-level interface, XML Document. XML Document parses XML and builds a DOM tree of Zope objects. Here's a method of the channel class that uses XML Document to parse RSS:
def update(self, REQUEST=None):
    """
    Fetch the RSS content from a remote URL and update the channel.
    """
    # retrieve RSS file
    f=urllib.urlopen(self.url)

    # parse RSS file to DOM using XML Document
    d=Products.XMLDocument.XMLDocument.Document()
    d.parse(f)

    # get channel attributes
    c=d.getElementsByTagName("channel")[0]
    self.title=c.text_content("title")
    self.link=c.text_content("link")
    self.description=c.text_content("description")

    # get channel items
    items=d.getElementsByTagName("item")
    for item in items:
        # get information about the item
        title=item.text_content("title")
        link=item.text_content("link")
        description=item.text_content("description")
        # add the item
        self.addItem(title, link, description)

    # remember when the channel was last fetched
    self.update_last=DateTime()

    # return a management screen if called from the web
    if REQUEST is not None:
        return self.statusForm(self, REQUEST)
Let's look at how this method works. First, it retrieves the remote RSS file using Python's standard urllib module. With the XML in hand, it next creates a DOM tree using XML Document. Using the DOM getElementsByTagName() method, it locates the channel element in the DOM and queries its subelements to set the channel object's attributes. Next, the method examines all the item elements in turn. It collects some information about each item and then calls another method to create an item instance inside our channel object.
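If you'd like to try the same DOM walk outside Zope, here is a sketch using the standard library's xml.dom.minidom in present-day Python. It is an approximation, not the product's code: minidom has no text_content() method, so the child_text helper below is a hypothetical stand-in, and the parse_channel function and SAMPLE data are made up for the example. The network fetch is left out so the sketch runs offline.

```python
from xml.dom.minidom import parseString

SAMPLE = """<rss version="0.91">
  <channel>
    <title>Python Dot Org</title>
    <link>http://www.python.org</link>
    <description>The Python language web site.</description>
    <item>
      <title>PySol 3.20</title>
      <link>http://wildsau.idv.uni-linz.ac.at/mfx/pysol.html</link>
      <description>new version of Python Solitaire Games</description>
    </item>
  </channel>
</rss>"""

FIELDS = ("title", "link", "description")


def child_text(parent, tag):
    """Return the text of the first direct child element named tag, or ''."""
    for node in parent.childNodes:
        if node.nodeType == node.ELEMENT_NODE and node.tagName == tag:
            return "".join(child.data for child in node.childNodes
                           if child.nodeType == child.TEXT_NODE)
    return ""


def parse_channel(rss_text):
    """Build a plain dictionary describing the channel and its items."""
    doc = parseString(rss_text)
    # Locate the channel element and read its own attributes.
    channel = doc.getElementsByTagName("channel")[0]
    info = {name: child_text(channel, name) for name in FIELDS}
    # Examine every item element in turn.
    info["items"] = [{name: child_text(item, name) for name in FIELDS}
                     for item in doc.getElementsByTagName("item")]
    return info


parsed = parse_channel(SAMPLE)
```

The helper looks only at direct children, so the channel's title is never confused with the title of one of its items.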
We then do some housekeeping to keep track of when we last fetched the RSS file. This allows us to fetch the remote RSS file only when we need to. Finally, the method returns a Zope management screen if necessary.
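The timestamp makes a simple freshness test possible. The sketch below shows one such test in present-day Python; the needs_update function and the one-hour default are assumptions for illustration, not something the product is known to use.

```python
from datetime import datetime, timedelta


def needs_update(last_fetched, now, max_age=timedelta(hours=1)):
    """Return True if the channel was never fetched or its copy is too old."""
    if last_fetched is None:
        return True
    return now - last_fetched >= max_age


noon = datetime(2000, 2, 23, 12, 0)
fresh = needs_update(noon, noon + timedelta(minutes=10))
stale = needs_update(noon, noon + timedelta(hours=2))
never_fetched = needs_update(None, noon)
```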
Displaying HTML and XML
To a channel object, the tasks of displaying itself in HTML and XML are very similar. Both views of the channel are created by methods on the channel. These display methods use Zope's template reporting language, DTML.
Part of the Zope philosophy is that objects should be able to represent themselves in multiple ways. By creating new templates, you can extend the formats in which your channels can be displayed.
You can examine the template files channelHTML.dtml and channelRSS.dtml in the lib/python/Products/RSSChannel directory to see how they work.
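The idea of one object with several template-driven views can be sketched without DTML. The example below uses Python's string.Template as a stand-in; the two templates and the render function are hypothetical and far simpler than the real channelHTML.dtml and channelRSS.dtml files.

```python
from string import Template
from xml.sax.saxutils import escape

# Two views of the same item data; only the template differs.
HTML_VIEW = Template('<li><a href="$link">$title</a></li>')
RSS_VIEW = Template("<item><title>$title</title><link>$link</link></item>")


def render(view, title, link):
    """Fill a template with escaped values so the markup stays well-formed."""
    quote = {'"': "&quot;"}  # also escape quotes, since link lands in an attribute
    return view.substitute(title=escape(title), link=escape(link, quote))


html_out = render(HTML_VIEW, "Q & A", "http://www.python.org")
rss_out = render(RSS_VIEW, "Q & A", "http://www.python.org")
```

Adding a third format means adding a third template; the data and the render step stay the same.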
Using Channels in Zope
Now that we have Zope channel objects, what can we do with them besides simply look at them in HTML and RSS format? You can index and search them, manage them over the Web, protect them with a comprehensive security system, integrate them with relational databases, and integrate them with network services such as FTP, WebDAV, XML-RPC, and more.
To see some examples, import the sample file that comes with the RSSChannel.tgz distribution. Click on Import/Export in the Zope management screen, type RSSSample.zexp in the Import file name field, and click Import.
Now you should have an RSSSample folder. Take a look inside it. You'll find a couple of Channel objects, a ZCatalog that indexes the channels, and some DTML Methods that exercise the channels. To try things out, click on the View tab. Experiment with some of the options and look at the DTML methods to see how they perform their actions.
The RSSSample folder includes examples that demonstrate how to use Zope's ZCatalog to search channels. The indexChannels DTML method takes care of registering the channels and their items with the ZCatalog. The searchResults DTML method performs the ZCatalog search. The searchAsRSS DTML method shows how to represent a ZCatalog search as an RSS channel.
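To give a feel for the last of these, here is a sketch of search results presented as a channel. It does not use ZCatalog: a plain substring match stands in for the catalog query, and the search_items and results_as_rss functions are hypothetical. The generated channel is also simplified and leaves out elements that a complete RSS 0.91 channel would carry.

```python
import xml.etree.ElementTree as ET

FIELDS = ("title", "link", "description")


def search_items(items, term):
    """Return items whose title or description contains term (any case)."""
    term = term.lower()
    return [item for item in items
            if term in item["title"].lower()
            or term in item["description"].lower()]


def results_as_rss(term, matches):
    """Wrap the matching items in a new channel and serialize it."""
    rss = ET.Element("rss", version="0.91")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Search results for " + term
    for match in matches:
        item = ET.SubElement(channel, "item")
        for name in FIELDS:
            ET.SubElement(item, name).text = match[name]
    return ET.tostring(rss, encoding="unicode")


items = [
    {"title": "PySol 3.20",
     "link": "http://wildsau.idv.uni-linz.ac.at/mfx/pysol.html",
     "description": "new version of Python Solitaire Games"},
    {"title": "Cryptography modules now available worldwide",
     "link": "ftp://starship.python.net/pub/crew/amk/crypto/",
     "description": "re-release of crypto modules"},
    {"title": "Stackless Python 1.0 + Continuations 0.6",
     "link": "http://www.tismer.com/research/stackless/",
     "description": "a version of Python 1.5.2"},
]
matches = search_items(items, "python")
rss_text = results_as_rss("python", matches)
```

Since the result is itself a channel, any RSS consumer could subscribe to a search just as it would to an ordinary site.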
Where To Go From Here
This article only begins to demonstrate the potential of Zope and XML.
In the case of our RSS channel example, you might want to build an application in which different users were given access to different channels depending on their credentials. You could create new channels that are composed of searches of many other channels. You could convert RSS channels to another format such as email, and mail users items at regular intervals given their preferences. You could use Cybercash to charge users or other sites for retrieving information via RSS. You could enable remote channel management over XML-RPC.
Conclusion
Real-world XML application development requires more than just the ability to retrieve and store XML. Zope provides a host of resources that can be useful for turning XML data into a web application. Zope gives you searching, security, persistence, over-the-web management, support for many network protocols, rapid application development, and more. Add to this the ability to read and write XML over the network, and you have a good environment for XML application development.