Transforming XML with PHP
June 18, 2003
This article compares two methods of transforming XML in PHP: PEAR's XML_Transformer package and the W3C XML transformation language XSLT. I will first describe the PEAR project and its philosophy, with a focus on its XML transformation techniques. I will then give a brief introduction to XSLT and the way to use it from PHP.
Introduction
PEAR's main goal is to become a repository for PHP extensions and libraries. Its members try to standardize the way developers write portable and re-usable code [MaiaAIP01].
PEAR offers a wide variety of packages ready for use by PHP developers. Most PEAR packages are subclasses of the standard base classes [MaiaADLP01]. One of these packages is XML_Transformer. This package was created to help you transform existing XML files with the help of PHP code.
XSLT stands for "Extensible Stylesheet Language Transformations" and is a W3C Recommendation. As most readers know, it is a powerful transformation language for converting XML into other XML, HTML, or plain text [Holman00].
While you need PEAR to use XML_Transformer, XSL transformations can be processed by PHP itself. PHP offers XSLT functionality through a built-in extension, making it easy to incorporate transformation features into existing code.
As you can see, both technologies can transform XML files. But which technology best fits the needs of a PHP developer? Let's take a closer look at each one to find out.
PEAR::XML_Transformer
XML_Transformer lets you map PHP functionality to specified XML tags. It offers several ways to map tags: you can map a specific tag, a complete XML namespace, or only a specific tag within a given namespace. These methods are described later.
The way XML_Transformer implements this functionality is easy to explain: it associates an opening tag with one PHP callback and a closing tag with another. This is very similar to the element handlers used with PHP's xml_parse() function.
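To make the callback model concrete, here is a minimal sketch that uses PHP's own expat functions rather than XML_Transformer. The handler names (onStart, onEnd) and the sample document are hypothetical and exist only for illustration; the point is simply that one function fires on each opening tag and another on each closing tag.

```php
<?php
// Sketch only: plain expat callbacks, not XML_Transformer itself.
// The handler names and the sample document are hypothetical.

$output = '';

// Called by the parser for every opening tag.
function onStart($parser, $name, $attributes)
{
    global $output;
    $output .= "<open:$name>";
}

// Called by the parser for every closing tag.
function onEnd($parser, $name)
{
    global $output;
    $output .= "<close:$name>";
}

$parser = xml_parser_create();

// Keep element names as written instead of folding them to upper case.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

// Register one callback for opening tags and one for closing tags.
xml_set_element_handler($parser, 'onStart', 'onEnd');

// Parse a complete document in a single call.
xml_parse($parser, '<doc><item/></doc>', true);

echo $output, "\n";
```

XML_Transformer builds on the same idea, but lets the callbacks decide what text replaces each element.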
How it works
You start the transformation engine by creating an XML_Transformer object. The constructor accepts an array of parameters that will change the behavior of the transformer. The most important ones are the case folding options and the recursive operation option.
Case folding lets you change the case of XML element names and attributes. You can set the target case to either upper or lower case. This can be accomplished by setting the caseFolding option to true and setting the caseFoldingTo option to either CASE_LOWER or CASE_UPPER.
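<?php
require_once 'XML/Transformer.php';

// Fold all element and attribute names to lower case.
$myTransformer = new XML_Transformer(
    array(
        'caseFolding'   => true,
        // CASE_LOWER is a PHP constant, so it is passed unquoted.
        'caseFoldingTo' => CASE_LOWER
    )
);
?>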
XML Namespaces
XML_Transformer's XML namespace support is based on qualified name prefixes rather than namespace URIs. This lack of URI support has to do with the underlying XML parser, expat. PHP support for XML parsing has been available since version 3.0.6, whereas support for expat's namespace features has only been available since version 4.0.5.
What XML_Transformer considers a namespace is simply a qualified name prefix: the prefix that is sometimes used when addressing namespaces. Instead of transforming documents written in the following way:
<?xml version="1.0"?>
<mydoc>
  <start xmlns:myns="http://my/namespace">
    <sometag />
  </start>
</mydoc>
you should feed XML_Transformer documents like this:
<?xml version="1.0"?>
<mydoc>
  <myns:start>
    <myns:sometag />
  </myns:start>
</mydoc>
The overloadNamespace() method overloads an XML namespace prefix and binds all its elements to a PHP object. The object must provide the startElement() and endElement() methods. If you specify the &MAIN or null namespace prefix, XML_Transformer maps XML elements that don't belong to any namespace.
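<?php
require_once 'XML/Transformer.php';

class My_Transformer_Object
{
    // Called for each opening tag in the overloaded namespace.
    function startElement($element, $attributes)
    {
        // Your code here.
    }

    // Called for each closing tag in the overloaded namespace.
    function endElement($element, $cdata)
    {
        // Your code here.
    }
}

$myTransformer       = new XML_Transformer();
$myTransformerObject = new My_Transformer_Object();

// The object is passed without a call-time '&',
// which PHP 5.4 and later reject.
$myTransformer->overloadNamespace(
    'myprefix',
    $myTransformerObject
);
?>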
XML_Transformer provides an easier way to map namespaces. It's called XML_Transformer_Namespace, and it lets you map an element's opening and closing tags to start_ELEMENTNAME($attributes) and end_ELEMENTNAME($cdata) methods, where ELEMENTNAME is replaced by the name of the XML element to be mapped.
<?php
require_once 'XML/Transformer.php';
require_once 'XML/Transformer/Namespace.php';

class My_Transformer_Namespace extends XML_Transformer_Namespace
{
    // This method is mapped to the 'myelement' opening tag.
    function start_myelement($attributes)
    {
    }

    // This method is mapped to the 'myelement' closing tag.
    function end_myelement($cdata)
    {
    }
}

$myTransformer          = new XML_Transformer();
$myTransformerNamespace = new My_Transformer_Namespace();

// The object is passed without a call-time '&',
// which PHP 5.4 and later reject.
$myTransformer->overloadNamespace(
    'myprefix',
    $myTransformerNamespace
);
?>
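The naming convention above amounts to dispatching on a method name built from the element name. The following is a rough sketch of that idea in plain PHP, not the library's actual implementation. Sketch_Dispatcher, My_Handlers, and dispatch() are hypothetical names, and the sketch assumes that each handler returns the text that should replace the tag.

```php
<?php
// Sketch only: illustrates name-based dispatch in plain PHP.
// Sketch_Dispatcher, My_Handlers and dispatch() are hypothetical names
// and are not part of XML_Transformer.

class Sketch_Dispatcher
{
    // Calls the method named "<prefix>_<element>" on $handler, if it exists.
    function dispatch($handler, $prefix, $element, $argument)
    {
        $method = $prefix . '_' . $element;

        if (method_exists($handler, $method)) {
            return $handler->$method($argument);
        }

        // No handler is defined for this element.
        return '';
    }
}

class My_Handlers
{
    // Handles the opening tag of 'myelement'.
    function start_myelement($attributes)
    {
        return '<b>';
    }

    // Handles the closing tag of 'myelement'.
    function end_myelement($cdata)
    {
        return $cdata . '</b>';
    }
}

$dispatcher = new Sketch_Dispatcher();
$handlers   = new My_Handlers();

$result  = $dispatcher->dispatch($handlers, 'start', 'myelement', array());
$result .= $dispatcher->dispatch($handlers, 'end', 'myelement', 'hello');
$missing = $dispatcher->dispatch($handlers, 'start', 'unknown', array());

echo $result, "\n";
```

With this kind of dispatch, supporting a new element only requires adding a pair of methods to the handler class.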
These examples demonstrate the power and versatility of the XML_Transformer package. You can manipulate XML files very easily using only PHP code. Of course, you'll need an intermediate knowledge of PEAR if you want to develop anything serious.
XSLT
XSLT is a stylesheet language that transforms XML documents by using a "transformation specification". This specification is a set of rules that match elements. These rules describe the output of each element, based on its contents [Ray01].
The major difference between XSLT and other transformation engines is that XSLT crawls through the XML tree applying rules recursively. This method increases the control you have over the transformation process, as there's no need to track context.
PEAR also features XML_XSLT_Wrapper, the goal of which is to provide an interface to XSL transformations. It looks very promising, but it's still in alpha state, so I'll stick to PHP's native support.
PHP now comes with a built-in XSLT extension. This extension is based on Gingerall's Sablotron engine and the expat XML parser. If you plan to use these features in your projects, you can check whether the extension is available by calling the phpinfo() function.
Using XSLT from within PHP
To start using XSLT directly from PHP, you will need an XSLT file and the XML document that you wish to transform.
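<?php
// Create the XSLT processor handle.
$xh = xslt_create();

// Transform the XML file with the stylesheet. With no result
// container given, the output is returned as a string.
$myResult = xslt_process(
    $xh,
    'myContent.xml',
    'myTransformation.xsl'
);

// Release the processor handle.
xslt_free($xh);
?>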
The xslt_process() function accepts three more optional parameters: the result container file name, the array of arguments to the XSLT processor, and the array of parameters to the stylesheet. The following example illustrates these parameters by assigning parameters to the stylesheet.
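<?php
$xh = xslt_create();

$args   = array();               // arguments to the XSLT processor
$params = array('foo' => 'bar'); // parameters to the stylesheet

$myResult = xslt_process(
    $xh,
    'myContent.xml',
    'myTransformation.xsl',
    null,    // no result container: return the output as a string
    $args,
    $params
);

xslt_free($xh);
?>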
XSLT is very easy to use from within PHP. All processing code is inside your XSLT file. You can also transform dynamic XML content without the need to read it from an external file. The PHP manual offers a more detailed explanation on the use of this and other features.
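As a rough sketch of transforming in-memory content, the example below passes both the XML and the stylesheet as strings. With the Sablotron-based extension this is done through named argument buffers addressed as 'arg:/name'. The transformStrings() helper is hypothetical, and the fallback branch is an assumption that goes beyond the extension described in this article: it uses the DOM and XSL extensions found in PHP 5 and later, where the xslt_* functions are not available.

```php
<?php
// Sketch only. transformStrings() is a hypothetical helper.
function transformStrings($xml, $xsl)
{
    if (function_exists('xslt_create')) {
        // Sablotron extension: named buffers are addressed as 'arg:/name'.
        $xh     = xslt_create();
        $args   = array('/_xml' => $xml, '/_xsl' => $xsl);
        $result = xslt_process($xh, 'arg:/_xml', 'arg:/_xsl', null, $args);
        xslt_free($xh);
        return $result;
    }

    // Assumption: PHP 5 or later with the dom and xsl extensions loaded.
    $xmlDoc = new DOMDocument();
    $xmlDoc->loadXML($xml);

    $xslDoc = new DOMDocument();
    $xslDoc->loadXML($xsl);

    $processor = new XSLTProcessor();
    $processor->importStylesheet($xslDoc);

    return $processor->transformToXml($xmlDoc);
}

// XML content that could just as well be generated dynamically.
$xml = '<greeting>Hello</greeting>';

// A one-rule stylesheet that emits plain text.
$xsl = <<<XSL
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="greeting">
    <xsl:value-of select="."/>
    <xsl:text>, world</xsl:text>
  </xsl:template>
</xsl:stylesheet>
XSL;

$result = transformStrings($xml, $xsl);
echo $result, "\n";
```

Note that all of the transformation logic still lives in the stylesheet; the PHP code only hands the two strings to the processor.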
XSLT's transformation capabilities rely on an external language. To maintain a large project's transformations, you'll need to keep numerous external files. The advantage is that these files can be edited by a non-programmer.
Conclusion
While PEAR::XML_Transformer gives you greater flexibility through the use of PHP, XSLT is easier for non-programmers to use. XML_Transformer's approach lets you associate an XML element's opening and closing tags with specific functions. XSLT's transformation is tightly coupled with the XML tree.
If you plan to build your own set of namespaces and associated PHP libraries, then I think XML_Transformer is the way to go. If you want to give other people the ability to create custom transformations, then I recommend XSLT.
Bibliography