Adobe's InDesign and XML
August 4, 2004
Document formatting and typesetting have come a long way in the relatively short time that modern computers have been around. The process has typically revolved around formatting scientific or technical documents using a variety of command-line tools. However, the latest version of Adobe's page-layout application, InDesign, integrates XML files into its visually oriented publishing workflow.
Old-School Document Generation
One of UNIX's first killer applications was document typesetting. While making documents is no longer the sole reason to invest money in a box that flips bits as fast as possible, the purpose behind it remains just as relevant today as it was 30 years ago when troff first hit the scene: to offload as much work as possible to the machine, thereby allowing the author(s) to concentrate on the content of the document, rather than its appearance.
troff typesetting is achieved by feeding a specially formatted document to a series of preprocessors, repeatedly piping the output of one preprocessor to the input of the next. This preprocessing continues until the content of the original document has been suitably massaged and can be parsed by the troff utility, which in turn produces a PostScript copy of the document.
New-School Document Generation In a Nutshell
A New Tool Is Born
Fast-forward 20 years from the birth of UNIX, when HTML is being sent across the Internet at an exploding rate. Realizing that they might be on to something, the W3C simplifies and standardizes HTML's ancestor, SGML, to become XML. Serving as a general-purpose data-exchange format, XML syntaxes are created for countless applications, including vector graphics (SVG), remote procedure invocation (XML-RPC), and document formatting (XSL-FO).
While there are many simple XML applications, XSL-FO definitely isn't one of them. This is simply due to the complex nature of controlling the layout of a multiple-page document: headers and footers, margins, and all sorts of font properties (family, size, weight, leading, kerning, alignment, and countless other details) must be taken into account to transform a marked-up document into one that can be professionally printed and bound.
To avoid interacting with this level of complexity, authors typically mark up their content with another syntax (such as DocBook), which is then transformed with XSLT into another syntax for presentation -- creating a pipeline of XML tags that are passed between processors (similar to troff's pipeline). And while XSL-FO is well-suited for documents such as theses, manuals, and other technical references -- documents that typically have a simple layout with a restricted set of fonts, colors, and external resources -- it is virtually impossible to format anything that doesn't fit within a rectangular box.
However, this "box" limitation isn't an oversight; the W3C never intended the XSL-FO specification to fill any other shoes. After all, anyone who is in need of a customized XML workflow to suit their needs is free to create it, which is exactly what Adobe has done with the latest release of its Creative Suite software.
In Steps InDesign
InDesign, a component of Adobe's Creative Suite, is a page-layout program aimed at Adobe's core audience: professional graphic and media designers. Whereas XML and troff pipelines are primarily used to create technical documents, InDesign is primarily used to create visual documents such as brochures, advertisements, and other media (there are exceptions, of course).
In this brief tutorial, we'll walk through a simple example of how InDesign CS allows a publishing workflow to be broken into two pieces: developing the document's appearance and then its content.
Styles
Applications that interact with documents (whether it be InDesign, an XSL-FO processor, web browser, or even Microsoft Word) use the concept of styles, which allow any number of components of a document to have their appearance governed by a style declaration. Typically a style declaration controls an object's appearance: color, spacing, and other properties that are dependent on the application's domain.
XSL-FO styles are created using XML attributes, many of which are analogous to CSS-style declarations (such as font-family and border-left-width). InDesign allows authors to define styles in a manner that is familiar to anyone who has worked with similar design programs; the following image shows how basic character properties of the "employeeInfo" style are defined:
Figure 1: A screenshot illustrating how paragraph styles are defined through InDesign's interface.
Adding XML to the Mix
Among other new features included in this version of the application, InDesign CS introduces an XML compatibility layer that allows it to import and export XML files that encapsulate the information contained within the document.
In the previous version of InDesign, the written material of our documents was tied up in the text blocks that were defined in the document itself (indeed, this is the case for most files that are manipulated with a proprietary application); similarly, the only way to manipulate other resources used within the document, such as images, was through the application's interface. This is no longer the case with InDesign CS.
InDesign's XML Savvy
In much the same way that CSS provides a way to abstract the appearance of an XHTML document, InDesign CS' XML-compatible layer provides a powerful way to separate the content and style of print documents that, in the past, have typically had their content and presentation tied up in a knot.
Essentially, this layer allows you to define the template of your document from within InDesign, and to develop the content of the document in any text editor. Once the template and content have been finalized, it is simply a matter of merging the two together, and exporting the final work to the format of your choice.
A Simple Scenario
To see how InDesign's XML layer can ease a publishing workflow, let's walk through a simple scenario where a small company needs business cards printed for each of its employees. While the cards will share some information (such as the company's name), each card will have unique information for the rest of the fields (such as the employee's name, email address, etc.).
Traditionally, this process would be completed in one of two ways, each with its own benefits and drawbacks:
- By creating one InDesign file for each employee and copying and pasting the common information between the documents while keeping the employee's unique information separate, or...
- By creating one InDesign file for the entire company, and merely changing the unique information for each employee by cutting and pasting before we export the document to a printable format.
However, the two methods outlined above also share the same problem: if, at any point after exporting the file, we make a change to the template, then the change must be reflected in every file that uses the template. InDesign alleviates this problem by allowing us to define a template and an XML file to provide the necessary information for a document. Thus, the XML file acts as a kind of simple data store, allowing the data to be created, manipulated, or used with other applications.
A Simple How-To
The first step in our workflow is to define the template we'll be using for our business cards. In order to do this, we'll have to:
- Set up the layout of the business card, thereby creating the static content that will be shared among all of the cards.
- Specify the text blocks (and optionally any images) that will vary from one card to the next (in our example this will consist of the employee's name, address, email, and phone number).
- Define the styles that will be applied to control their appearance.
This process can be seen in Figure 2; the employee's address, email, and phone number have been designated as variables using the InDesign Tags palette, and are mapped to the "employeeInfo" style (defined in Figure 1), while the "employeeName" variable is mapped to a style of the same name.
Figure 2: Defining how the elements of our XML file will appear using the styles declared in our InDesign document.
As you can also see in the figure above, the four variables for each of our business cards are represented as sibling elements under a common parent element named Root. Following the rule of least surprise, the XML document that we will import must follow the same structure. The following code is a listing of the XML documents that represent two of our company's employees:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Root>
  <!-- contents of dave.xml -->
  <employeeName>davidfmiller</employeeName>
  <employeeEmail>davidfmiller@gmail.com</employeeEmail>
  <employeeAddress>1234 main street calgary, ab, ca a1b 2c3</employeeAddress>
  <employeePhone>(403) 555-5555</employeePhone>
</Root>

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Root>
  <!-- contents of dylan.xml -->
  <employeeName>dylan mckay</employeeName>
  <employeeEmail>joey@fivevoltlogic.com</employeeEmail>
  <employeeAddress>1234 main street calgary, ab, ca a1b 2c3</employeeAddress>
  <employeePhone>(403) 555-4444</employeePhone>
</Root>
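Because each card's data lives in a plain XML file, files like the two above need not be typed by hand. The following is a minimal sketch, in Python, of how they might be generated from a set of records. The EMPLOYEES dictionary and the function names card_to_xml and write_cards are hypothetical and exist only for illustration; the tag names are taken from the listing above, and nothing here calls into InDesign itself.

```python
import os
import xml.etree.ElementTree as ET

# Hypothetical in-memory records; in practice these might come from a
# spreadsheet or a database. The tag names mirror the listing above.
EMPLOYEES = {
    "dave.xml": {
        "employeeName": "davidfmiller",
        "employeeEmail": "davidfmiller@gmail.com",
        "employeeAddress": "1234 main street calgary, ab, ca a1b 2c3",
        "employeePhone": "(403) 555-5555",
    },
    "dylan.xml": {
        "employeeName": "dylan mckay",
        "employeeEmail": "joey@fivevoltlogic.com",
        "employeeAddress": "1234 main street calgary, ab, ca a1b 2c3",
        "employeePhone": "(403) 555-4444",
    },
}


def card_to_xml(fields):
    """Serialize one employee's fields as a Root element with one child per field."""
    root = ET.Element("Root")
    for tag, text in fields.items():
        ET.SubElement(root, tag).text = text
    # The xml_declaration argument requires Python 3.8 or later.
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


def write_cards(employees, directory="."):
    """Write one XML file per employee and return the paths that were written."""
    paths = []
    for filename, fields in employees.items():
        path = os.path.join(directory, filename)
        with open(path, "wb") as handle:
            handle.write(card_to_xml(fields))
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Print each document instead of writing it, so running the sketch
    # has no side effects on disk.
    for filename, fields in EMPLOYEES.items():
        print(filename)
        print(card_to_xml(fields).decode("utf-8"))
```

Note that the generated declaration omits the standalone="yes" attribute shown in the listing; this sketch assumes that difference does not matter for import, which has not been verified here.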
Importing our XML file is simply a matter of bringing up a contextual menu and locating the appropriate file on disk.
Figure 3: Importing an XML file to populate the placeholders with an employee's information.
After importing each XML file, the placeholders of our InDesign template will be populated with the corresponding data from the imported file. Thus, exporting a print-ready business card for all of the company's employees is only a few mouse clicks away.
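Since the placeholders are matched to elements by tag name, it can be worth confirming that each file has the expected structure before importing it. The sketch below, again in Python, checks one document against the four tags used in this article. The names EXPECTED_TAGS and check_card are hypothetical, and the rules it enforces (exactly one non-empty element per tag under Root) are an assumption about what this particular template needs, not a description of InDesign's own import rules.

```python
import xml.etree.ElementTree as ET

# Tags that the template in this article expects as direct children of Root.
EXPECTED_TAGS = ["employeeName", "employeeEmail", "employeeAddress", "employeePhone"]


def check_card(xml_bytes):
    """Return a list of problems found in one employee document (empty if none)."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]

    problems = []
    if root.tag != "Root":
        problems.append(f"unexpected root element: <{root.tag}>")

    # Comments are discarded by the default parser, so only elements appear here.
    found = [child.tag for child in root]
    for tag in EXPECTED_TAGS:
        count = found.count(tag)
        if count != 1:
            problems.append(f"expected one <{tag}>, found {count}")
        elif not (root.find(tag).text or "").strip():
            problems.append(f"<{tag}> is empty")
    for tag in found:
        if tag not in EXPECTED_TAGS:
            problems.append(f"unknown element: <{tag}>")
    return problems


if __name__ == "__main__":
    sample = (
        b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        b"<Root><employeeName>dylan mckay</employeeName></Root>"
    )
    for problem in check_card(sample):
        print(problem)
```

An empty result means the file matches the structure shown in Figure 2; anything else is reported as a human-readable message rather than raised as an exception, so a batch of files can be checked in one pass.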
Figure 4: The final series of business cards.
Closing Tag
Adobe has applied the template concepts used in other XML technologies to InDesign's visual environment (and made them accessible to designers in the process), allowing InDesign to cooperate with applications on a level that was previously impossible.
This tutorial provides only a brief glimpse into InDesign's XML capabilities and is by no means a comprehensive resource; interested readers can find more information from Adobe.