The Atom Link Model
June 16, 2004
Atom is an emerging XML vocabulary and protocol for syndication and editing. Atom has a coherent linking model to express a number of different types of links. Atom borrows heavily from the <link> element in HTML, although they are not identical. This article explores several of the most common link types that are already deployed in Atom feeds today.
Every article needs a permanent home
A central concept of Atom is the alternate link, sometimes called the "permanent link" or "permalink". Every Atom feed, and every entry within every feed, must have an alternate link that points to the permanent location of that feed or entry. The term "alternate" is borrowed from the HTML <link> element, the specification of which states that an alternate link "designates a substitute version for the document in which the link occurs".
At the feed level, the alternate link points to the home page of the site that the feed is syndicating. At the entry level, the alternate link points to the "permalink" of that entry in some other format (most often HTML, although it can be any content type).
A feed alternate link points to the site's home page, and it looks like this:
    <feed version="0.3" xmlns="http://purl.org/atom/ns#">
      <title>XML.com</title>
      <tagline>XML from the inside out</tagline>
      <link rel="alternate" type="text/html" href="http://www.xml.com/"/>
      ...
An entry alternate link points to the entry's permalink, and it looks like this:
    <entry>
      <title>The Courtship of Atom</title>
      <summary>The Atom syndication specification may move to a new home at the W3C.</summary>
      <link rel="alternate" type="text/html" href="http://www.xml.com/pub/a/2004/05/19/deviant.html"/>
      ...
Every Atom feed and every Atom entry needs an alternate link. It is the only type of link that is required by the Atom feed specification.
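From a consumer's point of view, the rule above means a client can always ask a feed or an entry for its alternate link. The following is a minimal sketch, not a reference implementation: it uses Python's standard xml.etree.ElementTree, the helper name alternate_href is made up for illustration, and the sample feed simply combines the two excerpts shown above into one well-formed document.

```python
import xml.etree.ElementTree as ET

# Atom 0.3 namespace, written in ElementTree's "{uri}tag" form.
ATOM = "{http://purl.org/atom/ns#}"

# Sample data assembled from the two excerpts above (closed so it parses).
FEED_XML = """\
<feed version="0.3" xmlns="http://purl.org/atom/ns#">
  <title>XML.com</title>
  <link rel="alternate" type="text/html" href="http://www.xml.com/"/>
  <entry>
    <title>The Courtship of Atom</title>
    <link rel="alternate" type="text/html"
          href="http://www.xml.com/pub/a/2004/05/19/deviant.html"/>
  </entry>
</feed>
"""


def alternate_href(element):
    """Return the href of the first direct-child rel="alternate" link, or None."""
    for link in element.findall(ATOM + "link"):
        if link.get("rel") == "alternate":
            return link.get("href")
    return None


feed = ET.fromstring(FEED_XML)
feed_home = alternate_href(feed)  # feed level: the site's home page
entry_permalinks = [alternate_href(e) for e in feed.findall(ATOM + "entry")]

print(feed_home)
print(entry_permalinks)
```

Because findall only looks at direct children here, the feed-level lookup does not accidentally pick up an entry's permalink. A None result would flag a feed or entry that is missing its required alternate link.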
Linking to related articles
Many articles, including the ones that appear on XML.com, link to other, related articles for further reading. Some content management systems do this automatically through keyword matching or other metadata tagging; other systems allow you to specify related articles manually; others only allow you to include a further reading list as part of the main article text, marked up in HTML like everything else. Regardless of how it gets there, it's a common use case, and Atom feeds have a special link tag just for related articles: <link rel="related">.
To distinguish these related links, you can use the optional title attribute, as shown here:
    <entry>
      <title>WWW2004 Semantic Web Roundup</title>
      <summary>Reporting from the WWW 2004 conference, Paul Ford surveys the state of the art in client and server side semantic web technology.</summary>
      <link rel="alternate" type="text/html" href="http://www.xml.com/pub/a/2004/05/26/www2004.html"/>
      <link rel="related" type="text/html" href="http://www.w3.org/2004/03/w3c-track04.html" title="WWW2004 W3C Track schedule"/>
      <link rel="related" type="text/html" href="http://www.w3.org/2004/04/13-swdd/" title="The Developer's Day schedule"/>
      ...
Paul Ford had nine related links listed after that article. I'm only showing two here, but each entry in an Atom feed may have as many related links as you like.
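An aggregator that wants to render a "further reading" list only needs to collect the related links and their optional titles. Here is a rough sketch of that step, assuming Python's standard ElementTree; the helper name links_by_rel is hypothetical, and the entry is the excerpt above with a closing tag added so that it parses.

```python
import xml.etree.ElementTree as ET

# Atom 0.3 namespace, written in ElementTree's "{uri}tag" form.
ATOM = "{http://purl.org/atom/ns#}"

ENTRY_XML = """\
<entry xmlns="http://purl.org/atom/ns#">
  <title>WWW2004 Semantic Web Roundup</title>
  <link rel="alternate" type="text/html"
        href="http://www.xml.com/pub/a/2004/05/26/www2004.html"/>
  <link rel="related" type="text/html"
        href="http://www.w3.org/2004/03/w3c-track04.html"
        title="WWW2004 W3C Track schedule"/>
  <link rel="related" type="text/html"
        href="http://www.w3.org/2004/04/13-swdd/"
        title="The Developer's Day schedule"/>
</entry>
"""


def links_by_rel(element, rel):
    """Return (href, title) pairs for each direct-child link with this rel."""
    return [
        (link.get("href"), link.get("title"))
        for link in element.findall(ATOM + "link")
        if link.get("rel") == rel
    ]


entry = ET.fromstring(ENTRY_XML)
related = links_by_rel(entry, "related")

for href, title in related:
    # The title attribute is optional, so fall back to the bare URL.
    print(title or href, "->", href)
```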
Linkblogs
The other major use of <link rel="related"> is for linkblogs. Many popular weblogs have a section of main content and then a "linkblog" on the side that links to external articles. Other sites are nothing but a structured list of links; such sites may be autogenerated by IRC bots (example: wearehugh.com), or they may provide their own API just for posting links (example: del.icio.us).
The same concept that we used to provide a list of related links after an article can also be used to express a link-centric feed in Atom. First, don't forget that every entry needs a permalink (<link rel="alternate">) that points to the permanent home of this entry. You should not use <link rel="alternate"> to point to the external article; it should point to your own archives. Use <link rel="related"> to point to the external article, as shown in this real example from my own linkblog:
    <entry>
      <title>Setting up an iTunes server in FreeBSD</title>
      <link rel="alternate" type="text/html" href="http://diveintomark.org/archives/blinks/2004/05/#b20040527034848"/>
      <link rel="related" type="text/html" href="http://home.introweb.nl/~dodger/itunesserver.html"/>
      ...
Of course you can have more than one related link per entry, so you can do a kind of "link cluster", where multiple related links are grouped together in the same entry. This entry linked to an article about setting up an iTunes server. iTunes uses the Rendezvous protocol for discovery and presence, so let's add a related link that gives some background information about Rendezvous:
    <entry>
      <title>Setting up an iTunes server in FreeBSD</title>
      <link rel="alternate" type="text/html" href="http://diveintomark.org/archives/blinks/2004/05/#b20040527034848"/>
      <link rel="related" type="text/html" href="http://home.introweb.nl/~dodger/itunesserver.html"/>
      <link rel="related" type="text/html" href="http://developer.apple.com/macosx/rendezvous/faq.html" title="Apple Rendezvous FAQ"/>
      ...
The concept of "related" is entirely up to the publisher; it has no more or less semantic connotation than the word "related" has in English. Things can be loosely related, and that's okay.
As with related links after an article, each entry in an Atom linkblog can have as many related links as you like.
Giving credit to your sources
A few months ago, it was scientifically proven that bloggers kill kittens. No, actually it was shown that bloggers tend to republish links they read on other sites, without mentioning where they saw them. In response to this, more and more publishers are explicitly publishing "via" links in their linkblogs. A "via" link is simply a link back to the site where you found the article you're linking to.
Atom has a link tag for this scenario: <link rel="via">. In the previous example, I discovered the article about setting up an iTunes server on Jeffrey Veen's site, so I should give him some credit with a "via" link, as shown here:
    <entry>
      <title>Setting up an iTunes server in FreeBSD</title>
      <link rel="alternate" type="text/html" href="http://diveintomark.org/archives/blinks/2004/05/#b20040527034848"/>
      <link rel="related" type="text/html" href="http://home.introweb.nl/~dodger/itunesserver.html"/>
      <link rel="related" type="text/html" href="http://developer.apple.com/macosx/rendezvous/faq.html" title="Apple Rendezvous FAQ"/>
      <link rel="via" type="text/html" href="http://www.veen.com/jeff/archives/000545.html" title="Jeffrey Veen"/>
      ...
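To see how a client might tell these roles apart, the sketch below groups an entry's links by their rel value, so the permalink, the external articles, and the credit can each be handled separately. It is one possible approach rather than a prescribed one; group_links is a hypothetical helper name, and the entry is the example above with a closing tag added.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

# Atom 0.3 namespace, written in ElementTree's "{uri}tag" form.
ATOM = "{http://purl.org/atom/ns#}"

ENTRY_XML = """\
<entry xmlns="http://purl.org/atom/ns#">
  <title>Setting up an iTunes server in FreeBSD</title>
  <link rel="alternate" type="text/html"
        href="http://diveintomark.org/archives/blinks/2004/05/#b20040527034848"/>
  <link rel="related" type="text/html"
        href="http://home.introweb.nl/~dodger/itunesserver.html"/>
  <link rel="related" type="text/html"
        href="http://developer.apple.com/macosx/rendezvous/faq.html"
        title="Apple Rendezvous FAQ"/>
  <link rel="via" type="text/html"
        href="http://www.veen.com/jeff/archives/000545.html"
        title="Jeffrey Veen"/>
</entry>
"""


def group_links(entry):
    """Map each rel value to the list of link elements that carry it."""
    groups = defaultdict(list)
    for link in entry.findall(ATOM + "link"):
        groups[link.get("rel")].append(link)
    return groups


groups = group_links(ET.fromstring(ENTRY_XML))

permalink = groups["alternate"][0].get("href")               # your own archives
external = [link.get("href") for link in groups["related"]]  # the link cluster
via_credits = [link.get("title") or link.get("href") for link in groups["via"]]

print(permalink)
print(external)
print(via_credits)
```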
Comment feeds
Many weblogs, community sites, and general purpose sites (including XML.com) allow visitors to post comments on individual articles. This intersects Atom in two related ways. In Atom, a comment is represented like any other entry, and many publishers now generate comment feeds for individual articles. To make these per-article comment feeds easier to find and subscribe to, Atom has a link tag to point to an entry's associated comment feed: <link rel="comments">.
Here is a real example from Sam Ruby's weblog. The entry in question is Aggregator UTF-16 tests, and the associated feed that contains all the comments on the entry is at Aggregator-utf-16-tests.atom. This is what the article looks like in his site's Atom feed:
    <entry>
      <title>Aggregator UTF-16 tests</title>
      <link rel="alternate" type="text/html" href="http://www.intertwingly.net/blog/2004/06/03/Aggregator-utf-16-tests"/>
      <link rel="comments" type="application/atom+xml" href="http://www.intertwingly.net/blog/2004/06/03/Aggregator-utf-16-tests.atom"/>
      ...
The other way that publishers use Atom comment feeds is to publish a site-wide feed that contains all comments across all entries. I do this on my own site (site, comments feed). I use the same link construct (<link rel="comments">) at the feed level instead of the entry level, to signify that this associated comments feed is a comments feed for the entire site.
This is the relevant excerpt from my Atom feed:
    <feed version="0.3" xmlns="http://purl.org/atom/ns#">
      <title>dive into mark</title>
      <link rel="alternate" type="text/html" href="http://diveintomark.org/"/>
      <link rel="comments" type="application/atom+xml" href="http://diveintomark.org/xml/comments.xml/"/>
      ...
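Since the same rel="comments" link can appear at either level, a client has to note where it found the link to know what the comments feed covers. The sketch below illustrates this with a hypothetical feed; the example.org URLs and the comments_feed helper are placeholders invented for the example, not part of any real site or API.

```python
import xml.etree.ElementTree as ET

# Atom 0.3 namespace, written in ElementTree's "{uri}tag" form.
ATOM = "{http://purl.org/atom/ns#}"

# Hypothetical feed: every example.org URL below is a placeholder.
FEED_XML = """\
<feed version="0.3" xmlns="http://purl.org/atom/ns#">
  <title>Example weblog</title>
  <link rel="alternate" type="text/html" href="http://example.org/"/>
  <link rel="comments" type="application/atom+xml"
        href="http://example.org/comments.atom"/>
  <entry>
    <title>First post</title>
    <link rel="alternate" type="text/html" href="http://example.org/2004/06/first"/>
    <link rel="comments" type="application/atom+xml"
          href="http://example.org/2004/06/first.atom"/>
  </entry>
  <entry>
    <title>Second post</title>
    <link rel="alternate" type="text/html" href="http://example.org/2004/06/second"/>
  </entry>
</feed>
"""


def comments_feed(element):
    """Return the href of a direct-child rel="comments" link, or None."""
    for link in element.findall(ATOM + "link"):
        if link.get("rel") == "comments":
            return link.get("href")
    return None


feed = ET.fromstring(FEED_XML)

# Feed level: comments across the whole site.
site_comments = comments_feed(feed)

# Entry level: comments on one article; None if the entry advertises no feed.
per_entry = {
    entry.findtext(ATOM + "title"): comments_feed(entry)
    for entry in feed.findall(ATOM + "entry")
}

print(site_comments)
print(per_entry)
```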
Catching up on missed news
Syndicated feeds are generally associated with keeping up with the most current news, but what if you're out of town for a few weeks? How do you catch up? Articles that have fallen off the end of the "most recent articles" feed are just gone, unless you manually visit each site and piece together the list of articles you missed.
A powerful but relatively unexplored use of Atom is the idea of publishing all of your past content as a series of Atom feeds. For example, on my site I have monthly archives that contain summaries and links to all my past articles. But I also have monthly archives of all my articles in a series of Atom feeds: http://diveintomark.org/xml/2004/03/index.atom is the Atom archive for all of the articles I published in March 2004.
Why is this useful? Because it allows Atom-enabled aggregators to browse my archives programmatically, without screen-scraping or guesswork. And the linchpin that holds it all together is a pair of link tags, again borrowed from HTML: rel="prev" and rel="next".
March's Atom archives at http://diveintomark.org/xml/2004/03/index.atom contain links to February and April, like this:
    <feed version="0.3" xmlns="http://purl.org/atom/ns#">
      <title>March 2004 [dive into mark]</title>
      <link rel="alternate" type="text/html" href="http://diveintomark.org/2004/03/"/>
      <link rel="prev" type="application/atom+xml" href="http://diveintomark.org/xml/2004/02/index.atom" title="February 2004 archives"/>
      <link rel="next" type="application/atom+xml" href="http://diveintomark.org/xml/2004/04/index.atom" title="April 2004 archives"/>
      ...
And the main Atom feed for the site links to the latest Atom archive, like this:
    <feed version="0.3" xmlns="http://purl.org/atom/ns#">
      <title>dive into mark</title>
      <link rel="alternate" type="text/html" href="http://diveintomark.org/"/>
      <link rel="prev" type="application/atom+xml" href="http://diveintomark.org/xml/2004/05/index.atom" title="May 2004 archives"/>
      ...
Atom-enabled aggregators that are subscribed to the main Atom feed can render a link to the May 2004 archives and allow you to browse the archives to catch up on articles you missed while you were away from your aggregator.
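The "catch up" behaviour amounts to following one archive link after another until there are none left. The sketch below shows that loop under several assumptions: the feeds live in an in-memory dictionary instead of being fetched over HTTP, the example.org URLs are placeholders, older archives are assumed to be linked with rel="prev" (the rel to follow is a parameter), and walk_archives is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Atom 0.3 namespace, written in ElementTree's "{uri}tag" form.
ATOM = "{http://purl.org/atom/ns#}"


def make_feed(title, older_url=None):
    """Build a tiny Atom 0.3 feed, optionally linking to an older archive."""
    link = ""
    if older_url:
        link = '<link rel="prev" type="application/atom+xml" href="%s"/>' % older_url
    return (
        '<feed version="0.3" xmlns="http://purl.org/atom/ns#">'
        "<title>%s</title>%s</feed>" % (title, link)
    )


# Hypothetical archive chain standing in for real HTTP responses.
ARCHIVES = {
    "http://example.org/index.atom":
        make_feed("Main feed", "http://example.org/2004/05/index.atom"),
    "http://example.org/2004/05/index.atom":
        make_feed("May 2004", "http://example.org/2004/04/index.atom"),
    "http://example.org/2004/04/index.atom":
        make_feed("April 2004"),
}


def walk_archives(start_url, fetch, rel="prev"):
    """Follow `rel` links from feed to feed; return the feed titles in visit order."""
    titles = []
    seen = set()  # guards against archives that link in a loop
    url = start_url
    while url and url not in seen:
        seen.add(url)
        feed = ET.fromstring(fetch(url))
        titles.append(feed.findtext(ATOM + "title"))
        url = next(
            (l.get("href") for l in feed.findall(ATOM + "link") if l.get("rel") == rel),
            None,
        )
    return titles


visited = walk_archives("http://example.org/index.atom", lambda u: ARCHIVES[u])
print(visited)
```

Passing the fetch function in as an argument keeps the traversal logic separate from the network layer; a real aggregator would substitute its own HTTP client and would probably stop once it reaches the last archive it has already seen.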
Integration with the Atom API
Finally, a compelling feature of the Atom feed format is its integration with the Atom API, which is explained and implemented in Joe Gregorio's recent XML.com article. An Atom feed can contain link tags that tell Atom-enabled clients how to post a new article to the site, edit an existing article, and post comments to an article.
These links are deployed today on several popular weblogging systems, including Typepad and Blogger, and they will also be part of the standalone software Movable Type 3.0.
At the feed level, you can include a <link rel="service.post" type="application/atom+xml">, which points to the Atom API endpoint for posting new articles to the site. Only the rel, type, and href attributes are required; the title attribute is optional. This is a real example from the Atom feed of Mena Trott, co-developer of Typepad:
    <feed version="0.3" xmlns="http://purl.org/atom/ns#">
      <title>Not a Dollarshort</title>
      <link rel="alternate" type="text/html" href="http://mena.typepad.com/"/>
      <link rel="service.post" type="application/atom+xml" href="http://www.typepad.com/t/atom/weblog/blog_id=9" title="Not A Dollarshort"/>
      ...
At the entry level, you can use a similar link construct to include an "edit this post" link. Use <link rel="service.edit" type="application/atom+xml">, then point to the Atom API EditURI for this post. Here is another real example from Mena's Atom feed:
    <entry>
      <title>Road to Hana</title>
      <link rel="alternate" type="text/html" href="http://mena.typepad.com/dollarshort/2004/04/road_to_hana.html"/>
      <link rel="service.edit" type="application/atom+xml" href="http://www.typepad.com/t/atom/weblog/blog_id=9/entry_id=1263283" title="Road to Hana"/>
      ...
Of course, most systems will probably require authentication before adding articles or editing other articles; see my previous article Atom Authentication for details on the authentication scheme supported today by Blogger, Typepad, and Movable Type 3.0. But not all systems require authentication; an Atom-powered wiki might include "edit this page" links that require no authentication at all.
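Before a client can post or edit anything, it has to discover these endpoints from the feed. The following sketch does only that discovery step and makes no network requests. It assumes Python's standard ElementTree, the find_link helper is a made-up name, and the sample document nests the two excerpts from Mena's feed shown above into one well-formed feed.

```python
import xml.etree.ElementTree as ET

# Atom 0.3 namespace, written in ElementTree's "{uri}tag" form.
ATOM = "{http://purl.org/atom/ns#}"

FEED_XML = """\
<feed version="0.3" xmlns="http://purl.org/atom/ns#">
  <title>Not a Dollarshort</title>
  <link rel="alternate" type="text/html" href="http://mena.typepad.com/"/>
  <link rel="service.post" type="application/atom+xml"
        href="http://www.typepad.com/t/atom/weblog/blog_id=9"
        title="Not A Dollarshort"/>
  <entry>
    <title>Road to Hana</title>
    <link rel="alternate" type="text/html"
          href="http://mena.typepad.com/dollarshort/2004/04/road_to_hana.html"/>
    <link rel="service.edit" type="application/atom+xml"
          href="http://www.typepad.com/t/atom/weblog/blog_id=9/entry_id=1263283"
          title="Road to Hana"/>
  </entry>
</feed>
"""


def find_link(element, rel, media_type="application/atom+xml"):
    """Return the href of the first direct-child link matching rel and type."""
    for link in element.findall(ATOM + "link"):
        if link.get("rel") == rel and link.get("type") == media_type:
            return link.get("href")
    return None


feed = ET.fromstring(FEED_XML)

# Feed level: where to post a new article.
post_uri = find_link(feed, "service.post")

# Entry level: where to edit each existing article.
edit_uris = {
    entry.findtext(ATOM + "title"): find_link(entry, "service.edit")
    for entry in feed.findall(ATOM + "entry")
}

print(post_uri)
print(edit_uris)
```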
One final use of the Atom API is for posting comments. If an Atom-enabled publishing system supports comments and an article has comments enabled, you can add a link to allow your readers to post comments via Atom-enabled aggregators. The link tag is <link rel="service.post">, the same construct we used on the feed level to advertise the ability to post a new article to the site, but here we will use it at the entry level to advertise the ability to post a new comment on the entry.
This is a real example taken from Sam Ruby's site, from the same feed we saw earlier when exploring comment feeds. Sam not only provides site-wide and per-entry comment feeds, he also allows Atom-enabled clients to post comments to entries via the Atom API. Here is the relevant excerpt from his site's main Atom feed:
    <entry>
      <title>Aggregator utf-16 tests</title>
      <link rel="alternate" type="text/html" href="http://www.intertwingly.net/blog/2004/06/03/Aggregator-utf-16-tests"/>
      <link rel="service.post" type="application/atom+xml" href="http://www.intertwingly.net/blog/1801.atomapi" title="Add your comment"/>
      ...
As with posting new articles, the server may require authentication when you try to post a comment. The decision is entirely up to the publisher, which makes its requirements known in HTTP headers when you request the Atom API endpoint (as explained in Atom Authentication). Sam's site does not require authentication for comments.
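Because rel="service.post" means different things at different levels, a client has to interpret each link according to where it appears. The sketch below illustrates that distinction using a hypothetical feed: the example.org URLs are placeholders, post_targets is an invented helper name, and no requests are actually sent.

```python
import xml.etree.ElementTree as ET

# Atom 0.3 namespace, written in ElementTree's "{uri}tag" form.
ATOM = "{http://purl.org/atom/ns#}"

# Hypothetical feed: every example.org URL below is a placeholder.
FEED_XML = """\
<feed version="0.3" xmlns="http://purl.org/atom/ns#">
  <title>Example weblog</title>
  <link rel="alternate" type="text/html" href="http://example.org/"/>
  <link rel="service.post" type="application/atom+xml"
        href="http://example.org/atomapi/post"/>
  <entry>
    <title>First post</title>
    <link rel="alternate" type="text/html" href="http://example.org/2004/06/first"/>
    <link rel="service.post" type="application/atom+xml"
          href="http://example.org/atomapi/comments/1" title="Add your comment"/>
  </entry>
</feed>
"""


def post_targets(feed):
    """List (meaning, href) pairs for every service.post link, by level."""
    targets = []
    # Feed level: posting here creates a new article on the site.
    for link in feed.findall(ATOM + "link"):
        if link.get("rel") == "service.post":
            targets.append(("new article", link.get("href")))
    # Entry level: posting here adds a comment to that entry.
    for entry in feed.findall(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="(untitled)")
        for link in entry.findall(ATOM + "link"):
            if link.get("rel") == "service.post":
                targets.append(("comment on " + title, link.get("href")))
    return targets


targets = post_targets(ET.fromstring(FEED_XML))
for meaning, href in targets:
    print(meaning, "->", href)
```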
Further reading
These are the links that, in an Atom feed, would be expressed as <link rel="related">.