I like to solve application problems using the Web (statelessness, hypermedia, self-descriptive representations, and uniform interfaces) to produce an asymptotically tight bound solution!

Tuesday, May 12, 2009

CMIS VIII: Futile to model Atom using XSD

The CMIS AtomPub bindings include a W3C XML Schema (XSD) for Atom in a file called atom.xsd. I want to warn potential users of that schema about the perils of validating against it.

First of all, Atom's syntax has been specified using RELAX NG (RNG), and there are programmatic Atom validators that check whether an Atom document meets the requirements of RFC 4287. There is a very good reason not to use XSD for modeling Atom: its open extension model.

If you have done any application programming with Atom formats, you know that Atom allows foreign markup to be used (pretty much) anywhere. That means you can arbitrarily interleave elements from Atom's namespace with those from a foreign namespace. XSD does not have any mechanism to allow this. For example, in the document below, the foreign markup (the f:foreign element) could occur anywhere except after the entry close tag:

<feed xmlns="http://www.w3.org/2005/Atom" xmlns:f="http://example.com">
<title>Example Feed</title>
<link href="http://example.org/"/>
<updated>2003-12-13T18:30:02Z</updated>
<author>
<name>John Doe</name>
</author>
<f:foreign>something</f:foreign>
<id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
<entry>
<title>Atom-Powered Robots Run Amok</title>
<link href="http://example.org/2003/12/13/atom03"/>
<id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
<updated>2003-12-13T18:30:02Z</updated>
<summary>Some text.</summary>
</entry>
</feed>
The only restriction on foreign markup is that it cannot occur directly under the feed element anywhere after the first entry element. I described a CMIS problem with this restriction in a previous blog post, which explains why hasMoreItems results in invalid Atom documents.
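
As a rough illustration of how a consumer can tolerate foreign markup instead of rejecting it, here is a minimal Python sketch using only the standard library. The helper names split_children and foreign_after_first_entry are made up for this example, and the sample feed is a trimmed copy of the one above. It is not a full Atom validator; it only separates Atom children from foreign ones and flags foreign markup that appears under feed after the first entry.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ATOM_PREFIX = "{" + ATOM_NS + "}"

# Trimmed version of the example feed, with foreign markup between Atom elements.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom" xmlns:f="http://example.com">
  <title>Example Feed</title>
  <f:foreign>something</f:foreign>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
  <entry>
    <title>Atom-Powered Robots Run Amok</title>
  </entry>
</feed>"""


def split_children(feed):
    """Split the direct children of feed into (atom, foreign) element lists."""
    atom, foreign = [], []
    for child in feed:
        if child.tag.startswith(ATOM_PREFIX):
            atom.append(child)
        else:
            # Anything outside the Atom namespace is kept aside, not rejected.
            foreign.append(child)
    return atom, foreign


def foreign_after_first_entry(feed):
    """Return tags of foreign children that follow the first atom:entry."""
    seen_entry = False
    offenders = []
    for child in feed:
        if child.tag == ATOM_PREFIX + "entry":
            seen_entry = True
        elif seen_entry and not child.tag.startswith(ATOM_PREFIX):
            offenders.append(child.tag)
    return offenders


if __name__ == "__main__":
    feed = ET.fromstring(FEED)
    atom, foreign = split_children(feed)
    print("Atom children:", [e.tag for e in atom])
    print("Foreign children:", [e.tag for e in foreign])
    print("Foreign markup after first entry:", foreign_after_first_entry(feed))
```

The point of the sketch is the design choice: foreign elements are set aside and the Atom content is still processed, which is the opposite of what happens when a server treats everything not matched by atom.xsd as invalid.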

2 comments:

Albert said...

The XSDs are there to help with tooling, such as JAXB or JAX-WS.

In this case, the extension was added inappropriately as it should be before the atom:entry as noted.

This should be fixed.

Nikunj Mehta said...

@Albert -

What will end up happening as a result of the approach taken by CMIS to date is that naive server vendors will assume that content not matched by atom.xsd is invalid. This will reduce interoperability.

Warnings about the lack of dependability of the XSD should be placed in prominent places, just like the tobacco warnings are. You may enjoy it now, but pay later.

The short-term benefits of accessing an Atom object model from XML documents are not worth the long-term costs in terms of quality issues, interoperability losses, and general user frustration. Smart vendors should refrain from using the atom.xsd and instead use a RELAX NG (RNG) grammar-based tool to guide them in their Atom parsing and validation. RNG is used as an informative syntax in the Atom and AtomPub specs.

Creative Commons License
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License. All materials on this blog are either the original work of its owner or used with acknowledgement of the copyright owner. 

About Me

I have been an avid student of the evolution of the Web and its application to business problems.
