I like to solve application problems using the Web (statelessness, hypermedia, self-descriptive representations, and uniform interfaces) to produce an asymptotically tight bound solution!

Monday, July 13, 2009

Embed XML document inside another XML document

Ever considered the implications of embedding an XML document inside another XML document? Anyone designing a new format, and especially an extension format, has to consider the consequences of embedding an XML extension payload inside an XML envelope.

In plain terms, what are the consequences of doing the following:

<feed xmlns="http://www.w3.org/2005/Atom">
<id>...</id>
...
<entry>
<id>...</id>
...
</entry>
...
</feed>

We do this all the time with Atom: an Atom entry inside an Atom feed is an example of one document embedded inside another. A number of issues lurk under the surface, and they need to be considered before copying the entry's bits into the feed. More generally, there are many issues to consider when embedding one XML document inside another.

These issues don't surface if you are managing all the bits yourself, i.e., you store the original bits of everything and normalize to a common set of encodings, namespaces, signatures, entities, and so on. However, if you are aggregating content from different publishers, then it's a different ball game. I started thinking about these issues as a result of Mark Nottingham's feedback about the Atom in-line I-D.
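To make the "normalize to a common set of encodings" point concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The entry payload, its ISO-8859-1 encoding, and the aggregate() helper are assumptions made up for illustration. They are not taken from the in-line I-D or from any real publisher.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM)  # serialize Atom elements without a prefix

# Hypothetical entry document, served by a publisher as ISO-8859-1 bytes.
ENTRY_BYTES = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>'
    '<entry xmlns="http://www.w3.org/2005/Atom">'
    "<id>urn:example:entry-1</id><title>Café</title></entry>"
).encode("iso-8859-1")


def aggregate(entry_docs):
    """Embed each entry document in one feed and emit it as UTF-8 bytes."""
    feed = ET.Element(f"{{{ATOM}}}feed")
    for raw in entry_docs:
        # Parsing the raw bytes lets the parser honor the encoding that
        # each publisher declared in its own XML declaration.
        feed.append(ET.fromstring(raw))
    # The aggregate feed gets exactly one encoding, whatever the sources used.
    return ET.tostring(feed, encoding="utf-8")


feed_bytes = aggregate([ENTRY_BYTES])
title = ET.fromstring(feed_bytes).find(f"{{{ATOM}}}entry/{{{ATOM}}}title")
print(title.text)
```

The important step is parsing the publisher's bytes rather than pasting them into the feed as text: pasted ISO-8859-1 bytes would be misread once the envelope declares a different encoding.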

Here are some of the issues I could think of (and how they manifest in the case of Atom):
  1. Duplication of content (e.g., <atom:title>)
  2. Inherited values for metadata (e.g., xml:lang, xml:base)
  3. Character set and encoding (for XML and binary content)
  4. Entity declarations
  5. Namespace declarations
  6. Digital signatures
  7. Protocol metadata (ETag, Content-Length, Content-Type)
Did I miss any? If so, please help me identify them. That will help make the in-line I-D stronger and easier to implement and use.
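Item 2 (inherited values) is easy to get wrong, so here is a rough sketch of one way to handle it, again in Python with xml.etree.ElementTree. The URLs, the sample entry, and the embed_entry() helper are hypothetical. The sketch also assumes that the aggregator knows the URL each entry was fetched from and, optionally, its language.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

ATOM = "http://www.w3.org/2005/Atom"
XML_NS = "http://www.w3.org/XML/1998/namespace"
XML_BASE = f"{{{XML_NS}}}base"
XML_LANG = f"{{{XML_NS}}}lang"
ET.register_namespace("", ATOM)  # serialize Atom elements without a prefix

# Hypothetical entry document with a relative link and no xml:base/xml:lang.
ENTRY_XML = """<entry xmlns="http://www.w3.org/2005/Atom">
  <id>urn:example:entry-1</id>
  <title>Hello</title>
  <link href="posts/1"/>
</entry>"""


def embed_entry(feed, entry_xml, source_url, source_lang=None):
    """Append an entry to feed without letting it inherit the feed's context."""
    entry = ET.fromstring(entry_xml)
    # Pin the base URI: resolve the entry's own xml:base (if any) against
    # the URL it came from, so relative links keep their original meaning.
    entry.set(XML_BASE, urljoin(source_url, entry.get(XML_BASE, "")))
    # Pin the language too; xml:lang="" marks it as unknown instead of
    # silently inheriting the aggregate feed's language.
    if XML_LANG not in entry.attrib:
        entry.set(XML_LANG, source_lang or "")
    feed.append(entry)
    return entry


feed = ET.Element(
    f"{{{ATOM}}}feed",
    {XML_BASE: "http://aggregator.example/", XML_LANG: "fr"},
)
entry = embed_entry(
    feed, ENTRY_XML, "http://publisher.example/blog/entry-1.atom", "en"
)
print(ET.tostring(feed, encoding="unicode"))
```

Without the two set() calls, the relative link posts/1 would resolve against the aggregator's xml:base and the title would be treated as French, neither of which the publisher intended.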


Creative Commons License
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License. All materials on this blog are either the original work of its owner or used with acknowledgement of the copyright owner. 
