Re: Registration of media type application/calendar+xml

2010-09-13 16:13:56
[ietf-types removed to spare those who just want to read type applications.]

We need to distinguish alternate syntactic forms from alternate semantic
environments.  Translating between versions of the former does not need to
lose information.  Translating between versions of the latter almost
certainly does.  Losing information is about differences in semantics.

I'm not convinced that there is a difference in practice between the two. 
Different syntactic forms tend to get used in different environments, and to
adapt to the requirements/conventions of those environments.

That has not been my experience. My experience has been that when it comes to
interoperability in both the short and long term, clear, accurate and concise
specifications of the formats and the mappings between them are key.

calendar+xml is intended to be merely a syntactic alternative.  I just don't
believe it is likely to stay that way.

And I believe it will if it is done properly, which in the absence of any
actual evidence is an equally valid statement. Again, the problem with this
entire discussion is that it's mostly happening at the 20,000 foot level,
leaving the details of the protocol - which really matter - behind.

As I understand the calendar+xml, it is "merely" a syntactic alternative. 
To the extent that it requires information loss when being re-encoded, yes
that should be fixed.  But it's not likely to be difficult and the
existence of two syntactic forms is not inherently problematic.  (We have
lots of examples on the net of doing this quite nicely, at different
layers of Internet architecture.)
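
To make "no information loss when being re-encoded" concrete, here is a minimal sketch of a round-trip check between two syntactic forms of the same data. Everything in it is an assumption made for illustration: the "NAME:value" line format, the <properties> wrapper element, and all function names are hypothetical, and none of this is the actual iCalendar or calendar+xml syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy format: one "NAME:value" property per line.
# This is NOT the real iCalendar or calendar+xml syntax.


def text_to_props(text):
    """Parse 'NAME:value' lines into an ordered list of (name, value) pairs."""
    props = []
    for line in text.splitlines():
        if not line:
            continue  # skip blank lines
        name, _, value = line.partition(":")  # split at the first colon only
        props.append((name, value))
    return props


def props_to_text(props):
    """Serialize (name, value) pairs back to the line-based text form."""
    return "".join(f"{name}:{value}\n" for name, value in props)


def props_to_xml(props):
    """Serialize (name, value) pairs as child elements of <properties>."""
    root = ET.Element("properties")
    for name, value in props:
        child = ET.SubElement(root, name)
        child.text = value
    return ET.tostring(root, encoding="unicode")


def xml_to_props(xml_text):
    """Parse the XML form back into (name, value) pairs."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text or "") for child in root]


def round_trips(text):
    """True if text -> XML -> pairs yields the same pairs as parsing the text."""
    props = text_to_props(text)
    return xml_to_props(props_to_xml(props)) == props


sample = "SUMMARY:Team meeting\nDTSTART:20100913T161356\n"
print(round_trips(sample))
```

A check of this kind only shows that one particular mapping is lossless on the inputs tried; it is the specification of the mapping, not the test, that has to rule out semantic drift.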

please cite specific examples.  it might be instructive to see why they work
or don't work well.

Fair enough, but there are vast numbers of examples of alternate syntactic forms
out there - so many that I'm sure you can find an example to support almost any
assertion you want to make.

So let's narrow things down to formally standardized data formats where
semantic differences were disallowed. I've already discussed Sieve as an
example in this context, but another, older one would be CGM (computer graphics
metafile).

This is actually an instructive case because there were three syntaxes for CGM,
not two: text, binary, and compressed text. (More recently, something called
WebCGM has been specified, but this happened long after my involvement ceased
and I know nothing about it.) Both the text and binary formats were quite
straightforward and clearly specified. Back in the day we had multiple
applications that happily generated either the text or binary format, and at
least one which could generate both. Applications able to process CGM generally
only supported one variant, but there were at least two readily available
software packages that could convert between text and binary. 

But what about the compressed text format? It turned out to be quite
problematic. Part of the problem was obvious: Three formats is pretty clearly
at least one too many. It's enough of a pain to support two formats; three was
seen by implementers as ... excessive. Even worse, the compressed text format
was much more complex, and IMO not terribly well specified. Usage of it was
comparatively rare, and the software that did try to handle it tended to have
gotten minimal testing and was quite buggy. As I recall, we had one application
that only supported compressed text output, and we had a devil of a time
finding something that could process it without crashing. Even worse, things
would work fine until something new was tried, and then more problems would
emerge.

So what are the lessons here? Well, pretty much the same old same old: Format
simplicity and specification clarity are the key to successful
interoperability, far more so than having a single format. In the case of CGM,
converting wasn't the problem; use of the compressed text format in any context
was. Things would almost certainly have been worse had compressed text been the
only format available.

the best one that immediately comes to mind is raw IP vs. PPP with header
compression.  I think it works because the latter representation is only used
on the wire between two endpoints of a network link.  And if it fails to
faithfully reproduce the packet, it's very clear where the problem is. 
(there's no argument about which representation is correct - the original
packet is always correct.)

I don't think the comparison is particularly useful because the domains are so
different, but even so, I think the fact that the mapping is fairly simple and
clearly specified goes a long way to making these protocols interoperate as
well as they do.

                                Ned