re: mime formats and versions in format specifications

I may not have made my proposal clear: take whatever it is
that the current proposal labels "GIF" and just label it "GIF;89a" or,
if you must, "GIF89a". Certainly, if you introduce a new,
not-backward-compatible format, say "GIF92b", that cannot be read by
GIF89a readers, you should give it a different name.

My only objection is to releasing a protocol that contains the name "GIF"
when what you clearly intend it to denote is "GIF as commonly used
as of 1992".

Mark Crispin writes:

The problem I have with version numbering is that it creates state that every
implementation must know about in order to support.  Fundamentally, versions
are used to represent two different entities:
 . mostly-compatible differences (= a clue that some minor extension may be
   present)
 . incompatible differences (= don't even try feeding it to an older reader)

Suppose I have a "GIF" reader that knows about version 1.  I get something
that is called "GIF version 2".  I do not know if it is reasonable to feed it
to my version 1 reader and hope for the best.


I am in favor of outlawing the first kind (don't send 'minor variations'
around if they aren't to spec) and clearly marking the second (if it is
incompatible, give it a different name).

What is worse, the presence of a version numbering mechanism virtually begs
for a proliferation of versions for extremely minor changes/extensions.  This
makes things almost totally open-ended!


Oh, no! Versions have to be registered with the IANA too! Certainly we
can't have proliferation of versions any more than we can have
proliferation of formats!

I would support a versioning system -- essentially a ;VERSION=n attribute --
if and only if it was limited to the first type of versioning.  That is, to
differences that an unaware reader can reasonably ignore and hope for the best
in displaying it.


I think that's pretty useless -- why allow it at all?

Anything that is essentially incompatible -- a "don't even try if you don't
know what this version is all about" sort of thing -- should be done by means
of a new subtype name.  If "GIF version 2" can not be processed by a GIF
reader then it is no longer GIF, but rather is GIF2.
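
A minimal sketch of the rule described in the quoted text above, in Python:
the subtype name alone decides whether a reader should try at all, and a
";version=n" parameter is only a hint the reader is free to ignore. The
function name `triage` and the example labels are hypothetical, chosen only
for illustration.

```python
from email.message import Message

def triage(content_type, known_subtypes):
    """Return (attempt, version_hint) for a Content-Type value.

    Assumption for illustration: an unknown subtype means "don't even
    try"; the version parameter never blocks an attempt.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    subtype = msg.get_content_subtype()      # e.g. "gif" or "gif2"
    version_hint = msg.get_param("version")  # None when absent
    if subtype not in known_subtypes:
        return (False, None)
    return (True, version_hint)
```

Under this sketch a reader that knows only "gif" would attempt
"image/gif; version=2" but refuse "image/gif2".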


I agree, except for the choice of the initial name!  By calling it
"GIF" instead of its rightful name "GIF89a", we're adding some kind of
presumption that GIF won't change. As we've seen, however, it has
already changed once within the last few years and will likely change
again.

This has as a side effect the discouragement of a proliferation of
incompatible versions of a type, since it requires the creation of a new type
in each case.  I consider that a feature.


I agree that it is a feature, and that the constraints you believe are
applied to formats should be applied to versions as well.

Keith Moore writes:

Your point is well taken, and we would do well to put version numbers in
content-type headers.  However I don't want to delay MIME going to proposed
standard for this reason, especially if we have to define a mechanism for
defining content-type versions before we can proceed.  When MIME goes up for
draft standard status we can fix this and other problems that will
undoubtedly crop up.  For instance, I think we could safely add a "version="
parameter to all body part types at a later date without breaking existing
MIME readers.


You already have a mechanism: if you think that after "ps" there will
be "ps2" and after "gif" there might be "gif92", why not just call them
"ps1" and "gif89a" to start with?

I also think you are correct that the PostScript spec as currently in MIME
is not sufficient to ensure interoperability.  (Other concerns have to
do with how ends of lines are represented (since we haven't defined an
end-of-line convention for application bodyparts) and line lengths (since
some systems can't store very long lines).)  However, it is probably good
enough for now.  Some of these things can't be determined without some
operational experience.  That's what proposed standard status is for.


My concern is that the problems that versionless formats entail are
not immediately evident with "operational experience": the problem is
a lack of compatibility OVER TIME. Certainly the initial experience
with MIME is that the writers and readers will be more or less in
sync, since they all started at the same time. The problem -- that
formats tend to get out of sync over time -- will only show itself
years later. This kind of 'time-sensitive compatibility' issue is not
uncovered by testing; you actually have to plan for it.

Ned Freed, Postmaster, writes:

Let's talk about PostScript first. PostScript has documented, standardized,
and commonly used facilities for identifying the version of PostScript in
use directly in the PostScript data itself. Moreover, this information does
not map cleanly to a single number or group of parameters. It is complex
information that cannot be distilled in the manner you specify. Moreover, if you
attempt to do this the result will not be useful -- it only introduces a
number of silly states where the parameter claims one version but the
document claims another.


I don't believe this is true, at least not the "commonly used" part,
if the PostScript files widely available by anonymous FTP are to be
believed. Even the RFCs don't follow the convention. For example,
RFC1125.PS doesn't include a version reference (it starts with %! only);
RFC1119.PS says it is "%!PS-Adobe-1.0", but RFC1144.PS starts with
"%!PS-Adobe-2.0", although it doesn't seem to require any PostScript
level 2 operations.
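
To make the three cases above concrete, here is a small Python sketch that
pulls the version string out of a leading "%!PS-Adobe-N.N" comment. The
function name `ps_header_version` is hypothetical; the sketch only reads the
header comment and makes no attempt to check which operators the file
actually uses.

```python
import re

def ps_header_version(data):
    """Return the N.N from a leading '%!PS-Adobe-N.N' comment, else None."""
    match = re.match(rb"%!PS-Adobe-(\d+\.\d+)", data)
    return match.group(1).decode("ascii") if match else None
```

A file that starts with a bare "%!" yields None, so a reader relying on
this comment learns nothing about such a file.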

Thus, this proposal does not make sense for PostScript. It has in fact been
proposed before and rejected for the reasons given.


I did read the archives for the ietf-822 mailing list looking for a
discussion of the issue, but must have missed it. If the only reasons
given at the time were that "PostScript has documented, standardized
and commonly used facilities for identifying the version of PostScript
in use", I don't think that's adequate.

Now, on to GIF. GIF also has an internal identifier. It is even stronger than
PostScript in one sense -- the identifier is a REQUIRED part of the data.
There are presently only two values it can take; I suppose new ones might
appear in the future.
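
As an illustration of the internal identifier mentioned above: a GIF data
stream begins with the three bytes "GIF" followed by a three-byte version,
"87a" or "89a". A minimal Python sketch that reads it follows; the function
name `gif_version` is hypothetical.

```python
def gif_version(data):
    """Return the 3-character version from a GIF signature, else None."""
    if len(data) >= 6 and data[:3] == b"GIF":
        return data[3:6].decode("ascii", errors="replace")
    return None
```

Note that this requires access to the decoded body itself, not just the
message headers.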


One of the things the content-type field is going to be used for in
gateway products is deciding whether there exists a converter from the
specified format to another format in use on the other side of the
gateway. I hadn't expected gateways to also have to unpack the body in
order to read the header information in order to decide whether the
installed conversion routine is adequate, or whether some other kind
of encapsulation in the gateway is necessary. This reason alone would
dictate not relying on the content of the GIF to tell you that you had
a GIF89a and not a GIF93b, for which you had no converter.
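
A rough Python sketch of the gateway decision described above, assuming the
version travels in the label in the "gif;89a" style. The converter table,
the converter name, and the "encapsulate" fallback are all hypothetical;
the point is only that the decision can be made from the label without
unpacking the body.

```python
# Hypothetical table of installed converters, keyed by (format, version).
CONVERTERS = {("gif", "89a"): "gif89a_to_local"}

def route(label):
    """Pick a converter from a 'format;version' label, else encapsulate."""
    fmt, _, version = label.lower().partition(";")
    converter = CONVERTERS.get((fmt.strip(), version.strip()))
    return converter if converter else "encapsulate"
```

With a bare "gif" label the lookup has no version to match on, so this
sketch falls back to encapsulation rather than guessing.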

Once again, reproducing information, especially information that is a
mandatory part of the format, in the header is not clearly desirable. It
would be one thing if the information was separable and representable like
it is with some formats. But these things don't apply to these data types.


I agree that unnecessary duplication of version information should be
avoided; I'm claiming that in many cases it isn't duplication, and
that in several cases it is necessary. It also isn't very expensive. I
believe the versioning information I'm asking for is quite minimal
(don't create a strawman and then knock it down). All I'd really like
is to say "ps-1" or "ps;1" or whatever you want the syntax to be
instead of just "ps", and "gif89a" or "gif;89a" instead of just "gif".
That seems to apply to the formats given.

I won't repeat your argument about TIFF since it is moot, except to
add that while NETFAX specifies a profile of TIFF, it isn't all of
TIFF in all of its possible glory; even so, I'd want "netfax" to have
a version number associated with it, too, if it is used as a MIME
content-type.

Marshall Rose writes:

I think at this point, the best thing to do is to field implementations of the
existing MIME specification.  This will give us the clearest indication
as to how well MIME provides for the interoperable exchange of
multi-media messages.


See above: interoperability of field implementations doesn't prevent
long-term version compatibility problems, since they all start at
roughly the same point in time.


Ned Freed writes (of my argument that the postscript specification
doesn't ensure interoperability):

Of course it is not enough to insure interoperability. This is NOT a
requirement for a type to be useful.


You'll have to forgive me for boggling at this. I thought the whole
point of having registered content-types was to insure
interoperability.

Ned Freed writes (of my complaint that the [POSTSCRIPT] reference book
does not constrain fonts):

Again, this is axiomatic. It also applies to many useful content types besides
PostScript. Practically any format complex enough to be useful is going to
have interoperability problems. This applies to every markup language I've
ever heard of, every revisable document format I know of, and certainly to
every final form language (like PostScript) in existence.


I don't believe this is true of the other content types, such as rich
text, or of netfax, for that matter. I do think it is a problem for
PostScript; I believe it is quite possible to constrain PostScript to
give far more assurance of interoperability than has been done in the
current MIME draft.

However, this does not mean you cannot engage in useful interchange of
PostScript documents. A vast majority of PostScript applications do work over
a wide variety of environments and situations. This is because they are designed
to work in this way. There are also good and valid uses for PostScript that
does not conform to any widely accepted standards for interchange.

Moreover, PostScript contains facilities for documenting, precisely and
exactly, what facilities and resources a given document uses. This information
can be used to determine whether or not a document has a chance of working in
a given environment. More to the point, it can be used to determine _why_ a
document fails to work. This is something that few, if any, current PostScript
viewer implementations take advantage of. (In my environment at least, most
PostScript I get, and I get a hell of a lot of it, works fine with my viewers,
so I don't have to address this problem very often.)


It was my understanding that the purpose of the IETF-822 extensions
working group was to raise the "least common denominator" for exchange
of electronic mail. From this principle, it should follow that any
definition of "postscript" should be a "least common denominator"
definition. That you have a fine PostScript viewer at your disposal
and are able to deal with a wide variety of different
PostScript forms might prejudice you to believe that your environment
represents the least common denominator. That you are able to debug
PostScript files that contain operators your interpreter can't deal
with doesn't mean that we should require every recipient of mail
messages marked as "ps" to do so.

I think this is the crux of the problem/disagreement.

As for your assertion that PostScript contains facilities for
documenting the resources a given document uses, I take the current
RFCs available in PostScript form as a good example -- NONE of them
seem to contain any font resource definitions.

My personal opinion is that an "interoperable" definition of PostScript is
probably impossible to attain and most likely will be useless once you attain
it. Adobe might be able to do this, but they would be about the only ones
who could. And I don't think they'd be interested...


My understanding is that this is exactly what they're attempting with
the "Carousel" format definition, but I think that is another
discussion. I don't think I should respond to any of the 'vendor'
issues here, but just stick with the interoperability ones...

Re the proposal to require senders to avoid FILE operations:

My implementation of PostScript supports these operators but does them
safely. Why should I have to take them out of my implementation just because
your implementation elected to solve the problem by removing them completely?
Why should I deny users access to these facilities, which they do in fact
use and depend on currently?


I'm not proposing that you take those operators out of your PostScript
previewer, only that you not send me or anyone else MIME mail with a
content-type of postscript-whatever that contains them. If you want to
call it "postscript-2.0-private", that's fine with me. Again, this is a
'least common denominator' vs 'anything that smells like PostScript'
kind of argument. I wouldn't push this point except that the MIME spec
for other content types has clearly come down on the side of
unambiguous, well formulated and interoperable formats.

The removal of PostScript is a show stopper for me. I absolutely reject
any such proposal. I also fail to see any constructive points to your
arguments that would make me want to change any part of the PostScript
description as it presently is formulated.


The phrase "show stopper" seems to be a red flag in this forum, and I'd
recommend you not use it so casually. It tells me that this is not an
open forum, that your mind is made up and that no arguments will sway
you. I suppose, not being a party to the earlier discussions, that I
might misunderstand the ground rules for IETF working groups. Does
this mean that, because it is a "show stopper" for you, the
proposal for removing "postscript" as a part of the MIME draft
initially and leaving it as a possible IANA-registered content-type is
unacceptable to the group?