Re: Restrictions on the Content-Type "message"

1999-01-04 11:20:44
I don't think it affects e-mail standards at all, since message/rfc822
does not have this problem (at least, not with e-mail as currently
specified by DRUMS - they are going to run into this problem if/when
they decide to allow UTF-8 in headers at some future date).

IMHO RFC-2045 is too restrictive. It should have allowed each
message/type to decide this issue for itself. In fact, RFC-2045
contradicts itself for it says, in the first paragraph of 6.4:

This is false. See below.

"... If an entity is of type "multipart" the Content-Transfer-Encoding
is not permitted to have any value other than "7bit", "8bit" or
"binary". Even more severe restrictions apply to some subtypes of the
"message" type."

That seems to suggest that the restriction applies to all multipart
types (which is fine by me) but not to message types unless some
particular message-type so specifies (as message/rfc822 indeed does).
This view is confirmed by the 7th paragraph of 5.1 of RFC-2046 (as
regards multipart-types) and the 3rd paragraph of section 5.2 of
RFC-2046 which states:

"Subtypes of "message" often impose restrictions on what encodings are
allowed. These restrictions are described in conjunction with each
specific subtype."

That's neither what it suggests nor what it says. What it says is that even
more restrictive rules apply to some subtypes of message, which is a
simple fact: the partial subtype of message is restricted to 7bit and
doesn't allow 8bit or binary.

The mistake in your claim that this is contradictory is the assumption that the
sentence refers to message/rfc822. It doesn't.

and the following section of RFC-2046 then goes on to specify exactly
such a restriction in the case of message/rfc822. That interpretation
would suit me just fine.

Only as long as message/rfc822 doesn't carry news messages. However, it
currently does.

HOWEVER, the 4th paragraph of section 6.4 of RFC-2045 then goes and
undoes the good work by forbidding other than "7bit", "8bit" or "binary"
with any composite media type.

Correct. And as I said before, this was a hotly contested point for which
there was strong consensus in the MIME WG. It is going to be nigh on to
impossible to undo it now.

This is most unfortunate, since it forbids for all time any message-type
for any kind of document which allows 8bit characters in its headers.
This is not a problem (yet) for rfc822-style email, but it will be a
problem for news articles constructed according to the new draft now in
preparation, and it will be a problem for any not-yet-thought-of kind of
internet document that might be proposed in the future.

It imposes no such restriction. The restriction is simply that CTEs can only
be applied to leaves in the MIME structure. There is nothing that says we
cannot define a composite type with 8bit or even binary headers in the future.
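
As a rough illustration of the rule as stated here (encodings on leaves only),
the following Python sketch walks a parsed message with the standard `email`
package and reports any multipart or message part labelled with something other
than 7bit, 8bit or binary. The function name and the sample message are made up
for the example; nothing here is taken from the RFCs or from an existing
implementation.

```python
from email import message_from_string

# Values RFC 2045 section 6.4 permits on composite (multipart/message) parts.
ALLOWED_ON_COMPOSITE = {"7bit", "8bit", "binary"}


def composite_cte_violations(msg):
    """Return (content_type, cte) pairs for composite parts that carry a
    real encoding such as base64 or quoted-printable."""
    violations = []
    for part in msg.walk():
        if part.get_content_maintype() not in ("multipart", "message"):
            continue  # leaf parts may use any registered encoding
        # An absent header means the default, 7bit.
        cte = str(part.get("Content-Transfer-Encoding", "7bit")).strip().lower()
        if cte not in ALLOWED_ON_COMPOSITE:
            violations.append((part.get_content_type(), cte))
    return violations


if __name__ == "__main__":
    # Hypothetical sample: an encoded message/rfc822, which the rule forbids.
    sample = (
        "Content-Type: message/rfc822\n"
        "Content-Transfer-Encoding: base64\n"
        "\n"
        "U3ViamVjdDogaGkKCmJvZHkK\n"
    )
    print(composite_cte_violations(message_from_string(sample)))
```

Note that the check looks only at the label on each container. It says nothing
about whether the headers inside a nested message are themselves 7bit or 8bit,
which is the separate question raised above.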

So that is the reason why we have proposed to obsolete message/news.

Again, I have no problem with obsoleting message/news, and to be honest
I don't really care that the grounds you claim for doing so aren't correct.

Well, I certainly agree with the outcome -- message/news should never
have been registered. The intent was always for message/rfc822 to
cover News, and it does so just fine. Creating additional composite
types is a serious business owing to the need to add support in a
large number of contexts. It should never have been allowed, and would
not be today, but back in those days we didn't have an appropriately
tuned type registration mechanism in place.

No, it is too much to expect message/rfc822 to apply to news, for that
will constrain the news standards and the mail standards and the mime
standards always to keep in exact step, and the whole IETF mechanisms
are too complex to ever expect to achieve consensus across such a
wide front at any one time. Currently, the Usenet-format group, which
started its work a couple of years later than DRUMS, and which is in the
fortunate position of being able to take advantage of a software base
for news transport which is already 8bit clean, has been able to take
the plunge and build 8bit in as a mandatory requirement for transports.
Specifically, it is proposing to allow UTF-8 in headers.

This would be a valid point if message/rfc822 were strictly aligned with
RFC822. But it isn't -- the name, which was chosen and implemented long
before the exact rules for the type were put in place, is misleading.

Message/rfc822 was specifically designed to allow for material that isn't
legal according to RFC822 (and, I suspect, DRUMS). Specifically, the
requirements for what headers have to be present are substantially relaxed,
as are the syntax rules.

Now, if news goes off and designs something that has different overall header
syntax that isn't line-oriented or something similar, then I can see a case for
a new type. But until and unless something like that happens I don't think
there's adequate justification for adding a new composite type. 8bit in headers
certainly doesn't come close to making it a requirement.

But the reasons given here are specious. For one thing, the use of UTF-8 in
headers is an orthogonal issue to this one. We've had a strong consensus for
years that this is a place we want to get to eventually, but we have to do
a bunch of other stuff, most notably DRUMS, first. Only then will we have a
basis on which a design team can write a specification for UTF-8 headers that
actually works.

At which point you will undoubtedly run into this exact same problem.

Again, the problem you're seeing just isn't there. This isn't to say there
aren't problems -- major ones -- ahead, but this isn't one of them.

And as for the CTE issue, it was decided long ago that CTEs cannot be
nested. And this is why the restrictions on message exist, and will
apply equally to any attempt to register any other composite type,
regardless of what top-level type you put it under. And this is why
CTEs on inner parts can and will be changed by software as messages,
be they from News or wherever, move from News to mail or vice versa.
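
To make the point about re-encoding concrete, here is a minimal Python sketch
of what a gateway might do when a message has to cross a 7bit transport: each
8bit leaf is re-encoded as quoted-printable, while the containers are only
relabelled and never encoded themselves. It uses the standard `email` package;
the function name is hypothetical, and the sketch assumes a message parsed from
bytes, ignores "binary" parts, and does not touch 8bit material in headers.

```python
from email import encoders, message_from_bytes


def downgrade_to_7bit(msg):
    """Re-encode 8bit leaves as quoted-printable; relabel containers as 7bit."""
    for part in msg.walk():
        cte = str(part.get("Content-Transfer-Encoding", "7bit")).strip().lower()
        if part.is_multipart():
            # Composite part: no encoding is applied, only the label changes.
            if cte == "8bit":
                part.replace_header("Content-Transfer-Encoding", "7bit")
        elif cte == "8bit":
            # Leaf: drop the old label, then let the stdlib encode the body
            # and add the new Content-Transfer-Encoding header.
            del part["Content-Transfer-Encoding"]
            encoders.encode_quopri(part)
    return msg


if __name__ == "__main__":
    # Hypothetical sample: an 8bit multipart with one UTF-8 text leaf.
    raw = (
        b"Content-Type: multipart/mixed; boundary=XX\n"
        b"Content-Transfer-Encoding: 8bit\n"
        b"\n"
        b"--XX\n"
        b"Content-Type: text/plain; charset=utf-8\n"
        b"Content-Transfer-Encoding: 8bit\n"
        b"\n"
        b"caf\xc3\xa9\n"
        b"--XX--\n"
    )
    print(downgrade_to_7bit(message_from_bytes(raw)).as_string())
```

Because only leaves are ever encoded, the gateway never has to decode one
layer to find another, which is the practical payoff of the no-nesting rule.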

Millions of lines of code have been written in good faith based on
this choice. We don't go against choices we made in the past that have
led to widespread good faith implementations. As such, any document
that says I cannot do what I've previously been specifically told I
have to do to messages because they happened to originate in News,
doesn't stand much of a chance of being standardized.

How does this conclusion influence e-mail standards?

It isn't a conclusion, it is simply a paragraph in an Internet Draft. And 
which will receive strong objection if by some chance it makes it to IETF 

I don't see how you can object to the removal of a feature which you
just said should not be there in the first place.

Please reread my response. I said nothing of the sort. My objection was to your
attempts to violate the no nested encodings rule.

What you may not
like is the thing we propose to put in its place, which is two new
application types:


These are quite legal according to RFC-2045. The first one is no problem
(indeed it is already registered with IANA) but the second is an ugly
compromise which is bound to lead to less-than-optimal behaviour on
existing news and mail readers (until they are upgraded to recognise it
as requiring disposition-display). It would have been much easier to
have "message" recategorized as a non-composite type, but that is not
our call :-(.

Well, you're welcome to try and move forward in this way, but I predict major
trouble when you attempt to register a composite type that allows encoding.

One final note. As those of you who were present at the time may recall, I've
always been ambivalent about nested encodings. I never objected to them nor
insisted on them. But at the time I was the lone abstention in a large room
full of strong objections.