Re: comments on latest MIME drafts

Ned Freed <NED(_at_)SIGURD(_dot_)INNOSOFT(_dot_)COM> writes:

Then suggest alternative prose. We're years past the point where such
generalities are acceptable input.

We appear to have different paradigms, so we have to discuss
generalities in order to discover prose that is commonly acceptable.

I sort of agree with you about non-content headers actually being allowed in
body parts. They are allowed syntactically but they have no semantics
associated with them.

There exists at least one MIME UA which puts "X-" headers in the
body-parts of multiparts in order to communicate with other instances
of itself.
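
For illustration only, here is a minimal sketch of such a private header
inside a body part, read with Python's standard email package (a modern
library, not anything either party in this thread used). The boundary
"b1" and the field name X-Private-Note are made up for the example.

```python
from email import message_from_string

# Hypothetical multipart message; boundary and X- field name are invented.
RAW = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="b1"\n'
    "\n"
    "--b1\n"
    "Content-Type: text/plain\n"
    "X-Private-Note: instance-42\n"
    "\n"
    "hello\n"
    "--b1--\n"
)

msg = message_from_string(RAW)
part = msg.get_payload()[0]            # first body part of the multipart

# The content header carries MIME-defined meaning.
content_type = part.get_content_type()

# The X- header is syntactically present and readable, but any meaning
# it has is private to the agents that agree on it.
private_note = part["X-Private-Note"]
```

The parser hands back both fields the same way; nothing in the syntax
marks one as meaningful to MIME and the other as private.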


So what? The only semantics that matter here are those of MIME. It is of course 
permissible for people to add their own headers with private semantics.

Let me clarify my paradigm and suggest why it might be better than
yours.

In order for a data format to exist in the real world, it has to have
a syntax.  Data format specifications, such as MIME, usually specify
abstractions and give their semantics.  The semantics are then tied to
identifiable objects in the syntax.

Having the semantics be associated with identifiable syntactic objects
simplifies the task of generating and reading the data format.
Composers generate the syntactic constructs corresponding to the
semantics they want to convey.  Readers discover semantics by first
doing a parse to discover the syntax, then applying the association of
semantics to particular syntactic objects.


I disagree 100% with all of this. People do not discover semantics by
implementing parsers. They discover them by reading specifications. We have a
serious problem if people have to implement parsers before they can understand
MIME. The set of people who need to understand MIME semantics is far larger
than the set who worry about specifics of syntax, let alone go so far as to
implement parsers.

When semantics are not associated with syntactic objects, or when the
syntactic objects associated with given semantics are not clearly
identifiable, then having a reader discover those semantics is
problematic.


Well, sort of. It certainly helps for semantics to be bound to various syntax
elements, as they are in MIME. But it certainly isn't necessary for semantics
to exist, nor is it necessary for different semantic constructs to bind unique
syntactic elements. In fact it can be quite the opposite -- dates appear in all
sorts of places in header fields, but I don't hear anyone suggesting that the
semantics of date need to be represented differently in all of these fields or
that dates are not important entities semantically.

And there is quite clearly a HUGE
difference in the abstract between a body part and a message.

In the abstract, they each have some features in common and some
features which differ.


Right. They are different.

The semantics they have in common (header/body syntax, content-
headers) I am trying to associate with the syntactic object known as
an "entity".


And I think this is a very bad idea. Entities are more general than this.

The semantics specific to a message (MIME-Version, destination and
other 822-defined fields) I am trying to associate with the syntactic
object known as a "message", which is a subset of an "entity".


Fine with me.

The semantics specific to a body part (contained in a multipart, does
not require MIME-Version, may not contain enclosing multipart's
delimiter) I am trying to associate with the syntactic object known as
a "body-part", which is a disjoint subset of an "entity".


And this flies in the face of common usage, common understanding, and common
sense. It makes MIME much harder to understand, and I am not willing to do it.
This is an absolute showstopper for me.

The fact that they have nearly identical syntax does not mean that
they are in fact the same -- they aren't. As such, trying to tie
these things down using syntax as a distinguishing factor clouds the
issues rather than clarifying anything.

They have some syntax (and associated semantics) in common, and they
have some syntax (and associated semantics) by themselves.


But they also each have their own semantics as well as their own syntax.

This is similar to comparing the value of a Sender: field with the
value of a Message-ID: field.  They both have some common semantics
associated with their included common syntactic object of an
addr-spec.  They are not, however, the same, but tying down their
common semantics using syntax does clarify things.


It's all a question of where you draw the lines.

Part of the problem surrounding the term "body part" is that this
term has a well understood meaning semantically. It always has.

Actually, I don't think the meaning of the term is well understood.
It appears to be used for at least two different concepts.


Well, if you mean that there's a well understood common sense meaning that
is what most people mean when they say "body part", versus the old,
nonsensical definition that managed to slip into MIME, then I certainly
agree.

The term "body part" refers to headers and contents of either a
message or one of the parts in the body of a multipart entity. Any
sort of header may be present but only the content headers actually
have any meaning in the context of a body part. A body part has a
header and a body, so it makes sense to speak about the body of a
body part.

Is this acceptable?

I think it's a bit confusing.  It also defines a term that is
a different concept than the body-part syntactic object, which has
semantics which do not apply to messages.


What semantics does it have that don't apply to MIME messages?

Again, syntax and semantics are getting confused. Body parts can have
non-content headers, but such headers do not tie into the body in any way.

"X-" headers in a body-part nonterminal are specifically allowed to
have private meanings.


Of course. But they don't have any meaning that's defined in MIME. And that's
all that matters here.

The problem is that we define rules for body parts that have to
apply to both the multipart and single part cases but don't apply to
all entities.

Could you give me an example of such a rule?


Any rule that defines something specific to MIME represents such an example.
Entities can be messages and messages don't have to be MIME messages, hence
MIME rules don't necessarily apply to all entities.

Here's another difference in paradigms.  In my paradigm, MIME rules
apply to all messages.  None of my MIME parsers ever look for
MIME-Version.


The Working Group rejected exactly this paradigm some time ago, preferring
instead to go with the approach of MIME messages being a proper subset of
RFC822 messages.

The MIME standard certainly permits this.  The MIME rules are
constructed such that if there are no Content-* headers, the MIME
rules are identical to the RFC 822 rules.
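
As a rough illustration of that claim, the sketch below feeds a message
with no MIME-Version and no Content-* headers to Python's standard email
package (a modern library that postdates this discussion). The addresses
and text are invented for the example.

```python
from email import message_from_string

# Hypothetical plain RFC 822 message: no MIME-Version, no Content-* fields.
RAW = (
    "From: a@example.com\n"
    "Subject: hi\n"
    "\n"
    "plain body\n"
)

msg = message_from_string(RAW)

has_mime_version = msg["MIME-Version"] is not None
content_type = msg.get_content_type()   # falls back to the default type
is_multipart = msg.is_multipart()
body = msg.get_payload()
```

With no Content-* headers present, this parser applies the default
content type and hands back the body unchanged, which is the plain
RFC 822 reading of the same message.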


It does indeed permit this, because this is the way we originally planned
to do it.

RFC 1049 Content-Type:
headers are not syntactically legal MIME Content-Type: headers, so a
MIME reader has the freedom to treat RFC 1049 Content-Type: headers as
it likes.


Not if it treats all messages as MIME messages.

Even with the definition of "body part" in
draft-ietf-822ext-mime-imb-03.txt, messages which "aren't MIME
messages" have associated body parts.  Take the (presumably zero)
content headers of the message, along with the body and there's your
"body part".


Sure. This can happen in MIME messages as well.

If a message doesn't have a MIME-Version, then a receiving UA has the
option, given in section 6 of the message bodies document, of ignoring
all rules in MIME applying to the message body, including any rules
imposed by the fact that the message is an entity.


Sure, but so what?

      entity = *field [ CRLF *OCTET ]

It then follows that "message" and "body-part" are subsets of
"entity".
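
A minimal sketch of a reader for that production, assuming CRLF line
endings and a simplified treatment of folded header lines. The function
name split_entity is hypothetical, and the body is kept as raw bytes to
match *OCTET.

```python
from typing import List, Optional, Tuple


def split_entity(data: bytes) -> Tuple[List[Tuple[str, str]], Optional[bytes]]:
    """Split data per: entity = *field [ CRLF *OCTET ].

    Returns (fields, body). body is None when the optional
    [ CRLF *OCTET ] part is absent.
    """
    if data.startswith(b"\r\n"):
        # Zero fields, then the blank line, then the body octets.
        return [], data[2:]

    head, sep, rest = data.partition(b"\r\n\r\n")
    body: Optional[bytes] = rest if sep else None
    if not sep and head.endswith(b"\r\n"):
        head = head[:-2]  # drop the CRLF that ends the last field

    fields: List[Tuple[str, str]] = []
    for line in head.split(b"\r\n"):
        if not line:
            continue
        text = line.decode("latin-1")
        if text[0] in " \t" and fields:
            # Folded continuation line: join it to the previous field.
            name, value = fields[-1]
            fields[-1] = (name, value + " " + text.strip())
        else:
            name, _, value = text.partition(":")
            fields.append((name.strip(), value.strip()))
    return fields, body
```

Note that the function treats a message and a body part identically;
any distinction between the two has to come from context outside this
production.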


This is OK syntactically, but loses seriously on the semantics front. I
don't see any way of making this change without making things even more
confusing.

How does it lose?  You apply all the rules that you want for both
messages and body-parts, including the various Content-* headers, to
entities.  It appears to me to be a semantic win.


Because it blurs the distinction between body parts and messages.
The early MIME work presented us with substantial evidence that losing
this distinction is very bad.

bodies are no longer restricted to 7bit data, so "*text" isn't
appropriate.


Neither is the original definition then. I'll change it to read:

The original definition of message from 822 was specific to 7bit data.
MIME expanded this in prose, but not in BNF, to be *OCTET.

By the way, I neglected to formally define OCTET.


I already corrected this omission.

                                Ned