Re: comments on latest MIME drafts

Ned Freed <NED(_at_)SIGURD(_dot_)INNOSOFT(_dot_)COM> writes:

Then suggest alternative prose. We're years past the point where such
generalities are acceptable input.


We appear to have different paradigms, so we have to discuss
generalities in order to discover prose that is commonly acceptable.

I sort of agree with you about non-content headers actually being allowed in
body parts. They are allowed syntactically but they have no semantics
associated with them.


There exists at least one MIME UA which puts "X-" headers in the
body-parts of multiparts in order to communicate with other instances
of itself.

You tend to see things almost exclusively in terms of syntax. I do not. I see
the various aspects of MIME in terms of the abstractions they are intended to
represent and how their semantics tie in with those abstractions. Syntax then
follows as a secondary (or tertiary) concern.


It was clear that we had different paradigms and thus had problems
communicating.  Until I read this, I did not know what your paradigm
was.

Let me clarify my paradigm and suggest why it might be better than
yours.

In order for a data format to exist in the real world, it has to have
a syntax.  Data format specifications, such as MIME, usually specify
abstractions and give their semantics.  The semantics are then tied to
identifiable objects in the syntax.

Having the semantics be associated with identifiable syntactic objects
simplifies the task of generating and reading the data format.
Composers generate the syntactic constructs corresponding to the
semantics they want to convey.  Readers discover semantics by first
doing a parse to discover the syntax, then applying the association of
semantics to particular syntactic objects.

When semantics are not associated with syntactic objects, or when the
syntactic objects associated with given semantics are not clearly
identifiable, then having a reader discover those semantics is
problematic.
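
As a rough illustration of the two-step reading process described
above (parse first, then apply the association of semantics to
syntactic objects), here is a minimal Python sketch.  The SEMANTICS
table, its descriptions, and the function names are hypothetical and
not taken from any MIME draft; folded header lines are not handled.

```python
# Hypothetical table associating a syntactic object (a header field name)
# with its semantics. The descriptions are illustrative, not normative.
SEMANTICS = {
    "content-type": "media type of the body",
    "content-transfer-encoding": "encoding applied to the body",
    "mime-version": "message conforms to MIME",
}


def parse_fields(header_text):
    """Step 1: parse. Split a header block into (name, value) pairs."""
    fields = []
    for line in header_text.split("\r\n"):
        if not line:
            continue
        name, _, value = line.partition(":")
        fields.append((name.strip().lower(), value.strip()))
    return fields


def discover_semantics(header_text):
    """Step 2: look up the semantics tied to each parsed field."""
    found = []
    for name, value in parse_fields(header_text):
        meaning = SEMANTICS.get(name)
        if meaning is not None:
            found.append((name, value, meaning))
    return found
```

A field with no entry in the table, such as a private "X-" header, is
still parsed; the reader simply has no semantics to attach to it.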

And there is quite clearly a HUGE
difference in the abstract between a body part and a message.


In the abstract, they each have some features in common and some
features which differ.

The semantics they have in common (header/body syntax, content-
headers) I am trying to associate with the syntactic object known as
an "entity".

The semantics specific to a message (MIME-Version, destination and
other 822-defined fields) I am trying to associate with the syntactic
object known as a "message", which is a subset of an "entity".

The semantics specific to a body part (contained in a multipart, does
not require MIME-Version, may not contain enclosing multipart's
delimiter) I am trying to associate with the syntactic object known as
a "body-part", which is a disjoint subset of an "entity".
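
The three-way split above can be sketched as predicates over a parsed
entity.  This is only a sketch under stated assumptions: the Entity
type and both function names are hypothetical, the message test is a
rough approximation of the 822-required fields (Date, From, and a
destination), and the body-part test checks only the delimiter
constraint.  Whether an entity really is a message or a body-part
also depends on where it appears.

```python
from collections import namedtuple

# Hypothetical representation: fields is a list of (name, value) pairs,
# body is a str using CRLF line endings.
Entity = namedtuple("Entity", ["fields", "body"])


def field_names(entity):
    """Lower-cased names of all header fields present in the entity."""
    return {name.lower() for name, _ in entity.fields}


def looks_like_message(entity):
    """Approximation: has Date, From, and at least one destination field."""
    names = field_names(entity)
    has_destination = bool(names & {"to", "cc", "bcc"})
    return "date" in names and "from" in names and has_destination


def looks_like_body_part(entity, boundary):
    """No line may begin with the enclosing multipart's delimiter."""
    delimiter = "--" + boundary
    header_lines = [f"{name}: {value}" for name, value in entity.fields]
    body_lines = entity.body.split("\r\n")
    return not any(
        line.startswith(delimiter) for line in header_lines + body_lines
    )
```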

The fact that they have nearly identical syntax does not mean that
they are in fact the same -- they aren't. As such, trying to tie
these things down using syntax as a distinguishing factor clouds the
issues rather than clarifying anything.


They have some syntax (and associated semantics) in common, and they
have some syntax (and associated semantics) by themselves.

This is similar to comparing the value of a Sender: field with the
value of a Message-ID: field.  They both have some common semantics
associated with their included common syntactic object of an
addr-spec.  They are not, however, the same, but tying down their
common semantics using syntax does clarify things.

Part of the problem surrounding the term "body part" is that this
term has a well understood meaning semantically. It always has.


Actually, I don't think the meaning of the term is well understood.
It appears to be used for at least two different concepts.

The term "body part" refers to headers and contents of either a
message or one of the parts in the body of a multipart entity. Any
sort of header may be present but only the content headers actually
have any meaning in the context of a body part. A body part has a
header and a body, so it makes sense to speak about the body of a
body part.

Is this acceptable?


I think it's a bit confusing.  It also defines a term for a concept
different from the body-part syntactic object, which has semantics
that do not apply to messages.

Again, syntax and semantics are getting confused. Body parts can have
non-content headers, but such headers do not tie into the body in any way.


"X-" headers in a body-part nonterminal are specifically allowed to
have private meanings.

Message bodies, Section 8.4, paragraph 1: this use of "body part" is
specific to multiparts.


Incorrect. CTE headers appear in contexts other than multiparts.


The previous sentence covers CTE headers appearing in messages.  It is
probably better to have a single sentence "If a
Content-Transfer-Encoding header field appears as part of an entity's
header, it applies to the entire body of that entity."
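
A minimal Python sketch of that single rule, assuming the entity has
already been split into a list of (name, value) header pairs and a
body held as bytes.  The function name decode_body is hypothetical;
the 7bit default for an absent header is the MIME default.

```python
import base64
import quopri


def decode_body(fields, body):
    """Undo the transfer encoding named in the entity's own header.

    fields: list of (name, value) pairs; body: bytes.
    The encoding applies to the entire body of this entity.
    """
    encoding = "7bit"  # default when no Content-Transfer-Encoding appears
    for name, value in fields:
        if name.lower() == "content-transfer-encoding":
            encoding = value.strip().lower()
    if encoding == "base64":
        return base64.b64decode(body)
    if encoding == "quoted-printable":
        return quopri.decodestring(body)
    if encoding in ("7bit", "8bit", "binary"):
        return body  # identity encodings: nothing to undo
    raise ValueError("unknown transfer encoding: " + encoding)
```

The same function serves a message and a part of a multipart alike,
since it looks only at the header of the entity it is handed.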

The problem is that we define rules for body parts that have to
apply to both the multipart and single part cases but don't apply to
all entities.

Could you give me an example of such a rule?


Any rule that defines something specific to MIME represents such an example.
Entities can be messages and messages don't have to be MIME messages, hence
MIME rules don't necessarily apply to all entities.


Here's another difference in paradigms.  In my paradigm, MIME rules
apply to all messages.  None of my MIME parsers ever look for
MIME-Version.

The MIME standard certainly permits this.  The MIME rules are
constructed such that if there are no Content-* headers, the MIME
rules are identical to the RFC 822 rules.  RFC 1049 Content-Type:
headers are not syntactically legal MIME Content-Type: headers, so a
MIME reader has the freedom to treat RFC 1049 Content-Type: headers as
it likes.

Even with the definition of "body part" in
draft-ietf-822ext-mime-imb-03.txt, messages which "aren't MIME
messages" have associated body parts.  Take the (presumably zero)
content headers of the message, along with the body and there's your
"body part".

If a message doesn't have a MIME-Version, then a receiving UA has the
option, given in section 6 of the message bodies document, of ignoring
all rules in MIME applying to the message body, including any rules
imposed by the fact that the message is an entity.

      entity = *field [ CRLF *OCTET ]
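
A reader for this production can be sketched in Python as follows.
This is an illustration only: the name split_entity is hypothetical,
and the sketch assumes every field ends in CRLF, so that the optional
body starts after the first empty line.

```python
def split_entity(data: bytes):
    """Split an entity into (header_block, body) per the grammar above.

    header_block holds zero or more CRLF-terminated fields.
    body is None when the optional [ CRLF *OCTET ] part is absent.
    """
    if data.startswith(b"\r\n"):
        # Zero fields: the entity begins with the separating CRLF.
        return b"", data[2:]
    head, separator, body = data.partition(b"\r\n\r\n")
    if not separator:
        # No empty line found: fields only, no body.
        return data, None
    # Restore the CRLF that terminates the last field.
    return head + b"\r\n", body
```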

It then follows that "message" and "body-part" are subsets of
"entity".


This is OK syntactically, but loses seriously on the semantics front. I
don't see any way of making this change without making things even more
confusing.


How does it lose?  You apply all the rules that you want for both
messages and body-parts, including the various Content-* headers, to
entities.  It appears to me to be a semantic win.

bodies are no longer restricted to 7bit data, so "*text" isn't
appropriate.


Neither is the original definition then. I'll change it to read:


The original definition of message from 822 was specific to 7bit data.
MIME expanded this in prose, but not in BNF, to be *OCTET.

By the way, I neglected to formally define OCTET.

I actually like your original suggestion that we refer to the
encoding of the body rather than that of the body part or entity
better, so that's what I'll change it to.


Excellent.  This fits much better with the Canonical Encoding Model.
It is the body that the transfer-encoding is applied to; this
application is simply labeled in the headers of the body's enclosing entity.

-- 
_.John G. Myers         Internet: jgm+(_at_)CMU(_dot_)EDU
                        LoseNet:  ...!seismo!ihnp4!wiscvm.wisc.edu!give!up