Re: The KTH-proposal of a solution to the header-problem

Patrik,
  I found your proposal very useful in describing the issues and 
problems, many of which it appears to solve.  It also eliminates, for 
which I am grateful, my objection to solutions that are proposed for 
one or two headers while leaving the other headers for "later".

A few brief comments on a few aspects of it.

(1) As one or two others have pointed out in the last few days, the one
major issue that I left out of my summary is the case in which, for 
example, someone in Sweden or Germany addresses a letter to someone in 
Russia, and both wish to spell their names in their own languages and 
character sets.  This would result in one character set in the phrase 
of the From field and a different one in the phrase of the To field.
  In a pathological case--possibly involving distribution to people in 
both countries--someone might want to include both the German and a
Russian translation in a Subject field.
  While I am personally less concerned with the multi-character-set
Subject problem than with the personal names, it seems to me that 
neither of these can be dealt with under your proposal without use of 
multioctet "universal" character sets or mnemonic (quoted-readable).  Is 
that your intent?
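
  As an illustration only (a minimal sketch in Python, not anything 
taken from your proposal): the sample names and the candidate list are 
hypothetical, and UTF-8 appears purely as a stand-in for a multioctet 
"universal" set.  The sketch asks which single character sets can carry 
both a German and a Russian personal name.

```python
# Hypothetical sample names; any German/Russian pair with non-ASCII
# letters would serve equally well.
GERMAN_NAME = "Jürgen Müller"
RUSSIAN_NAME = "Юрий Петров"

# Candidate character sets (an assumption for illustration).  UTF-8 is
# only a stand-in for a multioctet "universal" character set.
CANDIDATES = ["us-ascii", "iso-8859-1", "iso-8859-5", "utf-8"]


def can_encode(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def charsets_covering(names, charsets):
    """Return the charsets able to represent all of the given names."""
    return [cs for cs in charsets if all(can_encode(n, cs) for n in names)]


if __name__ == "__main__":
    print(charsets_covering([GERMAN_NAME], CANDIDATES))
    print(charsets_covering([RUSSIAN_NAME], CANDIDATES))
    print(charsets_covering([GERMAN_NAME, RUSSIAN_NAME], CANDIDATES))
```

  With one character set governing all headers, only the universal 
stand-in survives the last call; each 8-bit set covers one name but not 
the other.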

(2) Just to be sure that we do not misunderstand, when you say...

A) Header-Transfer-Encoding

Allowed Header-Transfer-Encoding types are:
....   
  - 8-bit

  This would, of course, require agreement (however that is determined) 
on an 8-bit transport arrangement.

(3) I think we must be very careful about the potential for 2nd DIS 
10646.  We could have, I believe, gotten ourselves into serious trouble 
in the first part of the year had we followed the advice of several 
people that both the general structure of 10646 and the availability of 
compaction method 5 were fixed and would not change.
  The notion of "Internet standard approval pending final approval of
ISO Standard" really does not work, since, if many people implement and 
deploy a solution, and then the ISO Standard changes drastically (as 
will certainly be the case between 1st DIS 10646 and 2nd DIS 10646), we 
are faced with having to choose between:
 - invalidating and changing a number of existing implementations, 
upsetting users and vendors alike, and
 - ending up with an Internet standard that is based on an obsolete 
draft version of an ISO standard and which, as a result, is not 
supported for any purpose other than Internet transport and which must 
be maintained within the Internet concept.
   Neither is attractive.

   As a result, I think we must examine very carefully any solutions 
that remain only "partial" "until 10646 is approved".
   This argument applies with even more force to the AUC proposal.  I 
think we can all be quite confident that, sooner or later, there will be 
an IS 10646, even though I am reluctant to speculate on what might be in 
it.  But to focus on a proposal for a particular feature or supplement 
to 10646 seems to me to be premature, as we should have learned from 
compaction method 5.

In particular, you say, under "header type"...

Only one will not be used:

  - ISO-10646

      Out of date. (We suppose ISO-10646-ACU will be used instead)

  Again, I don't think we disagree but, since minor misunderstandings on 
this list have periodically led to major explosions, let me be explicit: 
there never has been an ISO 10646; there has only been First DIS 10646.  
The latter is obsolete; the former has not yet existed.  See above for 
remarks on the AUC (I assume that is what you meant) proposal.


(4) Special (non-ASCII) character sets

Because of the possible introduction of multi-octet character sets,
RFC-822 should be interpreted like this:

     Any reference to a specific character in RFC-822 is
     interpreted as a reference to the octet that represents
     this character in US-ASCII.


Unless you propose to forever ban any character set that does not have 
the glyphs of US-ASCII as an unambiguous subset (e.g., not permit 
extensions from the sets you have listed), this language is not quite 
sufficient.  Such a prohibition might be reasonable, but, if it is what 
you intend, the final proposal will need to be specific about it.
 There is some language in the revised and commented RFC-ZZZZ, which 
will be posted to the other list today or tomorrow.  It deals with this issue
in the transport context and, I think, in a very general way. 
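
  To show what "US-ASCII as an unambiguous subset" could mean as a 
test, here is a rough sketch in Python.  The function name and the 
choice of character sets are assumptions made for illustration; the 
sketch is not language from RFC-ZZZZ or from your proposal.  It checks 
only that each US-ASCII character is represented by its own US-ASCII 
octet; it does not check whether those octets can also turn up inside 
multi-octet sequences or escape sequences, which is a separate hazard.

```python
# The 128 US-ASCII characters and their octets, in code-position order.
ASCII_TEXT = "".join(chr(i) for i in range(128))
ASCII_OCTETS = bytes(range(128))


def ascii_octets_preserved(charset):
    """Return True if charset encodes every US-ASCII character as the
    single octet that represents it in US-ASCII.

    This is a one-directional check (hypothetical helper): it says
    nothing about octets below 128 appearing as parts of multi-octet
    characters or escape sequences.
    """
    try:
        return ASCII_TEXT.encode(charset) == ASCII_OCTETS
    except UnicodeEncodeError:
        # Some US-ASCII character is missing from the set altogether.
        return False


if __name__ == "__main__":
    for name in ["iso-8859-1", "iso-8859-5", "cp037", "utf-16"]:
        print(name, ascii_octets_preserved(name))
```

  An 8-bit extension of ASCII passes this check; an EBCDIC code page 
(cp037) or a fixed two-octet form (utf-16) does not, and a header 
parser looking for the US-ASCII octet of ":" or "<" would be misled by 
either.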


(5) Quoted-readable/Mnemonic
Your proposal says...

Note that we only have to introduce extra quoting because of the character
set only arise if we use Quoted-Readable, ISO-2022 and ISO-10646-AUC.

    I don't understand the problem in this situation with Mnemonic.  
Indeed, mnemonic, since it is based on the glyphs of 10646 (and is a 
superset of them) but not the structure of 10646, is the only proposal
on the table that gives us the properties of a universal character set
(e.g., different languages in different headers but within the context
of your proposal) without anticipating the approval of an International
Standard in a particular form.   This, of course, assumes that the Asian 
characters can be worked out in a reasonable fashion.
   ISO 2022 raises several additional problems, arising primarily because 
it is not a character set but a collection of separable rules.  If we 
can avoid providing for it in headers, we can avoid replaying all of the 
arguments about it that ran through the list in the (northern 
hemisphere) spring.

(6) Header parts...

C) What parts of the header does the new header control

Headers:
      Subject:
      Comments:
      Content-Description:
      Summary: (from RFC-1036)
      Organization: (from RFC-1036)


  I would appreciate a comment from the chair on this, but I think IETF 
should be quite reluctant, in a standards-track proposal, to authorize 
or specify treatment for header fields for which no Internet standard, 
or at least standards-track, description exists.  An implementor must 
know what to implement, and informational and experimental RFCs are not 
satisfactory in this regard.  I would welcome a 
standards-track RFC that would specify some of these optional fields, 
but that document would be the place to specify their character set 
encoding.

      ...and all user-defined fields in RFC-822.

   I don't understand what this is intended to mean.  If you are 
referring to the X- fields, the comments above apply even more strongly: 
I don't know how one can specify things one hasn't seen, and whatever 
agreements govern the use of these fields can also include their 
interpretation.

(7) The Received line...
  Your discussion should be extended to all "trace fields", including 
Return-path.  Even though Return-path should be inserted into headers 
only by the final delivery MTA, it has a tendency to creep into 
transported headers and the treatment should be clear.
  Otherwise, your analysis exactly parallels the analysis I recently 
completed from the transport perspective.  Either it is correct or we 
are simultaneously very confused.

(8) Remarks on remarks...

At the beginning we had RFC-822. It defined the characters in the headers
to be 0-127. Some parts of the headers included special characters.

   No, it didn't.  This has been the source of much of the confusion and
criticism of the last few days.  It defined the characters in the 
headers to be [US] ASCII, ANSI X3.4, not semi-arbitrary patterns of 7 
bits.

When we start to use more characters than the 0-127 we have to encode
them in some way. Why not do that in the same, or at least equivalent way,
as the Body of the mail. We therefore saw the possibility to introduce
the Header-Transport-Encoding and the Header-Type.

   Strictly speaking, the first sentence above should begin "When we 
start to use any characters, or interpretations of character positions, 
that are not in [US] ASCII (ANSI X3.4), we have to encode...".  In other 
words, while no implementation has been able to detect the violation or 
enforce the rule (perhaps fortunately), use of a national variation on 
ASCII has technically always required some encoding.
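
   A small sketch in Python of the distinction, offered as illustration 
only.  The table below is written out by hand and is deliberately 
partial: it covers just the six letter positions at which the Swedish 
national variant of ISO 646 is commonly described as differing from 
US-ASCII, and the function names and sample header are hypothetical.

```python
# Partial, hand-written table (an assumption for illustration): letter
# positions where the Swedish national variant differs from US-ASCII.
SWEDISH_VARIANT = {
    0x5B: "Ä", 0x5C: "Ö", 0x5D: "Å",
    0x7B: "ä", 0x7C: "ö", 0x7D: "å",
}


def decode_us_ascii(octets):
    """Interpret 7-bit octets as US-ASCII (ANSI X3.4)."""
    return octets.decode("us-ascii")


def decode_swedish(octets):
    """Interpret the same 7-bit octets under the partial table above."""
    chars = []
    for b in octets:
        if b > 0x7F:
            raise ValueError("not a 7-bit octet: 0x%02X" % b)
        chars.append(SWEDISH_VARIANT.get(b, chr(b)))
    return "".join(chars)


if __name__ == "__main__":
    header = b"Subject: r{ksm|rg}s"   # hypothetical sample header
    print(decode_us_ascii(header))
    print(decode_swedish(header))
```

   The octets are identical and all lie in 0-127; only the 
interpretation of the code positions differs, which is why "0-127" is 
not the same statement as "[US] ASCII".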

    --john