Re: new-ish idea on non-ascii headers

Dave Crocker writes...

In my own, very strong, opinion, I believe that ALL participant-specific
information should be carried around with the address, and definitely,
absolutely, positively NOT factored into separate headers.  I
believe this stricture should be applied to all addresses, whether
part of the sending or part of the receiving set.


OK.  This is certainly consistent with parts of the "don't separate" 
comments that several people have raised.

Insofar as part of the mission of the WG (but not necessarily of 
RFC-XXXX) is "fix 822", I think it is time to start thinking about 
models for extending, defining, and structuring address "phrases" with 
the intent of picking up non-ASCII information as part of that process.

As a strawman, consider the following alternative to "real-xxx":
  For compatibility with RFC822, RFC-XXXX messages MUST contain RFC822 
address fields (sending and receiving sets).  However, RFC-XXXX 
processors SHOULD ignore them entirely.
  RFC-XXXX processors MUST provide and support an additional set of 
address headers, even if those just duplicate the RFC822 ones.  Those 
headers have different names, e.g., XXXX-From,..., and have the RHS 
syntax
   participant-specific-information <address>,...
where "address" is required to be in the character glyphs of invariant 
ASCII and, in 7bit SMTP environments, must be in ASCII.  It must also 
conform to all of the other 822 address rules.
  And "participant-specific-information" must have enough structure to 
carry mnemonic or quoted-printable encodings, and to permit lexically 
distinguishing between personal-name, organization, phone number, etc. 
(and, because of the "etc.", must be extensible in some clear and 
unambiguous way).
  Any originating RFC-XXXX processor (sending UA or MTA) that fails to
generate both sets of fields is broken, and a pox upon them.  Any 
destination RFC-XXXX processor (receiving MTA and UA) that pays any 
attention to the RFC-822 headers on receipt is broken, and a pox upon 
them also.  Any message-transforming intermediate MTA that does not 
figure out how to change both sets consistently is broken, and ditto.
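
To make the proposed RHS syntax a little more concrete, here is a minimal 
Python sketch of how an originating processor might build one of the new 
fields.  The function name, the (info, address) pair layout, and the 
example.com address are all hypothetical, and the sketch assumes the 
participant-specific-information has already been encoded for transport 
and contains no "<", ">", or "," of its own.  It enforces only the 
ASCII-address rule, not the rest of the 822 address grammar.

```python
def format_xxxx_field(name, participants):
    """Build one XXXX-style header line (illustrative only).

    participants is a list of (info, address) pairs.  info is the
    participant-specific-information, assumed to be already encoded;
    address must be pure ASCII, per the strawman.
    """
    parts = []
    for info, address in participants:
        # The strawman requires the address itself to stay in ASCII.
        if not address.isascii():
            raise ValueError(f"address must be ASCII: {address!r}")
        parts.append(f"{info} <{address}>")
    return f"{name}: " + ", ".join(parts)


# Hypothetical usage: one participant in the sending set.
line = format_xxxx_field("XXXX-From", [("Jane Doe", "jane@example.com")])
```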

Note that this neatly expands to other transports, e.g., transport of 
DIS 10646bis characters in native 32-bit form.  One would just insist on 
using the glyphs of ASCII in the addresses themselves, would use 
unrestricted 10646 in the participant-specific-information, and, if 
someone needed to push this into a 7bit environment, the characters 
would be translated to ASCII and a designated encoding respectively.

There is, still, a synchronization problem here, but the only processors 
that really have it are those intermediate MTAs that decide to go into 
the conversion business.  Originating RFC-XXXX hosts presumably accept
only the XXXX-forms and generate the 822-forms from them, presumably by
discarding everything but the actual address, so they don't have a synch
problem.  And one doesn't have to synchronize organizations, phone 
numbers, and latitudes and longitudes with the user addresses, which is 
much harder.
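
As a rough illustration of "discarding everything but the actual 
address", the following Python sketch derives an 822-form field from an 
XXXX-form one.  The function name and the "XXXX-" prefix handling are 
assumptions made for the example, and it assumes that angle brackets 
appear only around addresses.  A real processor would need a proper 822 
address parser rather than a regular expression.

```python
import re

# Matches the text between one pair of angle brackets.
ADDR_IN_BRACKETS = re.compile(r"<([^<>]+)>")


def derive_822_field(xxxx_line):
    """Derive the 822-form header from an XXXX-form header (sketch).

    Keeps only the bracketed addresses and drops all of the
    participant-specific-information.
    """
    name, _, rhs = xxxx_line.partition(":")
    if not name.startswith("XXXX-"):
        raise ValueError(f"not an XXXX-form field: {name!r}")
    addresses = ADDR_IN_BRACKETS.findall(rhs)
    # Strip the hypothetical "XXXX-" prefix to get the 822 field name.
    return f"{name[len('XXXX-'):]}: " + ", ".join(addresses)


# Hypothetical usage with two recipients.
legacy = derive_822_field(
    "XXXX-To: Jane Doe <jane@example.com>, Ops <ops@example.org>"
)
```

Because the 822 form is computed from the XXXX form in one direction 
only, the originating host never has two independently edited copies to 
keep in step.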

Please understand that this is strictly a strawman--I'm trying to get 
people to think about this problem a little more generally.

Now I'm going to go put on my flame-proof suit.

    --john