ietf-822

Re: New-ish idea on non-ascii headers

1991-09-19 17:45:59

Excerpts from internet.ietf-822: 17-Sep-91 Re: New-ish idea on non-asc..
Ned Freed(_at_)hmcvax(_dot_)claremo (4148)

I agree completely with Bob on this too. In proposing that mnemonic encoding
be used in headers, we are in a position to make NO changes to existing header
syntax. No change at all, period. The only, repeat ONLY, change would be to
the semantics applied to headers when they are DISPLAYED. And this change would
only apply when a new (preferably never before used) header line is present.

I guess my concern with the "just-use-mnemonic" scheme is simply that
the implications for 822 header fields don't appear to have been fully
thought out.  In particular, from the August draft on mnemonic it is
kind of hard to figure out quickly which characters are permitted, since
that has been spun off to a separate document that I've lost my copy of,
but an earlier (May) draft suggests that the following characters all
may actually appear as part of mnemonic encodings:

! > ? - ( : , _ " / ; <

Well, sure. And these can all appear as part of a comment or a quoted string
or as part of an unstructured header field in RFC822 as well. The trick is
to quote things properly. Mnemonic does not change this.

I see the basic process as this:

(1) Start with text in the character set of your choice.
(2) Apply mnemonic encoding.
(3) Apply quoting as necessary (this depends totally on what you're 
    generating). Note that every single RFC822 field we want this to apply
    to is either unstructured or has what I call "complete" quoting rules --
    you can quote absolutely any US-ASCII text you want.

This process is reversed when you want to display such material.
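
The three steps above, and their reversal for display, can be sketched
roughly as follows. This is only an illustration under loud assumptions:
the two-entry table and the "&" introducer are toy stand-ins rather than
the mnemonic draft's actual tables, and all function names are made up
for this sketch. The quoting steps follow the RFC822 rules for
quoted-strings (backslash before `"` and `\`) and for comments
(backslash before `(`, `)` and `\`).

```python
# Toy stand-in for a mnemonic table. The entries and the "&" introducer are
# assumptions for illustration only, not taken from the mnemonic draft.
TOY_MNEMONICS = {"\u00e9": "&e'", "\u00e4": "&a:"}
TOY_REVERSE = {code: char for char, code in TOY_MNEMONICS.items()}


def toy_mnemonic_encode(text):
    """Step 2: map non-ASCII characters to ASCII mnemonics."""
    out = []
    for ch in text:
        if ch == "&":
            out.append("&&")  # double a literal introducer so decoding is unambiguous
        else:
            out.append(TOY_MNEMONICS.get(ch, ch))
    return "".join(out)


def toy_mnemonic_decode(text):
    """Reverse of step 2."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith("&&", i):
            out.append("&")
            i += 2
        elif text[i:i + 3] in TOY_REVERSE:
            out.append(TOY_REVERSE[text[i:i + 3]])
            i += 3
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def quote_as_quoted_string(text):
    """Step 3 for a phrase: RFC822 quoted-string, escaping backslash and '"'."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def unquote_quoted_string(quoted):
    """Reverse of step 3: drop the outer quotes and undo quoted-pairs."""
    body = quoted[1:-1]
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            i += 1  # skip the backslash, keep the character it protects
        out.append(body[i])
        i += 1
    return "".join(out)


def quote_as_comment(text):
    """Step 3 for a comment: escape backslash and both parentheses."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return "(" + escaped + ")"


def to_header_phrase(text):
    return quote_as_quoted_string(toy_mnemonic_encode(text))


def from_header_phrase(quoted):
    return toy_mnemonic_decode(unquote_quoted_string(quoted))


if __name__ == "__main__":
    print(to_header_phrase('Ren\u00e9 "Bob" M\u00e4rz'))
    # prints: "Ren&e' \"Bob\" M&a:rz"
```

The point of keeping the two layers separate is that the quoting step
never needs to know anything about mnemonic: it sees only US-ASCII text
and applies the ordinary RFC822 rules for whatever construct is being
generated.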

Now, I don't have to tell any of the readers of this list that placing
these characters inside certain 822 headers will cause disasters.  If
the mnemonic version of my name includes all of these characters, for
example (because my middle name is really S! > ? - ( : , _ " / ; < -- I
just use "S." as a shorthand because of the problem it causes for mail!)
then consider the following, which might replace the "From" header on
the message you are now reading:

From:  "Nathaniel S! > ? - ( : , _ " / ; < Borenstein"
<S!>?-(:,_"/;<@my.host.name> (S! > ? - ( : , _ " / ; <)

This is not legal. In fact, this is perilously close to being a straw
man. Nobody ever proposed doing anything like this -- the
rules of RFC822 still apply. There's nothing special about mnemonic
that gets around the problems of RFC822.

If you want to make things simpler, you could say the mnemonic encoding
does not apply to, say, comments, and keep them in US-ASCII. Or you could
pick phrases (see my earlier list) not to encode. Or you could encode
each type of thing separately and mark which ones you did encode somehow.
There are all sorts of variations.

I would also like to reiterate that the same problem Nathaniel identifies
here also applies when you're generating the regular headers (from the
Real- headers). You have to choose some subset of whatever 8-bit stuff
appeared on the Real- headers to appear in the regular headers (note
that the syntax of RFC822 does NOT permit the simple omission of all
the fields we're talking about here -- there are some fields, notably
phrases, that are syntactically mandatory).

As a matter of fact I don't see a lot of difference between Real- headers
and the earlier proposal to insert tags as pointers to other headers that
contain the real text in encoded form.

I doubt that there is a parser in the world that will be able to do
anything useful with this.  Even the comment is dubious.  Now, I don't
think anyone's seriously proposing that this be allowed, but I haven't
seen anything that indicates that this has all been thought out well
enough.  

Maybe I haven't been clear enough in my postings, but I think I have thought
this stuff through. One of the problems is that nobody has posted any
objections -- this is the first one I've seen, really. Since I don't
know what problems people think there are in this stuff I've been pretty
much posting blind, trying to cover various things I think people may have
objections to.

I wouldn't want to include this stuff in RFC-XXXX until all these issues
had been better addressed.  Meanwhile, though, since people aren't
enthused about any of the alternatives I've proposed, I'd be happy to
just leave non-ASCII headers out of RFC-XXXX entirely.  There's no
reason they have to be in the same RFC.  A companion RFC to the mnemonic
RFC might be the best place to talk about this.  Should we just simplify
things by leaving it out of RFC-XXXX?

I think this is an excellent idea for now. Let's get on with it!

                                        Ned
