ietf-822

Re: SPEAK NOW OR HOLD YOUR PEACE (Was Re: Yet another proposal for non-ASCII chars in headers)

1991-10-25 15:51:22
Date: Fri, 25 Oct 91 20:37:59 +0100
From: Keld J|rn Simonsen <keld(_at_)dkuug(_dot_)dk>
To: NED(_at_)hmcvax(_dot_)claremont(_dot_)edu, 
nsb(_at_)thumper(_dot_)bellcore(_dot_)com
Subject: Re: SPEAK NOW OR HOLD YOUR PEACE (Was Re: Yet another
proposal for non-ASCII chars in headers)
Cc: ietf-822(_at_)dimacs(_dot_)rutgers(_dot_)edu

Concerning Keith Moore's header proposal:
[...]
2. If we go along with the Moore proposal, I would like two
things addressed:

a. The mnemonic encoding added, perhaps as encoding "M" and as defined
in the Freed/Smart proposal. This would improve readability in many
cases for unextended viewers.

I don't personally have any objection to defining a charset-number
for Mnemonic.  However, I don't think Mnemonic should be an encoding,
because the Mnemonic text will still need to be Q-encoded to be legal
in some places in the message header.

b. It seems to me that there are problems in the text string, as
special characters are allowed there. This certainly leads
to problems, even with conformant RFC 822 implementations.
Some encoding of the problem characters, as in the Freed/Smart proposal,
is needed.

I'm not sure what you mean by "text string", but the intent of
my proposal was to:

(a) disallow any unencoded special characters in a context where those
special characters would cause problems  (e.g. in a "phrase" preceding an
address)

   Of the printable non-alphanumeric characters, only "!", "*", "+", "-",
   "/", "=", and "_" are allowed in this context.    (Will any of these
   characters cause problems?)

   This automatically happens if the text is B-encoded.  If Q-encoding is
   used, other characters must be represented in hex prefixed with an '='.
   (20 hex can be represented by "_".)
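
To make the Q-encoding rule in (a) concrete, here is a minimal sketch in
Python. It is an illustration only, not text from the draft. The name
q_encode_phrase is hypothetical, and the sketch assumes that a literal "="
or "_" in the source text must itself be hex-encoded, since those two
characters carry special meaning inside Q-encoded text.

```python
import string

# Characters passed through literally in a "phrase" context: alphanumerics
# plus "!", "*", "+", "-", "/".
# Assumption: "=" and "_" are legal in the encoded output, but a literal "="
# or "_" in the input is hex-encoded because "=" introduces a hex pair and
# "_" stands for SPACE.
PHRASE_SAFE = set(string.ascii_letters + string.digits + "!*+-/")


def q_encode_phrase(raw: bytes) -> str:
    """Q-encode raw bytes so the result is safe inside a "phrase"."""
    out = []
    for byte in raw:
        ch = chr(byte)
        if byte == 0x20:
            out.append("_")            # 20 hex may be written as "_"
        elif ch in PHRASE_SAFE:
            out.append(ch)             # safe characters pass through
        else:
            out.append("=%02X" % byte)  # everything else: "=" plus two hex digits
    return "".join(out)
```

For example, q_encode_phrase("Keld Jørn".encode("latin-1")) yields
"Keld_J=F8rn", and the specials in "(x)" come out as "=28x=29".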

(b) allow use of special characters where they would cause no problems (e.g.
in a Subject header), for maximum readability.

   The only restrictions imposed here are that "?" and SPACE are not
   allowed, because these are needed for delimiting encoded-words.

   There is a problem with the current draft in that it allows an
   encoded-word to contain "(", ")", or "\" when appearing in a
   "ctext".  These characters should not be allowed to appear within
   a comment.
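
The context-dependent restrictions in (a) and (b), together with the
proposed fix for comments, can be sketched as a small checker. This is a
rough illustration under stated assumptions: the context labels "phrase",
"text", and "comment" and the function name offending_chars are hypothetical
and do not come from the draft.

```python
import string

# Context (a): only these characters may appear in the encoded text.
PHRASE_ALLOWED = set(string.ascii_letters + string.digits + "!*+-/=_")

# Context (b) and comments: everything is allowed except the listed characters.
# "?" and SPACE delimit encoded-words; "(", ")" and "\" are additionally
# excluded inside a comment, as suggested above.
FORBIDDEN = {
    "text": set("? "),
    "comment": set("? ()\\"),
}


def offending_chars(encoded_text: str, context: str) -> list:
    """Return the sorted characters of encoded_text not permitted in context."""
    if context == "phrase":
        bad = {c for c in encoded_text if c not in PHRASE_ALLOWED}
    else:
        bad = {c for c in encoded_text if c in FORBIDDEN[context]}
    return sorted(bad)
```

With this sketch, "a(b)" is accepted in the "text" context, while both the
"comment" and "phrase" contexts report "(" and ")" as offending.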

c. The same encoding of problem characters should be done for the
charset field.

   This is a problem with my proposal as currently written.  Either RFC XXXX
   could further restrict the possible set of charset names so that no
   encoding is necessary when these names are used in an encoded-word, or I
   could change my proposal to state that the charset-name field is always
   implicitly Q-encoded.