Keith, in general I agree with all your changes. I'm only going to comment
in places where I have something additional to say.
1. Title
Problem: title doesn't include the word "MIME"
Suggested fix: rename document to:
"MIME: Message Header Extensions for Non-ASCII Text"
(Maybe the docs should say "MIME: Part 1" and "MIME: Part 2"?)
I think this is an excellent idea and one we should definitely implement.
4. multi-byte character sets:
Problem: RFC 1342 did not consider multi-byte character sets,
and character sets with switching sequences (e.g. ISO-2022-JP).
Suggested fixes:
1. An encoded-word must encode an integral number of characters.
2. If a charset uses code-switching sequences to switch between "ASCII
mode" and other modes, each encoded-word implicitly begins in "ASCII
mode", and if necessary, must contain appropriate sequences such that
the charset interpreter is again in "ASCII mode" at the end of the
encoded-word.
Another way to handle it would be to declare that any mode is reset
automatically at the end of an encoded-word. I would recommend a combination
of both: state that all software generating encoded-words MUST reset to "ASCII
mode" at the end of the encoded-word, and that all display software MUST
perform such a reset regardless of the state in which the encoded-word leaves
it.
Do we want to say US-ASCII and not ASCII here?
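To make the two rules for multi-byte charsets concrete, here is a minimal
Python sketch. The function names and the chars_per_word parameter are
hypothetical. It assumes that Python's iso-2022-jp codec starts each
encode() call in ASCII mode and emits ESC ( B at the end, and that splitting
on string index keeps characters whole, which holds for ordinary
ISO-2022-JP text.

```python
import base64

def encode_words(text, charset="iso-2022-jp", chars_per_word=4):
    """Split text into B-encoded words, each holding whole characters only."""
    words = []
    for i in range(0, len(text), chars_per_word):
        # Slice by character, never by byte, so no character is split.
        chunk = text[i:i + chars_per_word]
        # Each chunk is encoded on its own, so every encoded-word begins in
        # ASCII mode and carries its own escape back to ASCII at the end.
        raw = chunk.encode(charset)
        payload = base64.b64encode(raw).decode("ascii")
        words.append("=?%s?B?%s?=" % (charset, payload))
    return words

def decode_word(word):
    """Decode one B-encoded word on its own, with no state carried over."""
    _, charset, encoding, payload, _ = word.split("?")
    if encoding.upper() != "B":
        raise ValueError("this sketch only handles the B encoding")
    # A fresh decode per word is the reader-side reset to ASCII mode.
    return base64.b64decode(payload).decode(charset)
```

Because decode_word() never carries state from one encoded-word to the next,
a reader built this way performs the reset no matter what the generator did.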
5. conformance section:
RFC 1342 currently states:
A mail composing program claiming compliance with this specification
MUST ensure that any string of printable ASCII characters in a
message header that begins with "=?" and ends with "?=" be a valid
encoded-word.
There are many places in a header where such strings are legal, but where
an encoded-word isn't. For example, in an address:
To: =?foo?=@some.where
We should not require the mail composer to quote the "=?foo?=" (since
this might even change the meaning of the address), and we don't want
this treated as an encoded-word.
Suggested fix:
Change the above "compliance" paragraph to read:
A mail composing program claiming compliance with this specification
MUST ensure that any string of printable ASCII characters in a "text"
entity within a header, or any "atom" within a "phrase", that begins
with "=?" and ends with "?=" be a valid encoded-word.
Don't you also need to mention "ctext" here?
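As a rough illustration of the distinction at issue, the Python sketch below
separates "begins with =? and ends with ?=" from "is a syntactically valid
encoded-word". The pattern is a simplified approximation of the encoded-word
syntax (charset, then B or Q, then encoded text), not the full grammar, and
it ignores the length limit. The function names are hypothetical.

```python
import re

# Simplified shape: =?charset?encoding?encoded-text?= with encoding B or Q.
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([BbQq])\?([^?\s]+)\?=")

def looks_like_encoded_word(token):
    """True for any token that merely begins with =? and ends with ?=."""
    return token.startswith("=?") and token.endswith("?=")

def is_valid_encoded_word(token):
    """True only if the whole token matches the simplified pattern."""
    return ENCODED_WORD.fullmatch(token) is not None
```

The local part of the address above, "=?foo?=", satisfies the first test but
not the second, which is why the compliance wording has to name the places
where the check applies.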
6. header folding:
Problem: RFC 1342 contains the sentence:
"Message header lines that contain one or more encoded-words should be
no more than 76 characters long."
Someone has suggested that this might be misconstrued to restrict
the length of an entire header field.
I agree most emphatically; people often cite the RFC 821 limits on line
length as if they restricted overall header length. This sort of thing has
to be nipped in the bud.
Suggested fix: Change to "Each line of a message header field that
contains an encoded-word should be no more than 76 characters long."
Reemphasizing that header fields have no real length restrictions at all would
also be appropriate.
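A small Python sketch of the per-line reading of the rule, assuming the field
is handed over already folded into physical lines. The function name and the
limit parameter are hypothetical, and the test for "contains an encoded-word"
is deliberately crude.

```python
def overlong_lines(folded_field, limit=76):
    """Return the physical lines that hold an encoded-word and exceed limit."""
    bad = []
    for line in folded_field.splitlines():
        # Only lines carrying an encoded-word are subject to the limit.
        if "=?" in line and "?=" in line and len(line) > limit:
            bad.append(line)
    return bad
```

The check looks at each line separately, so a field folded into many short
lines passes however long the field is in total.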
That's it. A clean and clear set of changes overall.
Ned