In my opinion the _subproblem_ of extending the _*text-type_
headers should be very easy to solve: These are headers
intended only to be interpreted by human readers. They have no
internal structure. Users expect to be able to use all those
characters in the Subject line that they can use in the rest of
the message. Exactly the same machinery that must be used to
decode and display an RFC-XXXX message body can be applied to
these headers.
I really can't see any problem with the idea of introducing two
new headers

    Text-Header-Field-Type:
    Text-Header-Field-Transfer-Encoding:

which are strictly parallel to

    Content-Type:
    Content-Transfer-Encoding:

(There should be _separate_ Type and Encoding headers for the
content of text headers. If e.g. the Content-Type is Multipart,
you may still want to write the Subject line in ISO-8859-1. If
it is Image, the Content-Description header may contain
important information for the receiver that should be writable
in languages other than English.)
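
To illustrate how little extra machinery this would take, here is
a minimal sketch in Python of how a receiving UA might decode a
text header using the two proposed headers. The function name
decode_text_header is hypothetical, and the "text/plain; charset=..."
parameter syntax is an assumption borrowed from the Content-Type
form; the draft may spell it differently. Only the standard base64
and quopri modules are used.

```python
import base64
import quopri


def decode_text_header(raw: bytes, field_type: str, transfer_encoding: str) -> str:
    """Decode the body of a text-only header (e.g. Subject).

    field_type        -- value of the proposed Text-Header-Field-Type header
    transfer_encoding -- value of the proposed
                         Text-Header-Field-Transfer-Encoding header
    Both header names come from the proposal; this function is only a sketch.
    """
    # Step 1: undo the transfer encoding, exactly as for a message body.
    encoding = transfer_encoding.strip().lower()
    if encoding == "base64":
        data = base64.b64decode(raw)
    elif encoding == "quoted-printable":
        data = quopri.decodestring(raw)
    elif encoding in ("7bit", "8bit", "binary"):
        data = raw
    else:
        raise ValueError("unknown transfer encoding: " + transfer_encoding)

    # Step 2: find the charset parameter (assumed syntax: type; charset=NAME).
    charset = "us-ascii"
    for param in field_type.split(";")[1:]:
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset":
            charset = value.strip().strip('"')

    # Step 3: convert the octets to characters for display.
    return data.decode(charset)
```

For example, with a Text-Header-Field-Type of
"text/plain; charset=ISO-8859-1" and a
Text-Header-Field-Transfer-Encoding of "quoted-printable", the raw
Subject octets R=E4ksm=F6rg=E5s decode to "Räksmörgås", whatever
the Content-Type of the body happens to be.
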
Because this addition is so simple and harmless I don't think it
should delay the finalization of RFC-XXXX much.
You suggest a temporary "status quo" solution for headers:

    Message headers MUST be in US-ASCII, with the exception of the
    text of a Subject: or Comment: header. The text of a Subject:
    or Comment: header is unparsed in its entire content, and MAY
    be in an alternate character set provided that that alternate
    character set otherwise follows the characteristics of US-ASCII

This has several disadvantages in practice, though:
- The writer of a message will not be able to use the same wide
character repertoire in the Subject line as in the rest of
the message. Since most ordinary users, not without good
reason, regard this header as only the first and most visible
line of what they can write in an electronic letter, they
will probably regard any special restriction as troublesome
and highly unnatural.
- The UA must be able to handle character codes in headers
which RFC-XXXX makes unnecessary in the message body and even
explicitly recommends not using there. "It is the opinion of
the authors of this memo that a large number of character
sets is NOT a good thing." to quote directly from the May
draft of RFC-XXXX.
- The writer of a message must in some way tell the UA which
character code to use in the text headers (so that the UA
either can refuse to accept characters in the repertoire of
the message body that can't be represented in the 7-bit code
of the header, or can use a proper conversion from the
message body character code to the text header character
code). E.g. in Sweden every user who sometimes also writes
messages in languages other than Swedish will be affected by
this unnecessary requirement.
- The receiving UA has no way of finding out which national
version of ISO 646 to use for displaying the Subject header.
It will probably use a reasonable default. In Sweden most
users will have their UAs display the Subject line in
Swedish 7-bit code, but then incoming mail from Germany or
France, coded with their ISO 646 variants, will seem to have
very peculiar Subject lines. (German U umlaut will be
displayed as an A with ring above, French small c with
cedilla will be displayed as a capital O umlaut etc.)
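
The display problem in the last point can be reproduced with a
small Python sketch. The tables below are deliberately partial:
they list only a few national-use positions of the Swedish, German
and French ISO 646 variants, which is enough for the two examples
mentioned above. The names SWEDISH, GERMAN, FRENCH, encode and
render are my own, chosen for illustration.

```python
# Partial ISO 646 national-variant tables (illustrative, not complete).
# Each maps a 7-bit code position to the character that variant assigns to it.
SWEDISH = {0x5B: "Ä", 0x5C: "Ö", 0x5D: "Å", 0x7B: "ä", 0x7C: "ö", 0x7D: "å"}
GERMAN = {0x5B: "Ä", 0x5C: "Ö", 0x5D: "Ü", 0x7B: "ä", 0x7C: "ö", 0x7D: "ü"}
FRENCH = {0x5C: "ç", 0x7B: "é", 0x7D: "è"}


def encode(text: str, variant: dict) -> bytes:
    """Encode text as 7-bit octets using one national variant."""
    reverse = {char: code for code, char in variant.items()}
    # Characters outside the partial table are assumed to be plain ASCII.
    return bytes(reverse.get(char, ord(char)) for char in text)


def render(data: bytes, variant: dict) -> str:
    """Display 7-bit octets the way a UA set to the given variant would."""
    return "".join(variant.get(code, chr(code)) for code in data)


# A German and a French sender each use their own variant ...
german_subject = encode("Übung", GERMAN)
french_subject = encode("français", FRENCH)

# ... but a Swedish UA has no label to go by and applies its own default.
print(render(german_subject, SWEDISH))
print(render(french_subject, SWEDISH))
```

Nothing in the octets themselves tells the receiving UA which table
to apply, which is exactly the information an explicit header
would carry.
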
And all this trouble for email users, system managers
(explaining the special case of the Subject line to users) and
UA implementers is totally unnecessary. A clean solution is
already there in RFC-XXXX. We only have to extend it from the
message body to the text-only headers.
--
Olle Jarnefors
Information Management Services
Royal Institute of Technology (KTH)
SE-100 44 Stockholm, Sweden
Internet: ojarnef(_at_)admin(_dot_)kth(_dot_)se
UUCP:     ...!uunet!mcsun!sunic!kth!ojarnef
BITNET:   ojarnef(_at_)sekth
Phone:    +46-8-790 71 26
Fax:      +46-8-10 25 10
Telex:    11421 KTH S