ietf-822

UTF-7 sucks (was Re: Format=Flowed and Non-ASCII Characters)

1998-10-30 12:46:29
On Mon, 26 Oct 1998, Ashley Yakeley wrote:
> ...but I'd like to point out that quoted-printable may not be necessary 
> to encode non-US-ASCII, given UTF-7.
>
> The question is, which is better for the existing (international) base of 
> installed users as would be recipients: using qp (at the transport 
> level), or using UTF-7 (at the content-type level)?

NO! NO! ARRRGGHGGHGHH!   Don't use UTF-7 in email.

Am I going to have to dust off the draft on "UTF-7 in MIME considered
harmful"?

UTF-7 has the following problems:

* Far worse backwards compatibility than quoted-printable.  Some people
hate MIME due to what they consider "quoted-unreadable".  If you use
UTF-7, there will be lots of people who will start hating Unicode as a
result.

* Massive layering violation -- it uses a content-transfer-encoding in a
character set, thus forcing base64 code in multiple layers of code.

* Double-encoding.  UTF-7 is (modified) base64-encoded UTF-16, which is
itself an encoding of UCS-4.  Bletch.

* Searches are impossible without decoding UTF-7.  (And a signed message
can't be converted to something else internally, since there's no
canonical form of UTF-7 to convert it back to.)

* The uptake of the 8BITMIME SMTP extension has been excellent.  This
means that unencoded UTF-8 with CTE 8-bit is usually possible.  UTF-8
doesn't have double-encoding or searching problems.
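
To make the canonical-form and searching points concrete, here is a
minimal sketch using Python's built-in "utf-7" codec.  The sample word
and both UTF-7 spellings of it are my own hand-worked illustrations,
not taken from any real message.  The two byte strings differ yet decode
to the same text, and a raw byte search for one character only works on
the spelling that happens to encode it on its own.

```python
# Hypothetical sample text; any word with two adjacent non-ASCII
# characters would show the same effect.
text = "Grüße"

# Two hand-written UTF-7 spellings of the same text (assumed examples):
form_a = b"Gr+APwA3w-e"      # one base64 run covering both characters
form_b = b"Gr+APw-+AN8-e"    # a separate base64 run per character

# Both are valid UTF-7 for the same string: there is no canonical form.
same_text = form_a.decode("utf-7") == text == form_b.decode("utf-7")
different_bytes = form_a != form_b

# Raw search for the sharp s, spelled as its own base64 run.
eszett_utf7 = b"+AN8-"
found_in_a = eszett_utf7 in form_a   # missed: the bits are shifted in the run
found_in_b = eszett_utf7 in form_b   # found only in this spelling

# In UTF-8 every character has exactly one byte sequence, so a raw
# byte search works without decoding.
found_in_utf8 = "ß".encode("utf-8") in text.encode("utf-8")

print(same_text, different_bytes, found_in_a, found_in_b, found_in_utf8)
```

The same mismatch is why re-encoding a decoded UTF-7 body cannot be
relied on to reproduce the original bytes that a signature covers.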

I could go on.  People who generate UTF-7 are doing a great disservice to
the Internet, IMHO. 

                - Chris

