--On Wednesday, December 5, 2001 16:08 +0100 Marc Mutz
<mutz(_at_)kde(_dot_)org> wrote:
KMail now has UTF-7 support (i.e., it understands that charset but
doesn't actively generate it).
Are there any interoperability concerns with UTF-7?
Yes, there are.
First, many people don't realize that UTF-7 is actually a double encoding.
It's a second layer of encoding on top of UTF-16, which is itself an
encoding of UCS-4. As of the most recent Unicode spec, which assigns
characters beyond the 16-bit range, this is a problem for non-English
languages: those characters need UTF-16 surrogate pairs underneath the
UTF-7 layer. UTF-8 is a single encoding (yes, quoted-printable UTF-8 is
double encoded, but it's handled cleanly by separate layers and isn't
needed often).
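To make the layering concrete, here is a rough Python sketch. The helper
name utf7_by_hand is made up for illustration, and it only handles a single
non-ASCII character; it is not how any particular mail client implements
UTF-7.

```python
import base64

def utf7_by_hand(ch: str) -> bytes:
    """Hypothetical helper: encode one non-ASCII character in two layers."""
    utf16 = ch.encode("utf-16-be")               # layer 1: code point -> UTF-16
    b64 = base64.b64encode(utf16).rstrip(b"=")   # layer 2: base64 without padding
    return b"+" + b64 + b"-"                     # wrapped in a '+' ... '-' shift

# U+00E9 (e with acute accent), three ways:
print(utf7_by_hand("\u00e9"))       # two layers, built by hand
print("\u00e9".encode("utf-7"))     # Python's built-in UTF-7 codec
print("\u00e9".encode("utf-8"))     # UTF-8: one layer, straight from the code point
```

A character outside the 16-bit range, such as U+1F600, goes through a
surrogate pair in the first layer before it is base64 encoded in the second.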
Also, suppose I don't have a Unicode-aware email client, but I do have a
Unicode-aware editor. Most Unicode-aware editors support UTF-16 and UTF-8,
but don't support UTF-7. Thus UTF-8 is more likely to be readable by a
recipient than UTF-7. And in the rare case where 7-bit encoding is needed,
virtually every client can decode quoted-printable, while only UTF-7-aware
clients can do anything useful with UTF-7.
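As a small illustration of the "separate layers" point, the sketch below
uses Python's standard quopri module: the transfer encoding comes off first
and yields plain UTF-8 bytes, which any UTF-8-aware tool can then read. The
sample text is just an assumption for the example.

```python
import quopri

text = "caf\u00e9"                                 # sample text, chosen for illustration

wire = quopri.encodestring(text.encode("utf-8"))   # 7-bit safe form for transport
raw = quopri.decodestring(wire)                    # layer 1 removed: plain UTF-8 bytes
print(wire)
print(raw.decode("utf-8"))                         # layer 2: UTF-8 -> text
```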
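Also note that US-ASCII is _not_ a subset of UTF-7, since UTF-7 steals a
character ("+") as an escape. This can cause all sorts of interoperability
problems, because a plain ASCII message containing that character means
something different when labeled as UTF-7.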
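A quick way to see this, sketched with Python's built-in codecs (the sample
string is arbitrary): the same pure-ASCII text comes out as different bytes
depending on whether it is encoded as US-ASCII or as UTF-7.

```python
text = "a+b"                        # pure ASCII, but contains the UTF-7 escape

as_ascii = text.encode("ascii")
as_utf7 = text.encode("utf-7")      # the '+' has to be written as '+-'

print(as_ascii)
print(as_utf7)
print(as_utf7.decode("utf-7"))      # a UTF-7-aware reader recovers the original
```

A reader that treats the UTF-7 bytes as plain ASCII will show the stray "-".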
I could go on with more interop problems.
UTF-7 is a really bad idea in email. _Please_ don't generate it.
- Chris