ietf-822

Re: UTF-7 vs. UTF-8 for fallback charset?

2001-12-05 12:48:14

Hi,

The following might be of some use to you in making a decision.
Let me provide background info on what we did at Netscape for Communicator 4.x and Netscape 6/Mozilla mail.

1. Prior to Comm 4.5, we recommended UTF-7 as the default Unicode charset for mail.

2. After consulting several mail experts inside and outside Netscape, we decided to go with UTF-8 as the default charset for Comm 4.5, primarily for two reasons. First, UTF-7 never gained the status of the default Unicode charset; instead, UTF-8 was becoming the default. Second, 8-bit encodings are not that problematic these days, as evidenced by the fact that they are used as standard mail encodings for Korean, Chinese and European languages.

3. While UTF-7 can be displayed by most major mail programs today, there is the question of promoting a single Unicode mail encoding. We thought it would be better to stick to one encoding for sending Unicode messages. For displaying messages, we do support UTF-7.

4. There is a bit of a problem in Mozilla/Netscape 6 when replying to messages encoded in UTF-7. The mail compose window lists only standard (or near-consensus standard) encodings. When replying to a UTF-7 message, the charset of the reply will reflect the original encoding, but that item (UTF-7) will not be visible in the Compose window's encoding menu because it is not on the default list of recommended mail-send encodings. One might say that this is a bug in Mozilla, but my feeling is that we should promote only standard or near-standard mail encodings. I don't believe we should unnecessarily increase the number of send-mail encodings in use.
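
As a rough illustration of the 7-bit versus 8-bit distinction behind point 2, the Python sketch below encodes the same string both ways with the standard library codecs. The helper name is_seven_bit and the sample string are made up for this example; it says nothing about how Netscape or Mozilla implement their encoders.

```python
def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte fits in 7 bits, i.e. the data is plain ASCII."""
    return all(byte < 0x80 for byte in data)


# Hypothetical sample text containing one non-ASCII character.
sample = "Stärke"

utf7_bytes = sample.encode("utf-7")  # non-ASCII characters become '+...-' runs
utf8_bytes = sample.encode("utf-8")  # non-ASCII characters become bytes >= 0x80

print(utf7_bytes, is_seven_bit(utf7_bytes))
print(utf8_bytes, is_seven_bit(utf8_bytes))
```

The UTF-7 form can be sent as is over a 7-bit transport, while the UTF-8 form needs either an 8-bit clean path or a Content-Transfer-Encoding such as quoted-printable or base64.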

- Kat

Marc Mutz wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wednesday 05 December 2001 16:37, Keith Moore wrote:
> > Is it safe to make UTF-7 the charset to use as "last resort" instead of UTF-8?
>
> No. The only charset supported by all major MUAs is ASCII.

Though I think you know very well what I meant, let me restate the question more precisely ;-)

When sending a textual mail, we have to choose a charset that is capable of representing all contained characters (internally, we use Unicode) to use for the text/plain body part. We currently try us-ascii, <locale charset>, utf-8 (in that order). So you could see UTF-8 as a fallback for _text/plain bodypart charsets_.

Now, we are discussing making UTF-7 that default instead of UTF-8, because it doesn't need further Content-Transfer-Encoding processing on top of it. Would we seriously decrease interoperability with other mail clients by doing this?

And don't tell me: "Yes, with all those who don't speak anything but us-ascii", because they are not a likely target of a mail that someone sends but which can't be represented in his own locale's charset. Also, the question is not "shall we use charsets other than us-ascii", but "is it OK to use UTF-7 where we'd else use UTF-8"?

Hope I made my question clear enough this time. :-)

Marc

- --
An der Prioritätenskala wehrhafter Demokratien gibt es nichts zu deuteln: Erst kommt die Stärke, dann Freiheit Hand in Hand mit der Gerechtigkeit und in der Nachhut trottet der Frieden hinterher.
    -- Goedart Palm "Der Friedensnobelpreis und ein bisschen Krieg"
       Telepolis 2001/10/13
(#9807)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE8Dmmc3oWD+L2/6DgRAj3uAKCbzdKqh099aal6TbAjg/zh7zZKNACfRJ3W
vIvm0mtF8Tu5eZ8BwHsqVtE=
=2mo6
-----END PGP SIGNATURE-----
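
The fallback order Marc describes (us-ascii, then the locale charset, then a Unicode charset of last resort) could be sketched in Python roughly as follows. The function name pick_charset and its parameters are hypothetical, and this is only an assumption about how such a chain might look, not the code of any actual mail client.

```python
import locale


def pick_charset(text, locale_charset=None, fallback="utf-8"):
    """Return the first charset in the chain that can represent `text`.

    The chain is us-ascii, then the locale charset, then `fallback`
    (utf-8 here; utf-7 is the alternative being discussed).
    """
    if locale_charset is None:
        # Assumption: use whatever the platform reports as preferred.
        locale_charset = locale.getpreferredencoding(False)

    for charset in ("us-ascii", locale_charset, fallback):
        try:
            text.encode(charset)
        except (UnicodeEncodeError, LookupError):
            # Either the text does not fit or the codec name is unknown.
            continue
        return charset

    raise ValueError("no charset in the chain can encode the text")


print(pick_charset("hello", "iso-8859-1"))
print(pick_charset("Stärke", "iso-8859-1"))
print(pick_charset("日本語", "iso-8859-1"))
print(pick_charset("日本語", "iso-8859-1", fallback="utf-7"))
```

Swapping the last-resort charset only changes which label and byte form the final step produces; the first two steps behave the same either way.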


--
Katsuhiko Momoi <momoi(_at_)netscape(_dot_)com>
Senior International Manager, Web Standards/Embedding
Netscape Technology Evangelism/Developer Support