On December 15, 1999 at 16:53, Koichi Nakatani wrote:
Still: imho the proper thing to do would be to honor the language
of the incoming message all the way to the generated HTML.
Imagine you have an international list. People post in Korean,
Chinese, some even in English. Which encoding do I want to
force on them? None!
I can even imagine a message with two parts, one in Korean, and
the other in Chinese. In that case, text in two languages must
coexist in a single HTML file. What character encoding scheme
can be used?
Yep, and different character sets could be used. HTML is not
designed to switch character sets within the same document.
I would recommend UTF-8N in such a case. Pattern matching in
UTF-8N (or UTF-8) is relatively easy.
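
To illustrate why matching in UTF-8 is easy: lead bytes and
continuation bytes occupy disjoint ranges, so a byte-level search
for a validly encoded string can only match on a character
boundary. Below is a minimal sketch in Python (chosen only for
illustration); the sample strings and variable names are made up
for this example.

```python
# Hypothetical sample text mixing Korean and Chinese in one document.
korean = "한국어"
chinese = "中文"
document = f"{korean} and {chinese} in one file".encode("utf-8")

# Plain byte-level substring search; no charset-aware matcher needed.
needle = chinese.encode("utf-8")
byte_offset = document.find(needle)

# The match starts on a character boundary, so the prefix decodes
# cleanly and gives the character offset of the match.
char_offset = len(document[:byte_offset].decode("utf-8"))
```

The same search works with any byte-oriented matcher, as long as
both the pattern and the text are valid UTF-8.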
UTF-8 support should become standard in Perl soon. The text/plain
filter could be modified to support translation into UTF-8. I am
unsure if there will be a standard set of charset->UTF-8 converters.
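
A rough sketch of what such a translation step might look like,
written in Python rather than Perl purely for illustration. It is
not the actual text/plain filter: the function name
part_to_utf8_html and the sample parts are hypothetical, and it
assumes each part's declared charset is known and has a codec
available.

```python
import html

def part_to_utf8_html(raw: bytes, charset: str) -> bytes:
    """Decode one message part from its declared charset and
    return HTML-escaped text encoded as UTF-8."""
    # Undecodable bytes become U+FFFD instead of aborting the run.
    text = raw.decode(charset, errors="replace")
    return html.escape(text).encode("utf-8")

# Hypothetical two-part message: one Korean part, one Chinese part.
korean_part = "안녕하세요 <list>".encode("euc_kr")
chinese_part = "你好 & 再见".encode("gb2312")

# Both parts end up in a single UTF-8 page body.
page = b"\n".join([
    part_to_utf8_html(korean_part, "euc_kr"),
    part_to_utf8_html(chinese_part, "gb2312"),
])
```

The generated page would then need to declare UTF-8 as its
charset so that browsers render both parts correctly.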
Another possibility is to use the ISO-2022-JP-2 encoding or one of
its variants. But not many people actually use this encoding, and
I imagine very few people want to deal with the nightmare of
ISO 2022-style encodings.
Let's avoid this.