ietf-822

Re: internationalization of mail

2004-08-27 03:21:33

On Aug 27 2004, Tex Texin wrote:

If a Thai font is used, expecting data in a Thai encoding (TIS
620-2533, for example), it may not matter that the data is mislabeled:
as long as the bytes are unchanged, the font will display the right
character for each byte. But if the Thai data has gone through a
conversion from 8859-1 to UTF-8, the bytes will no longer show the
correct Thai characters, so in this situation simply changing fonts
no longer works.
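
To make the scenario above concrete, here is a minimal Python sketch. The
sample string and the function name are assumptions for illustration only;
"tis_620" and "iso-8859-1" are the names of Python's standard codecs, not
anything taken from the original message.

```python
# Hypothetical sample text; any string of Thai letters would do.
THAI_TEXT = "ภาษาไทย"


def convert_mislabeled(raw: bytes) -> bytes:
    """Treat the bytes as ISO-8859-1 (the wrong label) and convert to UTF-8."""
    return raw.decode("iso-8859-1").encode("utf-8")


# The bytes as the sender wrote them, one byte per Thai character.
raw = THAI_TEXT.encode("tis_620")

# Case 1: mislabeled but untouched. Reading the bytes as TIS-620
# (the equivalent of switching to a Thai font) still works.
untouched_view = raw.decode("tis_620")

# Case 2: a gateway trusted the 8859-1 label and converted to UTF-8.
# Every original byte has become a two-byte sequence.
converted = convert_mislabeled(raw)

# Reading the converted bytes as TIS-620 no longer gives the original text.
converted_view = converted.decode("tis_620", errors="replace")

print(untouched_view == THAI_TEXT)
print(len(raw), len(converted))
print(converted_view == THAI_TEXT)
```

The point of the sketch is only that the second case changes the bytes
themselves, so no choice of font on the receiving side can undo it.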

Have you considered using a statistical character set detector?
A quick Google search turns up enca (http://trific.ath.cx/software/enca/);
if its license isn't acceptable, there are probably others out there.

This would permit high-confidence decisions about when to convert an
incoming message, although the problems you described would still
arise after conversion whenever the detection was incorrect.
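
As a rough illustration of the idea, here is a toy detector in Python. It is
not how enca works and not a substitute for a real detector: the function
name, the three candidate charsets, the 0.5 threshold and the confidence
values are all assumptions made up for this sketch.

```python
def guess_charset(data: bytes, threshold: float = 0.5):
    """Return (charset, confidence) for a byte string. Toy heuristic only."""
    high = [b for b in data if b >= 0x80]
    if not high:
        # No bytes above 0x7F: plain ASCII, nothing to convert.
        return ("us-ascii", 1.0)

    # Non-ASCII text that happens to be valid UTF-8 is rarely an accident.
    # The 0.9 is an arbitrary placeholder, not a measured probability.
    try:
        data.decode("utf-8")
        return ("utf-8", 0.9)
    except UnicodeDecodeError:
        pass

    # Crude statistic: Thai text in TIS-620 is almost all high bytes,
    # while Western European text is mostly ASCII with a few accents.
    ratio = len(high) / len(data)
    try:
        data.decode("tis_620")
        thai_possible = True
    except UnicodeDecodeError:
        thai_possible = False

    if thai_possible and ratio >= threshold:
        return ("tis-620", ratio)
    return ("iso-8859-1", 1.0 - ratio)


# Hypothetical inputs, ignoring whatever charset label the message carried.
print(guess_charset("ภาษาไทย".encode("tis_620")))
print(guess_charset("café au lait".encode("iso-8859-1")))
print(guess_charset(b"plain ascii"))
```

A mail client could convert only when the confidence is high and otherwise
leave the bytes alone. A wrong guess still produces the damage described
above.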

-- 
Laird Breyer.