Re: [Nmh-workers] General question - unsupported charset conversion

2014-02-28 12:38:13
Ken Hornstein writes:

> I've been grappling with what to do when we have issues with character set
> conversion.

Unfortunately, I have a lot of experience with, and trouble from,
character set conversion.

> Specifically, I have two issues:
>
> - What to do if the character set is unsupported.
>
> Should we return the original bytes?

That is not the best idea. Some byte sequences are terminal control
sequences, and they can sometimes leave the terminal in an unusable state.
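
As a rough illustration of that concern, here is a minimal sketch of how raw
bytes could be made safe to print if they had to be shown at all. It is not
nmh code; the function name sanitize_for_terminal is hypothetical, and the
policy (keep printable ASCII, tab and newline, escape everything else) is
only an assumption.

```python
def sanitize_for_terminal(raw: bytes) -> str:
    """Return a printable rendering of raw bytes (hypothetical helper).

    Printable ASCII, tab and newline pass through unchanged; every other
    byte, including ESC (0x1b), is shown as a visible \\xNN escape so the
    terminal never interprets it as a control sequence.
    """
    out = []
    for b in raw:
        if b in (0x09, 0x0A) or 0x20 <= b <= 0x7E:
            out.append(chr(b))
        else:
            out.append(f"\\x{b:02x}")
    return "".join(out)


if __name__ == "__main__":
    # ESC [ 2 J would normally clear the screen; here it is only displayed.
    print(sanitize_for_terminal(b"ok\x1b[2J\n"), end="")
```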

> An error? [..]  Some string which says, "We cannot convert
> klingon-8842 to us-ascii" or the equivalent?
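
To make the "some string which says" option concrete, here is a small sketch
that checks whether a charset name is known before attempting a conversion.
It uses Python's codec registry as a stand-in for whatever conversion library
is really in use; the function name describe_unsupported and the wording of
the message are assumptions.

```python
import codecs
from typing import Optional


def describe_unsupported(src: str, dst: str = "us-ascii") -> Optional[str]:
    """Return a placeholder message if charset src is unknown, else None.

    Hypothetical helper: "unknown" here means unknown to Python's codec
    registry, which only approximates what a real converter supports.
    """
    try:
        codecs.lookup(src)
    except LookupError:
        return f"[cannot convert {src} to {dst}: unsupported charset]"
    return None


if __name__ == "__main__":
    print(describe_unsupported("klingon-8842"))
    print(describe_unsupported("utf-8"))
```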


In practice it usually means spam in an exotic language, and at that point
I know that I do not want to read such a message.

In the rare cases when I do want to read a message in a charset that my
configuration does not support, it is an advantage of the MH system that
the message can be handled separately: save, decode, convert, whatever.


> - What to do when we cannot convert a particular character.  This is a
> little more clear; the general trend is to use a substitution
> character.
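
For reference, the substitution-character approach looks roughly like this.
It is a sketch using Python's built-in "replace" error handler, which
substitutes "?" for each character the target charset cannot represent; the
sample name and the function name to_ascii_with_substitution are
illustrative assumptions.

```python
def to_ascii_with_substitution(text: str) -> str:
    """Convert text to ASCII, replacing each unencodable character by '?'."""
    return text.encode("ascii", errors="replace").decode("ascii")


if __name__ == "__main__":
    # U+0142 (l with stroke) and U+0119 (e with ogonek) are not in ASCII.
    print(to_ascii_with_substitution("Wa\u0142\u0119sa"))
```

The result is readable, but the original characters cannot be recovered
from it.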

This is very frequent and causes a lot of trouble. Take a message written
entirely in English with one foreign family name in its original spelling.
The message is sent in UTF-8 but (suppose) my terminal supports only
ASCII. Conversion would fail.

I can prepare an example, but including it in this message could make the
message difficult to read.

In my personal opinion, a very good choice is conversion into HTML
entities, like &#261; (for ą) or &#322; (for ł). The text remains quite
readable, and the entities are still unique enough to convert back if
needed.
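
A minimal sketch of this idea, assuming numeric character references are
acceptable as the entity form: Python's "xmlcharrefreplace" error handler
writes each unencodable character as &#NNN;, and html.unescape turns the
references back into characters. The function names are hypothetical.

```python
import html


def to_ascii_with_entities(text: str) -> str:
    """Convert text to ASCII, writing unencodable characters as &#NNN;."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


def from_entities(text: str) -> str:
    """Convert character references back into the original characters."""
    return html.unescape(text)


if __name__ == "__main__":
    ascii_form = to_ascii_with_entities("Wa\u0142\u0119sa")
    print(ascii_form)
    print(from_entities(ascii_form) == "Wa\u0142\u0119sa")
```

One caveat with this sketch: if the original message already contains text
that looks like an entity, converting back will decode that text too, so the
round trip is only exact when literal "&" characters are escaped first.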

        max

