
Re: [Nmh-workers] General question - unsupported charset conversion

2014-02-28 12:49:30
Unfortunately, I have a lot of experience, and a lot of trouble, with
character set conversion.

Well, if you just bit the bullet and switched to UTF-8, you wouldn't have
all of these problems! :-)

Should we return the original bytes?  

It is not the best idea. Some byte sequences are terminal control
sequences, and printing them can leave the terminal in an unusable state.

Seems fine to me.
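
For anyone who wants to see the terminal concern concretely, here is a
minimal sketch in Python (illustrative only, not nmh code). The helper
name make_terminal_safe is hypothetical. It assumes we are willing to show
unconvertible input as escaped hex, so that no raw control byte ever
reaches the terminal.

```python
def make_terminal_safe(raw: bytes) -> str:
    """Render raw bytes so that control bytes cannot reach the terminal.

    Hypothetical helper: printable ASCII, tab and newline pass through;
    every other byte is shown as a literal \\xNN escape.
    """
    out = []
    for b in raw:
        if b in (0x09, 0x0A) or 0x20 <= b < 0x7F:
            out.append(chr(b))
        else:
            out.append(f"\\x{b:02x}")
    return "".join(out)


# ESC [ 2 J would normally clear the screen; here it is printed harmlessly.
sample = b"caf\xe9 \x1b[2Jdone\n"
print(make_terminal_safe(sample), end="")
```

The message body stays recognizable, and an escape sequence such as
ESC [ 2 J is displayed as text instead of being acted on.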

An error? [..]  Some string which says, "We cannot convert
klingon-8842 to us-ascii" or the equivalent?


In practice it means spam in an exotic language, and at that point I know
that I do not want to read such a message.

I can see that, but I'm not sure that's an appropriate choice for all
cases (like, for instance, MIME parameters).
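
As a rough illustration of the "report an error string" option, the
Python sketch below checks whether the charset is known before decoding.
The function name decode_or_report and the wording of the message are
assumptions for the example. Python's codec registry stands in for
whatever set of charsets the real conversion library supports.

```python
import codecs


def decode_or_report(raw: bytes, charset: str) -> str:
    """Decode raw bytes, or return a notice if the charset is unknown.

    Hypothetical helper for illustration only.
    """
    try:
        codecs.lookup(charset)  # raises LookupError for unknown charsets
    except LookupError:
        return f"[cannot convert from unsupported charset {charset!r}]"
    # Known charset: undecodable bytes become U+FFFD instead of failing.
    return raw.decode(charset, errors="replace")


print(decode_or_report(b"hello", "klingon-8842"))
print(decode_or_report(b"hello", "us-ascii"))
```

Whether a notice like this is acceptable everywhere (MIME parameters
being the obvious doubtful case) is a separate question from how to
detect the condition.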

- What to do when we cannot convert a particular character.  This is a
little more clear; the general trend is to use a substitution
character.

This is very frequent and causes a lot of trouble: an entire message in
English with one foreign family name in its original spelling. The message
is sent in UTF-8, but (suppose) my terminal supports only ASCII. Conversion
would fail.

Errr ... really?  In the case I'm thinking, the one foreign family
name would have the offending character output as a '?' (or whatever).
The conversion would go through fine.
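
A small Python sketch of the substitution behavior described here, purely
for illustration: with the standard "replace" error handler, each
character that has no ASCII equivalent becomes '?' and the rest of the
text converts normally. The sample sentence and surname are made up.

```python
# "G\u0105sior" contains U+0105 (a with ogonek), which ASCII cannot represent.
text = "Regards, Jan G\u0105sior"

# errors="replace" substitutes '?' for each unencodable character
# instead of raising UnicodeEncodeError.
ascii_bytes = text.encode("ascii", errors="replace")
print(ascii_bytes.decode("ascii"))
```

Only the one offending character is lost; the conversion as a whole does
not fail.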

In my personal opinion, a very good choice is conversion into HTML
entities, such as the entities for ą or ł. The result remains quite
readable and is still unique enough to convert back if needed.

Um, ouch.  Unless there's a common library that already implements
that behavior, that's not on the table at all.
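
For reference, this is roughly what the entity fallback looks like where a
library does provide it. The sketch uses Python's standard
"xmlcharrefreplace" error handler, which emits numeric character
references, and html.unescape to reverse them. It is an illustration in
Python only and says nothing about what is available to nmh. The sample
surname is made up.

```python
import html

name = "G\u0105sior"  # U+0105 is not representable in ASCII

# Unencodable characters become numeric character references (&#NNN;).
encoded = name.encode("ascii", errors="xmlcharrefreplace")
print(encoded)

# Converting back: html.unescape turns the references into characters again.
restored = html.unescape(encoded.decode("ascii"))
print(restored == name)
```

One caveat: the round trip is only unambiguous if text that already
contains a literal '&' is handled consistently, since html.unescape
cannot tell an original "&#261;" from one produced by the conversion.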

--Ken

