nmh-workers

Re: [Nmh-workers] scan or show of UTF-encoded headers?

2005-02-14 13:39:45
On Mon, 14 Feb 2005 19:35:36 +0100, Harald Geyer said:

> Obviously any script which tries to do the above runs into the same
> problem that prevents nmh from doing it itself: The script would need
> to know which charsets the terminal can handle and how to tell it.
> Also changing the terminal might confuse other programs.
>
> I guess it would be much easier and less prone to error to just
> implement transcoding of messages through iconv instead of trying
> to adapt the display on a per message basis.

In general, you *can't* do a good job of using iconv to mash things between
the various iso8859-* charsets.  There *will* be lossage - after all, there
is a *reason* they're up to -15, namely that one isn't sufficient.  So whichever
one you're in, there *will* be lossage for the other 14.
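
To make the lossage concrete, here is a minimal sketch.  It uses Python's
built-in codecs rather than iconv (an assumption made for brevity; the
ISO-8859 mapping tables should agree), and the helper name transcode is
purely illustrative, not anything in nmh.

```python
# Sketch only: Python's codecs stand in for iconv here (assumption).

def transcode(text, target):
    """Re-encode text for a terminal charset.

    Characters the target charset cannot represent become '?'.
    """
    return text.encode(target, errors="replace")

# Cyrillic text as it would arrive in an ISO-8859-5 message.
raw = "Привет".encode("iso8859-5")
text = raw.decode("iso8859-5")

# None of these letters exist in ISO-8859-1, so every one is lost.
cyrillic_on_latin1 = transcode(text, "iso8859-1")

# The reverse direction loses less, but still loses the accented letter.
latin1_on_cyrillic = transcode("café", "iso8859-5")

print(cyrillic_on_latin1)   # every letter replaced by '?'
print(latin1_on_cyrillic)   # 'é' replaced by '?'
```

With errors="strict" (the default) the same calls raise UnicodeEncodeError
instead of substituting, which is arguably the more honest failure mode.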

On the flip side, it's possible to do lossless conversion *from* any 8859-*
into the UTF-8 space.  So a better solution is to teach the code that
currently handles MM_CHARSET that, if the user is in a UTF-8 environment,
it should use iconv to convert 8859-* to UTF-8.
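
A rough sketch of that idea follows, again with Python's codecs standing in
for iconv (an assumption).  The names is_utf8_charset and to_utf8 are
hypothetical helpers for illustration; they are not nmh functions, and how
nmh would actually learn the user's charset is not shown here.

```python
import codecs
import locale

def is_utf8_charset(name):
    """Return True if a charset name is some spelling of UTF-8."""
    return codecs.lookup(name).name == "utf-8"

def to_utf8(raw, source_charset):
    """Convert bytes in the message's charset to UTF-8 bytes."""
    return raw.decode(source_charset).encode("utf-8")

# Sample text for a few 8859-* charsets; each must survive a round trip.
samples = {
    "iso8859-1": "café",
    "iso8859-5": "Привет",
    "iso8859-8": "שלום",
    "iso8859-15": "€",
}

round_trips = {}
for charset, sample in samples.items():
    raw = sample.encode(charset)
    utf8 = to_utf8(raw, charset)
    # Lossless: converting back yields the original bytes.
    round_trips[charset] = utf8.decode("utf-8").encode(charset) == raw

# Only transcode when the user's environment is actually UTF-8.
user_charset = locale.getpreferredencoding(False)
if is_utf8_charset(user_charset):
    print(to_utf8(b"caf\xe9", "iso8859-1").decode("utf-8"))
```

The point of the round trip is that nothing is thrown away on the way into
UTF-8, unlike the 8859-to-8859 case.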

And yes, it's possible that the user is in a utf-8 environment, but doesn't
have actual font glyphs for all the scripts (so, for instance, Hebrew or
Cyrillic characters don't display).  This is actually a non-issue, for 2
reasons:

1) If they don't have the Hebrew glyphs installed, there's nothing you could
have done anyhow.

2) On the other hand, it's fairly safe to assume that if they're in a UTF-8
locale, their software has at least enough smarts to put up an "unknown
character" box at that position.

> I remember the gnus people using big sets of tables to do a mixture
> of transcoding and unifying between character sets which led to
> messages being split into several parts of different character sets,
> when it didn't work correctly. I don't know what their reason was
> for not using iconv.

At least in the MULE-ized versions of Emacs and XEmacs, the basic reason for
the big sets of tables is because they're using their own internal encoding
instead of UTF-mumble (which is also why they couldn't use iconv).  As a
result, the big tables are visible to you.  If it used iconv instead, the big
tables would still be there - just hidden off in /usr/lib/iconv where you
don't usually see them.


_______________________________________________
Nmh-workers mailing list
Nmh-workers(_at_)nongnu(_dot_)org
http://lists.nongnu.org/mailman/listinfo/nmh-workers