> If we were to use $LANG/$LC_CTYPE to convert incoming data to UTF-8
> in the same manner, and process (and store!) everything internally
> as UTF-8, all of this nonsense would go away. Similarly, we could
> convert from UTF-8 -> $LANG/$LC_CTYPE on the way out. And we could ship
> everything off-site with one of only two character sets: ascii, or utf8.
I ... do not think this would solve this particular problem. The issue
here seems to be a) nmh programs were given 8-bit characters, and b)
the locale was set to US-ASCII. If you are going to assume that all
INPUT is unconditionally UTF-8, then yes, that would solve this problem.
But you say above you want to use LANG/LC_CTYPE to convert to UTF-8 on
input; that would have failed given the problem as stated.
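To make that concrete, here is a rough Python sketch of the two strategies.
It is not nmh code: the helper name to_internal_utf8 and the sample bytes are
made up for illustration, and Python's codecs stand in for whatever
conversion routine would really be used.

```python
# Hypothetical sketch; nothing here is taken from nmh.

def to_internal_utf8(raw: bytes, input_charset: str) -> bytes:
    """Decode input using the given charset, then re-encode it as UTF-8."""
    return raw.decode(input_charset).encode("utf-8")

# 8-bit input: "cafe" with an accented e, encoded as UTF-8.
raw = b"caf\xc3\xa9"

# Strategy 1: take the charset from the locale, which is US-ASCII here.
# The bytes above 0x7f are not valid ASCII, so the conversion fails.
try:
    to_internal_utf8(raw, "ascii")
    locale_result = "ok"
except UnicodeDecodeError:
    locale_result = "failed"

# Strategy 2: assume all input is unconditionally UTF-8.
# This succeeds for this particular input.
assumed_result = to_internal_utf8(raw, "utf-8")
```

With a US-ASCII locale, the locale-driven conversion hits the same wall the
original report did; only the unconditional UTF-8 assumption gets through.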
And like I've said before: I think this effort would a) require a new
library dependency (for UTF-8 processing, since we couldn't use the
locale functions anymore) and b) result in no gain in functionality.
Like, I'm squinting really hard here, and I can't see how it would have
changed anything. And last time we discussed this, people screamed at
the thought of assuming UTF-8 for input; I interpreted that suggestion
as a non-starter.
--Ken
_______________________________________________
Nmh-workers mailing list
Nmh-workers(_at_)nongnu(_dot_)org
https://lists.nongnu.org/mailman/listinfo/nmh-workers