If you are going to assume that all
INPUT is unconditionally UTF-8, then yes, that would solve this problem.
But you say above you want to use LANG/LC_CTYPE to convert to UTF-8 on
input; that would have failed given the problem as stated.
Three cases:
1) Original input from the nmh user (composition). What I described works.
2) Deciphering any external unlabeled content. This cannot be done reliably;
as others have said, punt.
3) Output of well-formed content. If we have UTF-8 internally, we can *always*
do that (according to the locale).
And like I've said before: I think this effort would a) require a new
library dependency (for UTF-8 processing, since we couldn't use the
locale functions anymore)
I can import that from plan9port (and in fact already have).
and b) result in no gain in functionality.
And last time we discussed this, people screamed at
the thought of assuming UTF-8 for input; I interpreted that suggestion
as a non-starter.
But we aren't assuming that for input. I am saying UTF-8 is the native internal character set. What
happens at the boundaries becomes everyone else's problem. And after all the
grief in this discussion over the last five+ years, don't you think it should
be someone else's problem?
_______________________________________________
Nmh-workers mailing list
Nmh-workers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/nmh-workers