nmh-workers

Re: Bug reported regarding Unicode handling in email address

2021-06-11 19:50:39
And then, to get back to my original point ... if we see an 8-bit
character that is not valid in the current character set, what,
exactly, should we do about it?

Complain precisely, e.g. pathname, line number, column, encoding
expected, byte(s) seen.  I'd expect an nmh user to want to understand
how the parts of their system work and where something has gone wrong
and a good error message will help diagnose problems rather than just
passing duff data on so it causes problems further away from the origin.
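
As a rough illustration of what such a message could contain, here is
a Python sketch. It is not nmh code: the function name, the report
format, and the pathname are made up for this example, and it assumes
an ASCII-compatible encoding so that splitting on newline bytes is
safe. Columns are 1-based byte offsets within the line.

```python
def describe_invalid_bytes(pathname, data, encoding="ascii"):
    """Return one report per undecodable byte sequence in `data`.

    Hypothetical helper for illustration only. Each report names the
    pathname, line, byte column, expected encoding, and bytes seen.
    """
    reports = []
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        pos = 0
        while pos < len(line):
            try:
                # The rest of the line decodes cleanly: nothing to report.
                line[pos:].decode(encoding)
                break
            except UnicodeDecodeError as err:
                # err.start and err.end are relative to the slice decoded.
                start = pos + err.start
                end = pos + err.end
                seen = " ".join(f"0x{b:02x}" for b in line[start:end])
                reports.append(
                    f"{pathname}:{lineno}:{start + 1}: "
                    f"expected {encoding}, saw byte(s) {seen}"
                )
                # Resume scanning just past the offending bytes.
                pos = end
    return reports


if __name__ == "__main__":
    message = b"Subject: hi\nTo: J\xc3\xb6rg <j@example.org>\n"
    for report in describe_invalid_bytes("inbox/42", message, "ascii"):
        print(report)
```

Collecting the reports in a list rather than printing them straight
away leaves it to the caller to decide whether, and how loudly, to show
them.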

Well ... I am not sure this feeling is universal:

https://lists.nongnu.org/archive/html/nmh-workers/2014-04/msg00213.html
https://lists.nongnu.org/archive/html/nmh-workers/2015-03/msg00045.html

I'm kind of close to this camp; I don't think we should really emit that
many warnings.

And ... well, reality, again, rears its ugly head.  For one, we don't
really get a notification down at the address parser WHERE we are at.
So emitting a warning isn't really practical.  Also, if we are dealing with
it in the format engine, well ... who knows where it is coming from,
exactly?  As far as I can tell, this would result in potentially a lot
of confusing warnings that would drive most people nuts.

I think, mostly, we kind of get this right ... we try to make sure
that the source characters are converted using iconv() to the native
character set at display time (we probably don't get the case right
where raw UTF-8 appears in headers and you're in the C locale, because
we are probably assuming that's ASCII).  In that case the substitution
character should appear.  I'm open to adding code to emit more warnings,
but not turned on by default, and I honestly think it would be more
trouble than it is worth.
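
The display-time behaviour described above can be sketched roughly as
follows. This is a Python illustration, not nmh code: Python's codecs
stand in for iconv(), the function name is hypothetical, and the exact
substitution character nmh shows may differ from the "?" produced here.

```python
def to_display(raw, source_charset, native_charset):
    """Convert header bytes to text the native character set can show.

    Hypothetical helper for illustration only. Bytes that are invalid
    in source_charset, and characters that native_charset cannot
    represent, are both replaced by a substitution character.
    """
    # Invalid input bytes become U+FFFD instead of raising an error.
    text = raw.decode(source_charset, errors="replace")
    # Characters the native charset lacks (including U+FFFD) become "?".
    shown = text.encode(native_charset, errors="replace")
    return shown.decode(native_charset)


if __name__ == "__main__":
    raw = b"J\xc3\xb6rg"  # "Jörg" encoded as UTF-8

    # UTF-8 source, UTF-8 locale: the name survives intact.
    print(to_display(raw, "utf-8", "utf-8"))

    # UTF-8 source, ASCII-only locale: one substitution for the "ö".
    print(to_display(raw, "utf-8", "ascii"))

    # Raw UTF-8 wrongly assumed to be ASCII, as in the C-locale case:
    # each 8-bit byte is substituted separately.
    print(to_display(raw, "ascii", "ascii"))
```

The last call mirrors the raw-UTF-8-in-the-C-locale case: no warning is
emitted, and the reader just sees substitution characters where the
8-bit bytes were.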

--Ken
