nmh-workers

Re: Bug reported regarding Unicode handling in email address

2021-06-10 05:39:31
Hi Ken,

> > > The address parser code is used for a lot of things.  The specific
> > > bug report was about a draft message that contained Cyrillic
> > > characters.  We know what that character set was in THAT case,
> > > because it's a draft message and we can derive the locale from the
> > > environment or the nmh locale setting.  But if we are processing
> > > an email message then we don't easily know the character set.  In
> > > theory it should either be us-ascii or utf-8, but reality
> > > sometimes intrudes and it could be anything.
> >
> > If it's an email then won't it be ASCII?
>
> Boy, you're out of the loop!  Check out RFC 6532.

Oh, SMTPUTF8, yes I've seen that around.  :-)

But my point stands.  nmh should know from the context where the email
address appears what encoding the bytes use when trying to parse it.

- mail/inbox/42 was written by us; it's our choice.
- mail/draft is the process's locale.
- /var/spool/$LOGNAME is in UTF-8.
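
The three cases above could be sketched roughly as follows.  This is
Python rather than nmh's C, purely for brevity, and it is not nmh code:
`Source`, `STORED_ENCODING`, `address_encoding` and `decode_address`
are hypothetical names made up for illustration, and treating the
stored-message encoding as UTF-8 is an assumption.

```python
import locale
from enum import Enum, auto


class Source(Enum):
    """Hypothetical labels for where the address bytes came from."""
    FOLDER_MESSAGE = auto()  # e.g. mail/inbox/42, written by nmh itself
    DRAFT = auto()           # e.g. mail/draft, typed by the user
    SPOOL = auto()           # the system mail spool


# Assumption: whatever encoding nmh chooses for messages it writes itself.
STORED_ENCODING = "utf-8"


def address_encoding(source: Source) -> str:
    """Return the encoding to assume for address bytes from `source`."""
    if source is Source.FOLDER_MESSAGE:
        return STORED_ENCODING
    if source is Source.DRAFT:
        # The draft was typed under the process's locale.
        return locale.getpreferredencoding(False)
    if source is Source.SPOOL:
        return "utf-8"
    raise ValueError(f"unknown source: {source!r}")


def decode_address(raw: bytes, source: Source) -> str:
    """Decode address bytes using the encoding implied by their source."""
    # errors="replace" keeps going when reality intrudes and the bytes
    # are not valid in the expected encoding.
    return raw.decode(address_encoding(source), errors="replace")
```

The point of the design is that the parser never guesses: the caller
says where the bytes came from and the encoding follows from that.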

-- 
Cheers, Ralph.
