
Re: [Nmh-workers] error: mhshow: unable to convert character set

2014-06-21 00:18:46
> Complete error: mhshow: unable to convert character set of part 1 to
> windows-1252, continuing...

Okay, so ... first off, that error message is lousy.  Yes, it's backwards;
what it means is that converting FROM windows-1252 failed.  Yes, we know
about that.

It's failing because one of the bytes in that message is 0x8F (=8F in the
quoted-printable body).  That byte is unassigned in windows-1252, so there
is nothing to convert it to.  The real mess is this:

You wrote 17 =D0=B8=D1=8E=D0=BD=D1=8F 2014 =D0=B3.
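
The failure can be reproduced outside nmh.  The following is only a Python
sketch, not what mhshow does internally; it assumes Python's "cp1252" codec
as a stand-in for windows-1252, and the helper name first_undecodable is
made up for illustration.  It undoes the quoted-printable encoding and
reports the first byte the charset cannot decode:

```python
import quopri

# The quoted-printable line from the message, as raw ASCII bytes.
encoded = b"You wrote 17 =D0=B8=D1=8E=D0=BD=D1=8F 2014 =D0=B3."
raw = quopri.decodestring(encoded)

def first_undecodable(data, charset):
    """Return (offset, byte) of the first byte charset cannot decode, or None."""
    try:
        data.decode(charset)
    except UnicodeDecodeError as exc:
        return exc.start, data[exc.start]
    return None

# Labeled as windows-1252, the line hits an unassigned byte.
print(first_undecodable(raw, "cp1252"))
# Treated as UTF-8, every byte decodes.
print(first_undecodable(raw, "utf-8"))
```

The bytes before 0x8F all happen to be assigned in windows-1252, which is
why the conversion gets partway through the line before giving up.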

So, what happened?  Well, I believe this is UTF-8.  Specifically, this
looks like:

D0 B8: U+0438 (и)
D1 8E: U+044E (ю)
D0 BD: U+043D (н)
D1 8F: U+044F (я)
D0 B3: U+0433 (г)

So the correct line should have been:

You wrote 17 июня 2014 г.

Does that look right?
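
The byte-to-character mapping above can be checked mechanically.  This is a
small Python sketch (the function name utf8_breakdown is hypothetical) that
decodes the quoted-printable line as UTF-8 and lists each non-ASCII
character with its UTF-8 bytes and code point:

```python
import quopri

encoded = b"You wrote 17 =D0=B8=D1=8E=D0=BD=D1=8F 2014 =D0=B3."

# Undo quoted-printable, then interpret the raw bytes as UTF-8.
text = quopri.decodestring(encoded).decode("utf-8")

def utf8_breakdown(s):
    """Return (hex bytes, code point, character) for each non-ASCII character."""
    rows = []
    for ch in s:
        if ord(ch) > 0x7F:
            hex_bytes = " ".join(f"{b:02X}" for b in ch.encode("utf-8"))
            rows.append((hex_bytes, f"U+{ord(ch):04X}", ch))
    return rows

print(text)
for hex_bytes, code_point, ch in utf8_breakdown(text):
    print(f"{hex_bytes}: {code_point} ({ch})")
```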

Somewhere along the way the sender's MUA (Mozilla, it looks like) labeled
the UTF-8 bytes as windows-1252, leading to the current mess (it looks like
none of those characters exist in windows-1252).  So, that was pretty bogus.

Also pretty bogus is nmh simply aborting.  I think this is another data
point that says every character set conversion needs to put in substitution
characters for anything it can't convert; aborting is simply not reasonable.
If people disagree, let's hear the reasons.  (The character set conversion
code in nmh is pretty scattershot; it all needs to be incorporated into a
library routine so the handling is uniform.)  Thoughts from others?
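
To make the substitution idea concrete, here is a rough sketch in Python
rather than nmh's C, so it only illustrates the behavior and is not a
proposed patch.  The function name decode_with_substitution is hypothetical,
and Python's "cp1252" codec again stands in for windows-1252.  Bytes that
cannot be converted become U+FFFD (the Unicode replacement character)
instead of stopping the conversion:

```python
def decode_with_substitution(data: bytes, charset: str) -> str:
    """Decode data, replacing undecodable bytes with U+FFFD instead of failing."""
    return data.decode(charset, errors="replace")

# The eight bytes of the month name from the problem message.
raw = bytes.fromhex("D0B8D18ED0BDD18F")

# Strict decoding stops at the unassigned byte 0x8F ...
try:
    raw.decode("cp1252")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# ... while the substituting version always returns something displayable.
shown = decode_with_substitution(raw, "cp1252")
print(strict_ok, shown)
```

The result is still mojibake, since the charset label was wrong to begin
with, but the reader gets the rest of the message instead of an error.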

--Ken
