nmh-workers

Re: [Nmh-workers] mojibake in UTF-8 encoded quoted-printable messages

2013-10-24 08:14:27
The munged character in your first example looks like it's
supposed to be c3 bc c3, but instead is 83 c2 bc, if I did
that right.  It takes more than one step to get from here to
there, such as losing bits and wrong endian?

Actually, I think Joel was trying to say "für", which has the middle
letter as a lowercase "u" with umlaut.  That would be U+00FC, which has
a UTF-8 encoding of C3 BC.  The characters he sees are Ã, uppercase A
with tilde, U+00C3, and ¼, vulgar fraction one quarter, U+00BC.

C3 is Ã in ISO-8859-1, and BC is ¼ in ISO-8859-1; something is clearly
interpreting the UTF-8 bytes as ISO-8859-1.  But since your locale and
the message are both UTF-8, this doesn't feel like an nmh problem to
me.  If you just saw the undecoded quoted-printable, yeah, that would
probably be us.  But you're seeing the correct bytes; something in your
display path isn't doing the right thing.
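
For anyone who wants to reproduce the effect, here is a minimal Python
sketch.  It assumes the message body contained "f=C3=BCr" in
quoted-printable (my guess at what the relevant word looked like on the
wire, not copied from Joel's message), and the variable names are just
for illustration.

```python
import quopri

# Assumed quoted-printable form of "für" as it would appear in the raw message.
qp_body = b"f=C3=BCr"

# Step 1: undo the quoted-printable transfer encoding to get the raw bytes.
raw = quopri.decodestring(qp_body)

# Step 2a: decode the bytes with the charset the message declares (UTF-8).
correct = raw.decode("utf-8")

# Step 2b: decode the same bytes as ISO-8859-1, which is what a
# misconfigured display path effectively does.
mojibake = raw.decode("iso-8859-1")

# If the misread text is then written out as UTF-8 again, each of the
# two wrong characters becomes a two-byte sequence.
double_encoded = mojibake.encode("utf-8")

print(raw.hex())             # 66c3bc72
print(correct)               # für
print(mojibake)              # fÃ¼r
print(double_encoded.hex())  # 66c383c2bc72
```

The last line may explain the "83 c2 bc" mentioned earlier: those are
the trailing bytes of C3 83 C2 BC, which is what you get when Ã and ¼
are themselves written out as UTF-8.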

--Ken

_______________________________________________
Nmh-workers mailing list
Nmh-workers(_at_)nongnu(_dot_)org
https://lists.nongnu.org/mailman/listinfo/nmh-workers
