
Re: [Nmh-workers] base64 ... just looking for advice

2016-01-23 16:18:12
Dear David,

In message 
<a18f4c3e-ca30-4403-a426-1ebd46da9e49(_at_)HUBCAS2(_dot_)seas(_dot_)wustl(_dot_)edu>
 you wrote:

and 202 (  given, the text/plain part is apparently UTF8,
but the whole message remains iso-8859-1).

I'm not sure about that, for these reasons:

Well, at least the HTML part is still marked as charset="iso-8859-1"
and Content-Transfer-Encoding: 8bit, and indeed it contains ISO-8859-1
characters.

That shows part 2 as utf8.  mhlist just echoes the charset in the
Content-Type header of the part, but it does indicate that mhfixmsg
attempted to do what we want.

Agreed - but this leaves us with a problem; as we now have a single
file with different parts in different character sets.

In both cases, f=FCr is properly represented by the byte sequence:
66 c3 bc 72.  (I'm using GNU od and grep on a little endian machine.)

This is true for the text/plain part.

But the HTML part was also decoded, and its "f=FCr" was translated
into the single byte 0xFC, which is the ISO-8859-1 encoding of the
same character that UTF-8 encodes as 0xC3 0xBC.
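
To illustrate the two byte sequences, here is a minimal Python sketch.
It is not anything nmh does internally; the variable names are made up
for this example, and it only assumes the standard quopri module.

```python
import quopri

word = "für"

# The same text in the two character sets under discussion.
utf8_bytes = word.encode("utf-8")         # bytes 66 c3 bc 72
latin1_bytes = word.encode("iso-8859-1")  # bytes 66 fc 72

# Decoding the quoted-printable text "f=FCr" yields the raw 0xFC byte,
# i.e. the ISO-8859-1 form, not the UTF-8 form.
qp_decoded = quopri.decodestring(b"f=FCr")

print(utf8_bytes.hex(" "))
print(latin1_bytes.hex(" "))
print(qp_decoded == latin1_bytes)
```

So merely undoing the quoted-printable encoding leaves the HTML part in
ISO-8859-1, while the text/plain part ends up in UTF-8.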

4) If I remove the html content from file 202, vim properly handles it.
(And the BSD file command on Linux and emacs behave analogously.)

Same here.

So, I think this is a problem with vim (and other tools).

What should vim do if we give it something that is supposed to be a
text file, but then it contains a mix of UTF-8 and ISO-8859-1
characters?  I think this is bound to fail.
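
A small Python sketch of why no single choice works for such a file.
The sample content is invented for illustration; it just concatenates
a UTF-8 part and an ISO-8859-1 part the way the decoded message does.

```python
# One part in UTF-8, one in ISO-8859-1, stored in the same "text file".
plain_part = "für".encode("utf-8")
html_part = "<p>für</p>".encode("iso-8859-1")
mixed = plain_part + b"\n" + html_part

# Reading the whole file as UTF-8 fails on the lone 0xFC byte.
try:
    mixed.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Reading it as ISO-8859-1 never fails, but garbles the UTF-8 part.
as_latin1 = mixed.decode("iso-8859-1")

print(utf8_ok)
print(as_latin1)
```

Whichever encoding the editor picks, one of the two parts is either
rejected or displayed wrongly.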

Either mhfixmsg should not touch the HTML part at all, i.e. leave it
quoted-printable encoded as it was in the original message, or, if it
decodes it, it should use the same character set for all parts.

Using two different character sets for parts in a single message is
somehow bound to break...
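
As a rough sketch of the second option (same character set for all
parts), assuming each decoded part is available as a pair of declared
charset and raw bytes. The function unify_charsets and the sample data
are hypothetical, not part of mhfixmsg, and a real tool would also have
to update the Content-Type headers and any charset declared inside the
HTML itself.

```python
def unify_charsets(parts, target="utf-8"):
    """Re-encode every (charset, raw_bytes) pair into the target charset."""
    return [(target, raw.decode(charset).encode(target))
            for charset, raw in parts]

# Hypothetical decoded parts of one message.
parts = [
    ("utf-8", b"f\xc3\xbcr"),            # text/plain part
    ("iso-8859-1", b"<p>f\xfcr</p>"),    # text/html part
]

unified = unify_charsets(parts)
combined = b"\n".join(raw for _, raw in unified)

# The combined file now decodes cleanly with a single charset.
text = combined.decode("utf-8")
print(text)
```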

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd(_at_)denx(_dot_)de
The horizon of many people is a circle with radius zero --
and they call that their point of view.

