Wolfgang wrote:
I've constructed a test message that shows the issue. The original
test message is "200", and my attempts to convert it (the text part)
are 201 (no specified, thus charset="iso-8859-1"; file is OK, but I
want UTF8)
Right, per my previous message, -notextcharset is the default
so there will be no character set conversion.
and 202 ( given, the text/plain part is apparently UTF8,
but the whole message remains iso-8859-1).
I'm not sure about that, for these reasons:
1) mhlist output:
202 multipart/alternative 1125
boundary="_000_FC5FFED413B51B4FBC991F3F78A1D46A29956C9CUK5EXCMBPR11s3m_"
1 text/html 601
charset="iso-8859-1"
2 text/plain 139
charset="utf8"
That shows part 2 as utf8. mhlist just echoes the charset in the
Content-Type header of the part, but it does indicate that mhfixmsg
attempted to do what we want.
2) When I cat the 202 file, I see the Umlaut in the text/plain part.
3) If I look at the bytes in both 202 and the stored content
of the second part, which vim renders properly, they're identical:
$ od -ax 202 | grep --after-context=1 --max-count=1 'f C'
0001020 h a r a c t e r s : nl nl f C < r
6168 6172 7463 7265 3a73 0a0a c366 72bc
$ mhstore -file 202 -type text/plain && \
od -ax 202.2.txt | grep --after-context=1 --max-count=1 'f C'
storing message /home/levine/src/nmh/202 part 2 as file 202.2.txt
0000060 r a c t e r s : nl nl f C < r sp M
6172 7463 7265 3a73 0a0a c366 72bc 4d20
In both cases, für is properly represented by the byte sequence:
66 c3 bc 72. (I'm using GNU od and grep on a little endian machine.)
4) If I remove the html content from file 202, vim properly handles it.
(And the BSD file command on Linux and emacs behave analogously.)
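
For anyone reading the od output in point 3: od -x prints 2-byte words,
so on a little-endian machine the byte pairs 66 c3 and bc 72 show up
swapped as c366 and 72bc, and GNU od -a ignores the high-order bit,
which is why c3 and bc appear as C and <. A minimal sketch that dumps
the same four bytes one at a time, in file order; fuer_bytes is a
hypothetical helper name, not part of nmh:

```shell
# Hypothetical helper: print the UTF-8 bytes of "für" as hex, in file order.
# The octal escapes \303\274 are the two bytes c3 bc that encode u-umlaut,
# so the script itself stays pure ASCII.
fuer_bytes() {
    printf 'f\303\274r' | od -An -tx1 | tr -d ' \n'
}

fuer_bytes; echo    # 66c3bc72
```

Using -tx1 instead of -x sidesteps the byte-order question entirely,
since each byte is printed on its own.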
So, I think this is a problem with vim (and other tools).
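
One way to probe this further, sketched under assumptions: ask iconv
whether a file decodes cleanly as UTF-8, once for the whole message and
once for the stored part. is_valid_utf8 is a hypothetical helper name,
and whether the html part of 202 actually contains non-UTF-8 bytes is
not confirmed above:

```shell
# Hypothetical helper: succeeds only if the file is well-formed UTF-8.
# iconv exits non-zero when it meets a byte sequence it cannot decode.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Usage, assuming 202 and 202.2.txt exist as in the commands above:
#   is_valid_utf8 202.2.txt && echo "part 2 is valid UTF-8"
#   is_valid_utf8 202       || echo "whole message is not valid UTF-8"
```

If 202.2.txt passes and 202 fails, the file as a whole mixes encodings,
which would be consistent with a tool that picks one encoding per file
guessing wrong.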
David
_______________________________________________
Nmh-workers mailing list
Nmh-workers(_at_)nongnu(_dot_)org
https://lists.nongnu.org/mailman/listinfo/nmh-workers