Hi,
I got an email recently, probably spam, whose charset is gb2312.
$ mhlist
msg part type/subtype size description
8032 text/plain 10K
charset="gb2312"
$
1.6's mhshow(1) says
mhshow: unable to convert character set to gb2312, continuing...
But I mhstore(1)'d it and ran iconv(1) on the result, and that was
happy, so I dug.
I took a look at mhshowsbr.c's convert_charset() and I think it's
failing to handle an EINVAL return from iconv(3). inbytes and outbytes
both start at 8Ki. The first call gives E2BIG having processed 5,646 of
in and 8,191 of out. After the bump we attempt to continue, with 2,546
of in remaining and 5,093 of out. That gives EINVAL with in now at
8,191 and out at 11,880, i.e. one input byte left unconsumed. I think a
two-byte rune is straddling the 8Ki boundary. I've annotated the dump
below with commas between runes.
$ mhstore -outfile - 8032 | hd | grep -B 1 2000
storing message 8032 to stdout
00001ff0 d0,c1 a6,0a,0a,b0 cb,a1 a2,b4 d3,bc bc,ca f5,d7 |................|
00002000 df,cf f2,b9 dc,c0 ed,b5 c4,cb c4,b8 f6,ba cb,d0 |................|
$
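To make the EINVAL case above concrete, here's a rough sketch of the kind
of handling I mean. It is not nmh's code: convert_in_chunks() and its
parameters are names I've made up, the 64-byte buffer is an arbitrary
size, and E2BIG is simply treated as failure rather than growing the
buffer. The idea is that when iconv(3) stops with EINVAL and more input
is still to come, the unconsumed tail is kept and prefixed to the next
chunk instead of giving up.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not nmh code.  Convert src[0..srclen) from
 * charset `from` to charset `to`, handing iconv(3) at most `chunk` new
 * bytes per call so a multibyte character may straddle a boundary.
 * Returns the number of bytes written to dst, or (size_t)-1 on failure
 * (including E2BIG, which this sketch does not try to recover from). */
size_t convert_in_chunks(const char *from, const char *to,
                         const char *src, size_t srclen, size_t chunk,
                         char *dst, size_t dstcap)
{
    char buf[64];           /* carried-over tail followed by next chunk */
    size_t held = 0;        /* bytes carried over from the previous call */
    char *out = dst;
    size_t outleft = dstcap;
    iconv_t cd;

    if (chunk == 0 || chunk > sizeof buf / 2)
        return (size_t)-1;
    cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    while (srclen > 0) {
        size_t n = srclen < chunk ? srclen : chunk;
        char *in = buf;
        size_t inleft;

        if (held + n > sizeof buf) {        /* tail unexpectedly large */
            iconv_close(cd);
            return (size_t)-1;
        }
        memcpy(buf + held, src, n);
        src += n;
        srclen -= n;
        inleft = held + n;

        if (iconv(cd, &in, &inleft, &out, &outleft) == (size_t)-1) {
            /* EINVAL means the input ended part-way through a
             * character.  That is only fatal if no more input is
             * coming; otherwise keep the tail for the next round. */
            if (errno != EINVAL || srclen == 0) {
                iconv_close(cd);
                return (size_t)-1;
            }
        }
        memmove(buf, in, inleft);           /* inleft is 0 on success */
        held = inleft;
    }

    /* Flush any shift state for stateful target encodings. */
    if (iconv(cd, NULL, NULL, &out, &outleft) == (size_t)-1) {
        iconv_close(cd);
        return (size_t)-1;
    }
    iconv_close(cd);
    return dstcap - outleft;
}
```

Feeding it the three UTF-8 bytes 61 c3 a9 with a chunk of 2 splits the
two-byte character across calls, much as the d7,df rune is split across
the 8Ki boundary here, and the carried byte lets the second call finish
the job. I'd expect the same to hold with from set to "GB2312" where the
local iconv supports it.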
--
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
_______________________________________________
Nmh-workers mailing list
Nmh-workers(_at_)nongnu(_dot_)org
https://lists.nongnu.org/mailman/listinfo/nmh-workers