nmh-workers

Re: mhfixmsg character set conversion

2022-02-14 21:00:20
Steven writes:

we could decode ASCII because 1) we've seen it in the wild, 2) it seems as
harmless as it is pointless to encode ASCII as ASCII, assuming no NULs,
and 3) it's a proper subset of UTF-8 so it doesn't interfere with the
semantics of the "-decodeheaderfieldbodies utf8" switch.

That also makes sense.

Ok, I'll add support for it to mhfixmsg -decodeheaderfieldbodies utf8.
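For anyone following along, here is a rough Python sketch of the idea: decode RFC 2047 encoded words whose charset is UTF-8 or US-ASCII, and leave everything else untouched. This is not mhfixmsg's code. The names decode_field_body, ENCODED_WORD, ASCII_NAMES and UTF8_NAMES are made up for illustration, and the sketch ignores details such as dropping whitespace between adjacent encoded words.

```python
import base64
import binascii
import re

# RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")

# Hypothetical sets of charset labels treated as ASCII or UTF-8.
ASCII_NAMES = {"us-ascii", "ascii"}
UTF8_NAMES = {"utf-8", "utf8"}


def _decode_word(match):
    charset = match.group(1).lower()
    encoding = match.group(2).lower()
    text = match.group(3)
    if charset not in ASCII_NAMES | UTF8_NAMES:
        return match.group(0)  # some other charset: leave it encoded
    try:
        if encoding == "b":
            raw = base64.b64decode(text, validate=True)
        else:
            # header=True makes "_" decode to a space, as RFC 2047 requires
            raw = binascii.a2b_qp(text, header=True)
        if b"\x00" in raw:
            return match.group(0)  # refuse to emit NULs
        return raw.decode("ascii" if charset in ASCII_NAMES else "utf-8")
    except ValueError:
        # covers bad base64 and bytes that are not valid in the charset
        return match.group(0)


def decode_field_body(body):
    """Decode UTF-8 and US-ASCII encoded words; leave the rest as-is."""
    return ENCODED_WORD.sub(_decode_word, body)
```

Because ASCII is a subset of UTF-8, treating the two alike here does not change what ends up in the decoded output.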

This line is too long.  I'm not sure if that is related or if it's a
separate issue:

It's probably related.  I can't prove that, but in general, shorter subject
lines appear to be passed through without encoding.

Regardless, this kind of thing is exactly what I'm trying to eliminate in
my saved messages.  I just realized that my decode_headers program doesn't
detect the second encoded string in the same header, but I'm about to go
fix that. :-)
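To illustrate the kind of bug described (not the actual decode_headers program, which isn't shown here), a search that stops at the first match will miss any later encoded word in the same header. A minimal Python sketch, with find_encoded_words and ENCODED_WORD as hypothetical names:

```python
import re

# RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def find_encoded_words(header):
    """Return (charset, encoding, text) for every encoded word found."""
    # finditer walks the whole string; a single search() or match()
    # would report only the first encoded word.
    return [m.groups() for m in ENCODED_WORD.finditer(header)]
```

For a header like "Subject: =?utf-8?Q?one?= =?utf-8?Q?two?=" this returns two entries rather than one.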

When I look at the message in the lists.nongnu.org archive [1], the
line isn't too long.  But it's not folded, either.  The continuation
is on a separate line with no leading whitespace.  So I would expect
some message parsers, including nmh's, to not detect it.
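Here is a small Python sketch of why that matters.  Under RFC 5322, a continuation line has to start with a space or tab, so a simple parser has no way to attach a line without leading whitespace to the previous field.  This is only an illustration, not nmh's parser; split_header_fields and FIELD_NAME are made-up names, and a stray line that happens to contain a colon before any whitespace would be mistaken for a new field.

```python
import re

# A field name is printable ASCII other than ":", followed by a colon.
FIELD_NAME = re.compile(r"[!-9;-~]+:")


def split_header_fields(lines):
    """Unfold header lines; return (fields, stray_lines)."""
    fields = []
    stray = []
    for line in lines:
        line = line.rstrip("\r\n")
        if FIELD_NAME.match(line):
            fields.append(line)  # start of a new field
        elif line[:1] in (" ", "\t") and fields:
            fields[-1] += line   # folded continuation: keep its whitespace
        else:
            stray.append(line)   # no leading whitespace, no field name
    return fields, stray
```

Given "Subject: =?utf-8?Q?first_part?=" followed by "=?utf-8?Q?not_folded?=" on its own line, the second line ends up in the stray list instead of being joined to the Subject field.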

David

[1] https://lists.nongnu.org/archive/html/nmh-workers/2022-02/msg00122.html
