Re: [Nmh-workers] mojibake in UTF-8 encoded quoted-printable messages

2013-10-24 09:30:49
Thus spake Ralph Corderoy:
> Hi Joel,
>
> > I think I found something related to the cause: I have this line in my
> > mhn.defaults:
> >
> >   mhshow-show-text/html: %p/usr/bin/lynx -force_html -dump '%f' | less
>
> I have a similar single line.
>
>     mhshow-show-text/html: lynx -dump -width `tput cols` '%F' |
>         expand | sed 's/  *$//' | cat -s | less
>
> I find it gives the behaviour you describe if the HTML file contains a
> charset declaration that's incorrect for the content of the HTML, e.g.
> ISO-8859-1 when mhstore shows the glyph is UTF-8 encoded.  Lynx obeys
> the charset in the file.  Might be worth searching for `charset' in the
> HTML you've got.  I receive many such broken HTML emails.

The MIME part has this:

  Content-Type: text/html; charset=UTF-8

The HTML in the MIME part has *no* encoding declaration of its own,
i.e. nothing like:

  <?xml version="1.0" encoding="UTF-8"?>

Does one of the MIME RFCs require an encoding declaration when the
charset is already given in the MIME header? If so, then virtually
every email in my inbox which has a text/html part is wrong:

[uckelman@charybdis inbox]$ grep -l 'text/html' [0-9]* | wc -l
74
[uckelman@charybdis inbox]$ grep -l 'encoding="' [0-9]* | wc -l
1
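
The check above can be narrowed a little. The sketch below looks for a
charset declared inside a single HTML file, either as an XML
declaration or in a <meta> tag. It assumes the part has already been
decoded to a file (for instance one written out by mhstore), since in
a raw quoted-printable message "=" appears as "=3D" and a grep for
encoding=" would miss it. The name declared_charset is a made-up
helper, not part of nmh, and the pattern is a rough heuristic rather
than a real HTML parser.

```shell
#!/bin/sh
# declared_charset FILE
# Hypothetical helper (not part of nmh): print the first charset or
# encoding declared inside FILE itself, as opposed to the charset given
# in the MIME Content-Type header of the enclosing part.
# Prints nothing if no declaration is found.
declared_charset() {
    # Matches, case-insensitively:
    #   <?xml version="1.0" encoding="UTF-8"?>
    #   <meta charset="utf-8">
    #   <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
    # Single-quoted attribute values are not handled.
    # grep -o is not in POSIX but is supported by GNU and BSD grep.
    grep -Eio '(charset|encoding)="?[A-Za-z0-9_-]+' "$1" |
        sed -E 's/^[^=]*="?//' |
        head -n 1
}
```

Run as `declared_charset part.html`, where part.html is a placeholder
for the decoded file. Empty output means the file carries no
declaration of its own; any other output can be compared against the
charset in the part's Content-Type header.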

-- 
J.

