This mail is an automated notification from the bugs tracker
of the project: MHonArc.
/**************************************************************************/
[bugs #11187] Full Item Snapshot:
URL: <http://savannah.nongnu.org/bugs/?func=detailitem&item_id=11187>
Project: MHonArc
Submitted by: Egmont Koblinger
On: Thu 12/02/04 at 00:04
Category: Character Sets
Severity: 5 - Average
Item Group: Incorrect Behavior
Resolution: None
Privacy: Public
Assigned to: None
Status: Open
Platform Version: Linux
Perl Version: 5.8.5
Component Version: 2.6.10
Fixed Release:
Summary: incorrectly parsing UTF-8 encoded messages
Original Submission: I run mhonarc without any configuration file, using just
the command "mhonarc -outdir outdir indir", where "indir"
contains only one file with a single message encoded in
UTF-8. (Both the subject and the body contain UTF-8 encoded
accented letters; the subject uses quoted-printable, and the
body's transfer encoding is 8-bit.)
The output HTML files are quite strange. For each UTF-8
byte sequence, only the first byte is taken into account
and it is converted to an HTML escape. For example, the
Euro sign (U+20AC, UTF-8: E2 82 AC) will appear in the HTML
output as "&#E2;" and then 82 and AC are skipped; processing
goes on with the next Unicode character.
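
For reference, the byte values quoted above can be checked with a short
Python sketch (Python is used only for illustration; it is not MHonArc's
code). The hexadecimal form of the numeric character reference is an
assumption here, and the variable names are hypothetical.

```python
# Illustration only: the Euro sign and its UTF-8 encoding.
euro = "\u20ac"                      # U+20AC EURO SIGN
utf8_bytes = euro.encode("utf-8")    # three bytes in UTF-8

# Space-separated hex dump of the encoded bytes.
hex_bytes = " ".join(f"{b:02X}" for b in utf8_bytes)

# A correct converter would emit one escape for the whole code point.
# Assumption: hexadecimal numeric character reference notation.
expected_escape = f"&#x{ord(euro):04X};"

# The symptom in the report: only the lead byte of the sequence is used.
lead_byte_only = f"{utf8_bytes[0]:02X}"

print(hex_bytes)
print(expected_escape)
print(lead_byte_only)
```

The lead byte alone (E2) does not identify the character; the two
continuation bytes carry the remaining bits of the code point.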
In MHonArc/CharEnt.pm line 153 there's a switch to check
whether perl is new enough to support UTF-8. If it isn't,
then manual processing of UTF-8 characters takes place.
Forcing the "non-UTF-8-aware perl" branch of the "if"
statement (that is, changing the "if ($] >= 5.006)" to
"if (0)") repairs the problem; in this case the output will
be the expected "&#x20AC;".
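
To make "manual processing" of UTF-8 concrete, here is a minimal Python
sketch of decoding multi-byte sequences by hand and emitting one HTML
escape per code point. It is not the code from CharEnt.pm; the function
name utf8_to_html_escapes and the choice of hexadecimal escapes are
assumptions made for this example.

```python
def utf8_to_html_escapes(data: bytes) -> str:
    """Hypothetical helper: decode UTF-8 by hand, escaping non-ASCII.

    ASCII bytes pass through unchanged; every multi-byte sequence
    becomes a single hexadecimal numeric character reference.
    Overlong forms and surrogates are not rejected in this sketch.
    """
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:                      # 0xxxxxxx: plain ASCII
            out.append(chr(lead))
            i += 1
            continue
        if lead >> 5 == 0b110:               # 110xxxxx: 2-byte sequence
            need, cp = 1, lead & 0x1F
        elif lead >> 4 == 0b1110:            # 1110xxxx: 3-byte sequence
            need, cp = 2, lead & 0x0F
        elif lead >> 3 == 0b11110:           # 11110xxx: 4-byte sequence
            need, cp = 3, lead & 0x07
        else:
            raise ValueError(f"invalid lead byte at offset {i}")
        tail = data[i + 1 : i + 1 + need]
        # Every continuation byte must look like 10xxxxxx.
        if len(tail) != need or any(b >> 6 != 0b10 for b in tail):
            raise ValueError(f"bad continuation after offset {i}")
        for b in tail:
            cp = (cp << 6) | (b & 0x3F)      # append 6 payload bits
        out.append(f"&#x{cp:04X};")
        i += 1 + need                        # consume the whole sequence
    return "".join(out)


print(utf8_to_html_escapes(b"5 \xe2\x82\xac"))
```

The point of the sketch is that the continuation bytes (82 and AC for
the Euro sign) contribute to the same escape as the lead byte instead of
being skipped.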
I don't think it matters, but I have LANG=hu_HU (latin2
locale) and no other LC_* variables set. However, UTF-8
locales are also available on my system.