On July 19, 2006 at 18:44, Andrew Shirrayev wrote:
one BIG letter
$ ls -l
-rw-r--r-- 1 andrews andrews 29197818 Jul 19 18:42 mbox.200410.one
$ wc mbox.200410.one
1199719 2958850 29197818 mbox.200410.one
1st way:
<TextEncode>
utf-8; MHonArc::UTF8::to_utf8; MHonArc/UTF8.pm
</TextEncode>
Did you also set:
<!-- With data translated to UTF-8, it simplifies CHARSETCONVERTERS -->
<CharsetConverters override>
default; mhonarc::htmlize
</CharsetConverters>
<!-- Need to also register UTF-8-aware text clipping function -->
<TextClipFunc>
MHonArc::UTF8::clip; MHonArc/UTF8.pm
</TextClipFunc>
If you use TEXTENCODE, the CHARSETCONVERTERS setting above lets you
avoid dealing with MHonArc::CharEnt. Without it, MHonArc will
convert all non-ASCII UTF-8 sequences into entity references.
In general, if you use TEXTENCODE, you should also redefine
CHARSETCONVERTERS appropriately.
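
To make the difference concrete, here is a rough Python sketch of the two
output styles. It is only an illustration of the effect, not MHonArc's
actual code: the function names to_entity_refs and htmlize_only are made
up for this example, and the real converters are Perl routines.

```python
import html


def to_entity_refs(text: str) -> str:
    """Roughly what you get without the override (illustrative only):
    HTML specials are escaped and every non-ASCII character becomes a
    numeric character reference."""
    escaped = html.escape(text, quote=False)
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


def htmlize_only(text: str) -> str:
    """Roughly what a plain htmlize-style converter does (illustrative
    only): escape &, < and > and leave the UTF-8 text untouched."""
    return html.escape(text, quote=False)


if __name__ == "__main__":
    sample = "Привет <world>"
    print(to_entity_refs(sample))  # ASCII-only output with numeric references
    print(htmlize_only(sample))    # Cyrillic kept as-is, only <, >, & escaped
```

Both forms display the same in a browser, but the second keeps the
archive pages readable as UTF-8 text and noticeably smaller for
messages that are mostly non-ASCII.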
--ewh