Note: those who use TEXTENCODE should not encounter this
performance problem, since char->html conversion is vastly
simplified and should avoid the routine(s) in question here.
This assertion is based on my understanding of the code rather
than on any actual testing. Those using TEXTENCODE are free to
run tests and report their results back to the dev list.
Test case: an mbox containing one BIG message (about 29 MB):
$ ls -l
-rw-r--r-- 1 andrews andrews 29197818 Jul 19 18:42 mbox.200410.one
$ wc mbox.200410.one
1199719 2958850 29197818 mbox.200410.one
1st way:
<TextEncode>
utf-8; MHonArc::UTF8::to_utf8; MHonArc/UTF8.pm
</TextEncode>
IN at /usr/share/mhonarc/MHonArc/CharEnt.pm line 178, <STDIN> line 22.
OUT at /usr/share/mhonarc/MHonArc/CharEnt.pm line 199, <STDIN> line 22.
IN at /usr/share/mhonarc/MHonArc/CharEnt.pm line 178, <STDIN> line 1199719.
OUT at /usr/share/mhonarc/MHonArc/CharEnt.pm line 199, <STDIN> line 1199719.
IN at /usr/share/mhonarc/MHonArc/CharEnt.pm line 178, <STDIN> line 1199719.
Out of memory!
Maximum memory usage (1.2Gb) at
/usr/share/mhonarc/MHonArc/CharEnt.pm line 178
$$data_r =~ s{
    $utf8_re_lax
}{
    $char = unpack('U0U*',$1);
    if ($malformed ||
        (($char & 0xFFFE) == 0xFFFE) ||
        (($char & 0xFFFF) == 0xFFFF) ||
        ($char >= 0xFDD0 && $char <= 0xFDEF) ||
        ($char >= 0xD800 && $char <= 0xDFFF)
    ) {
        # Some of the if() checks may be handled by perl directly,
        # but such checks can be disabled when perl is built.
        $malformed = 0;
        '&#xFFFD;';
    } else {
        ($char <= 0x7F)
            ? $HTMLSpecials{$1} || sprintf('%c',$char)
            : sprintf('&#x%X;',$char);
    }
}gxeso;
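For anyone who wants to poke at this kind of conversion outside MHonArc, here is a minimal standalone sketch of a per-character substitution in the same spirit as the code above. It is not the CharEnt.pm implementation: %html_specials and utf8_to_entities() are hypothetical names of mine, the input is decoded with the core utf8::decode() instead of being matched with $utf8_re_lax, and the malformed/noncharacter checks are left out. It only illustrates what the s{...}{...}gxe loop produces; it does nothing about the memory usage.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for MHonArc's %HTMLSpecials (the real table may differ).
my %html_specials = (
    '<' => '&lt;',
    '>' => '&gt;',
    '&' => '&amp;',
    '"' => '&quot;',
);

# Convert raw UTF-8 bytes to ASCII-only HTML, one character per substitution.
# Returns undef if the input is not valid UTF-8.
sub utf8_to_entities {
    my ($bytes) = @_;
    my $text = $bytes;
    utf8::decode($text) or return undef;    # in-place decode, core function
    $text =~ s{(.)}{
        my $cp = ord($1);
        ($cp <= 0x7F)
            ? ($html_specials{$1} || $1)    # ASCII: escape specials only
            : sprintf('&#x%X;', $cp);       # non-ASCII: numeric entity
    }gsex;
    return $text;
}

print utf8_to_entities("a<b \xC3\xA9\n");   # prints: a&lt;b &#xE9;
```

Feeding this progressively larger inputs might be one way to see how the cost of a /e substitution over a single huge scalar grows, but I have not measured that.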
------------------------------------------------
2nd way:
<TextEncode>
koi8-r; MHonArc::Encode::from_to; MHonArc/Encode.pm
</TextEncode>
IN at /usr/share/mhonarc/MHonArc/Char.pm line 88, <STDIN> line 1199719.
Out of memory!
Maximum memory usage (1.2Gb) at
/usr/share/mhonarc/MHonArc/Char.pm line 89
$$data_r =~ s{
    ([\x00-\xFF])
}{
    foreach $map (@maps) {
        $char = $map->{$1};
        last if defined($char);
    }
    unless (defined($char)) {
        $char = (ord($1) <= 0x7F) ? $1 : '?';
    }
    $char;
}gxe;
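The same per-byte lookup can be tried in isolation with a small script. This is only a sketch under assumptions: the contents of @maps below are made up for illustration (the real maps are whatever MHonArc loads for the charset), and map_bytes() is a hypothetical wrapper of mine, not a function in Char.pm.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical maps, tried in order; each maps one byte to replacement text.
my @maps = (
    { '<' => '&lt;', '&' => '&amp;' },
    { "\xE9" => '&eacute;' },
);

# Replace every byte via the first map that defines it; unmapped ASCII
# passes through unchanged and unmapped 8-bit bytes become '?'.
sub map_bytes {
    my ($data) = @_;
    $data =~ s{([\x00-\xFF])}{
        my $char;
        foreach my $map (@maps) {
            $char = $map->{$1};
            last if defined($char);
        }
        $char = (ord($1) <= 0x7F) ? $1 : '?' unless defined($char);
        $char;
    }gxe;
    return $data;
}

print map_bytes("a<\xE9\xFC\n");    # prints: a&lt;&eacute;?
```

Every byte of the input goes through one evaluation of the replacement block, so on the 29 MB message above that block would run roughly 29 million times.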