mhonarc-dev

[bug #14747] major (10X) memory savings possible in some situations

2006-07-19 08:47:15

Note: those who use TEXTENCODE should not encounter this
performance problem, since char->html conversion is vastly
simplified and should avoid the routine(s) in question here.
This assertion is based on my understanding of the code rather
than on any actual testing.  Those using TEXTENCODE are welcome
to run tests and report their results back to the dev list.

Test case: one BIG message in a single mbox file
$ ls -l
-rw-r--r--  1 andrews andrews 29197818 Jul 19 18:42 mbox.200410.one
$ wc mbox.200410.one
 1199719  2958850 29197818 mbox.200410.one

1st way:

<TextEncode>
utf-8; MHonArc::UTF8::to_utf8; MHonArc/UTF8.pm
</TextEncode>

IN  at /usr/share/mhonarc/MHonArc/CharEnt.pm line 178, <STDIN> line 22.
OUT at /usr/share/mhonarc/MHonArc/CharEnt.pm line 199, <STDIN> line 22.
IN  at /usr/share/mhonarc/MHonArc/CharEnt.pm line 178, <STDIN> line 1199719.
OUT at /usr/share/mhonarc/MHonArc/CharEnt.pm line 199, <STDIN> line 1199719.
IN  at /usr/share/mhonarc/MHonArc/CharEnt.pm line 178, <STDIN> line 1199719.
Out of memory!

Maximum memory usage (1.2 GB) at
/usr/share/mhonarc/MHonArc/CharEnt.pm line 178

        $$data_r =~ s{
            $utf8_re_lax
        }{
            $char = unpack('U0U*',$1);
            if ($malformed ||
                  (($char & 0xFFFE) == 0xFFFE) ||
                  (($char & 0xFFFF) == 0xFFFF) ||
                  ($char >= 0xFDD0 && $char <= 0xFDEF) ||
                  ($char >= 0xD800 && $char <= 0xDFFF)
               ) {
                # Some of the if() checks may be handled by perl directly,
                # but such checks can be disabled when perl is built.
                $malformed = 0;
                '&#xFFFD;';
            } else {
                ($char <= 0x7F)
                        ? $HTMLSpecials{$1} || sprintf('%c',$char)
                        : sprintf('&#x%X;',$char);
            }
        }gxeso;
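
To make the per-character decision in the replacement block easier to read, here is a minimal standalone sketch of it as a function on code points. It is not the MHonArc source: `entity_for` is a hypothetical name, the contents of `%HTMLSpecials` are assumed, and the `$malformed` flag is left out. The two mask tests above are folded into one, since any value ending in 0xFFFF also passes the 0xFFFE test.

```perl
use strict;
use warnings;

# Assumed contents; the real %HTMLSpecials is defined inside MHonArc.
my %HTMLSpecials = (
    '<' => '&lt;', '>' => '&gt;', '&' => '&amp;', '"' => '&quot;',
);

# Hypothetical helper: map one code point to the text that would be emitted.
sub entity_for {
    my ($cp) = @_;

    # Noncharacters (U+xxFFFE/U+xxFFFF, U+FDD0..U+FDEF) and surrogates
    # are replaced by U+FFFD, as in the excerpt above.
    return '&#xFFFD;'
        if (($cp & 0xFFFE) == 0xFFFE)
        || ($cp >= 0xFDD0 && $cp <= 0xFDEF)
        || ($cp >= 0xD800 && $cp <= 0xDFFF);

    # ASCII: escape HTML specials, pass everything else through.
    if ($cp <= 0x7F) {
        my $ch = chr($cp);
        return exists $HTMLSpecials{$ch} ? $HTMLSpecials{$ch} : $ch;
    }

    # Everything else becomes a hexadecimal character reference.
    return sprintf('&#x%X;', $cp);
}

print entity_for($_), "\n" for (0x3C, 0x41, 0x20AC, 0xFFFE, 0xD800);
```

The sketch covers only the branch logic; it does not reproduce the byte-level matching done by `$utf8_re_lax`, and it says nothing about where the memory goes.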

------------------------------------------------
2nd way:

<TextEncode>
koi8-r; MHonArc::Encode::from_to; MHonArc/Encode.pm
</TextEncode>

IN  at /usr/share/mhonarc/MHonArc/Char.pm line 88, <STDIN> line 1199719.
Out of memory!

Maximum memory usage (1.2 GB) at
/usr/share/mhonarc/MHonArc/Char.pm line 89
    $$data_r =~ s{
        ([\x00-\xFF])
    }{
        foreach $map (@maps) {
            $char = $map->{$1};
            last  if defined($char);
        }
        unless (defined($char)) {
            $char = (ord($1) <= 0x7F) ? $1 : '?';
        }
        $char;
    }gxe;
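
For anyone who wants to exercise this substitution outside MHonArc, the following is a minimal sketch with the same shape: one evaluated replacement per input byte, looked up in a list of maps. The map contents and the name `convert_bytes` are made up for illustration; the real tables come from MHonArc's character map files.

```perl
use strict;
use warnings;

# Hypothetical maps for illustration only: byte => replacement text.
my @maps = (
    { '<' => '&lt;', '>' => '&gt;', '&' => '&amp;' },
    { "\xE9" => '&#xE9;' },
);

# Same structure as the excerpt above: every byte triggers the /e block.
sub convert_bytes {
    my ($data_r) = @_;
    $$data_r =~ s{
        ([\x00-\xFF])
    }{
        my $char;
        foreach my $map (@maps) {
            $char = $map->{$1};
            last if defined($char);
        }
        unless (defined($char)) {
            # Unmapped: keep ASCII as-is, replace 8-bit bytes with '?'.
            $char = (ord($1) <= 0x7F) ? $1 : '?';
        }
        $char;
    }gxe;
    return;
}

my $text = "a<b & caf\xE9 \xFF";
convert_bytes(\$text);
print "$text\n";    # prints: a&lt;b &amp; caf&#xE9; ?
```

Feeding this a buffer the size of mbox.200410.one while watching the process size should show whether the per-byte callback alone accounts for the growth; I have not run that measurement.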


