perl-unicode

CJK benchmarks

2002-02-17 15:20:35
Folks,

I wrote a quick-and-dirty benchmark script to test various encodings. The result was amazing. I compared Encode::JP and Encode::Tcl and found that Encode::Tcl is up to some 1000 times slower (encode(euc-jp) runs at 359.55/s versus 0.40/s)! Encode::Tcl is so slow that I don't even want to try more than 10 iterations. This script will be included in the next batch (unless jhi chooses to do so in the next bleadperl).

Dan

> perl5.7.2 t/benchmark.pl JP 1000
Benchmark: timing 1000 iterations of JP decode(7bit-jis), JP decode(euc-jp), JP encode(7bit-jis), JP encode(euc-jp)...
JP decode(7bit-jis): 14 wallclock secs (13.44 usr + 0.00 sys = 13.44 CPU) @ 74.42/s (n=1000)
  JP decode(euc-jp):  2 wallclock secs ( 2.39 usr + 0.00 sys = 2.39 CPU) @ 418.30/s (n=1000)
JP encode(7bit-jis): 18 wallclock secs (17.55 usr + 0.00 sys = 17.55 CPU) @ 56.96/s (n=1000)
  JP encode(euc-jp):  3 wallclock secs ( 2.78 usr + 0.00 sys = 2.78 CPU) @ 359.55/s (n=1000)
> perl5.7.2 t/benchmark.pl Tcl 10
Benchmark: timing 10 iterations of Tcl decode(7bit-jis), Tcl decode(euc-jp), Tcl encode(7bit-jis), Tcl encode(euc-jp)...
Tcl decode(7bit-jis):  3 wallclock secs ( 3.41 usr + 0.00 sys = 3.41 CPU) @ 2.93/s (n=10)
Loading /usr/home/dankogai/work/perl/ext/Encode/blib/lib/Encode/euc-jp.enc at /usr/home/dankogai/work/perl/ext/Encode/blib/lib/Encode.pm line 242
  Tcl decode(euc-jp):  2 wallclock secs ( 1.73 usr + 0.00 sys = 1.73 CPU) @ 5.79/s (n=10)
Tcl encode(7bit-jis): 30 wallclock secs (30.10 usr + 0.00 sys = 30.10 CPU) @ 0.33/s (n=10)
  Tcl encode(euc-jp): 25 wallclock secs (25.20 usr + 0.00 sys = 25.20 CPU) @ 0.40/s (n=10)
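
To put a number on the gap, the rates above can be divided test by test. This is only a rough sketch: the figures are copied by hand from the two runs, and the %rate hash and slowdown() helper are names made up for this illustration, not part of t/benchmark.pl. Keep in mind that the Tcl run used only 10 iterations, so its rates are coarse.

```perl
use strict;
use warnings;

# Iterations per second, copied by hand from the two benchmark runs above.
my %rate = (
    'decode(7bit-jis)' => { JP =>  74.42, Tcl => 2.93 },
    'decode(euc-jp)'   => { JP => 418.30, Tcl => 5.79 },
    'encode(7bit-jis)' => { JP =>  56.96, Tcl => 0.33 },
    'encode(euc-jp)'   => { JP => 359.55, Tcl => 0.40 },
);

# Hypothetical helper: how many times faster Encode::JP ran a given test.
sub slowdown {
    my ($op) = @_;
    return $rate{$op}{JP} / $rate{$op}{Tcl};
}

for my $op (sort keys %rate) {
    printf "%-18s Encode::Tcl is about %.0fx slower\n", $op, slowdown($op);
}
```

On these figures the gap is roughly 25x and 72x for decoding (7bit-jis and euc-jp) and roughly 173x and 899x for encoding, so the "1000 times" figure applies to encode(euc-jp) only.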

# t/benchmark.pl
# perl5.7.2 t/benchmark.pl SUBMODULE COUNT
use strict;
use blib;
use Benchmark;

use Encode;
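my $mod = shift or die "Usage: $0 SUBMODULE [COUNT]\n";
eval "use Encode::$mod;"; $@ and die "$mod:$@";
my $count = shift || 16;    # iterations per test; defaults to 16

# Reference data: the same table in UTF-8 and in EUC-JP.
my $utf8_str = swallow("t/table.utf8");
Encode::_utf8_on($utf8_str);    # mark the raw bytes as a character string
my $euc_str  = swallow("t/table.euc");
my $jis_str  = encode("7bit-jis", $utf8_str);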

timethese($count, {
    "$mod encode(euc-jp)" => sub {
        my $dummy = encode("euc-jp", $utf8_str);
        $dummy eq $euc_str or die;
    },
    "$mod decode(euc-jp)" => sub {
        my $dummy = decode("euc-jp", $euc_str);
        $dummy eq $utf8_str or die;
    },
    "$mod encode(7bit-jis)" => sub {
        my $dummy = encode("7bit-jis", $utf8_str);
        $dummy eq $jis_str or die;
    },
    "$mod decode(7bit-jis)" => sub {
        my $dummy = decode("7bit-jis", $jis_str);
        $dummy eq $utf8_str or die;
    },
});


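# Read a whole file into a scalar as raw bytes.
sub swallow {
    my $fn = shift;
    # Three-argument open keeps special characters in $fn from being
    # interpreted as a mode.
    open my $fh, '<', $fn or die "$fn : $!";
    binmode $fh;
    read $fh, my $result, -s $fn;
    close $fh;
    return $result;
}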
__END__

  • CJK benchmarks, Dan Kogai <=