MHonArc Users,
If you are not interested in performance issues with MHonArc, then
just delete this message. If you are interested, read on.
I ran Devel::DProf on MHonArc to generate some performance stats.
The tests were executed on a PentiumII 350Mhz 128MB RAM running RedHat
Linux 5.2 (with an upgraded kernel: 2.2.9). Note, RAM is not an issue
unless there is no more free RAM available and swapping must occur.
The main run was on the MHonArc list archives ranging from 1995-12
to 1999-05, totaling 2963 messages with the upcoming v2.4.0 of MHonArc.
The following run uses the default settings of MHonArc:
Total Elapsed Time = 166.8272 Seconds
User Time = 156.9349 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
25.7 40.43 80.087 120873 0.0003 0.0007 mhonarc::replace_li_var
11.9 18.66 19.369 2992 0.0062 0.0065 m2h_text_plain::filter
6.66 10.45 10.182 79383 0.0001 0.0001 mhonarc::compute_msg_pos
6.26 9.827 9.819 3030 0.0032 0.0032 readmail::MAILread_file_header
6.08 9.537 9.529 3194 0.0030 0.0030 readmail::MAILread_header
5.52 8.668 12.971 62718 0.0001 0.0002 readmail::MAILdecode_1522_str
5.36 8.404 63.535 2963 0.0028 0.0214 mhonarc::output_mail
4.61 7.227 6.858 106405 0.0001 0.0001 mhonarc::urlize
3.76 5.897 28.851 2974 0.0020 0.0097 mhonarc::htmlize_header
3.24 5.088 4.347 212227 0.0000 0.0000 mhonarc::get_time_from_index
3.05 4.787 12.587 11355 0.0004 0.0011 mhonarc::mailUrl
2.63 4.134 8.327 3 1.3779 2.7758 mhonarc::sort_messages
2.07 3.249 2.990 74187 0.0000 0.0000 mhonarc::htmlize
1.83 2.879 2.441 125493 0.0000 0.0000 readmail::load_charset
1.71 2.679 2.526 43899 0.0001 0.0001 mhonarc::exclude_field
Only the date and thread indexes were generated.
mhonarc::replace_li_var took the most total time, however each call
is fairly quick. Since mhonarc::replace_li_var evaluates resource
variables in resources, there is very little that can be done to reduce
the number of calls to the function. If you think it can be in-lined,
that would be very difficult; mhonarc::replace_li_var is quite lengthy
(see mhrcvars.pl).
Now, the following runs are to test the improvements in subject and
author sorting in v2.4.0 from v2.3.3. The sort tests were done
on the archive created in the main test by using the -scan option
with the appropriate sorting option (either -subsort or -authsort).
Subject Sorting (-scan -subsort): v2.3.3
Total Elapsed Time = 6.289467 Seconds
User Time = 5.909467 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
44.0 2.604 2.738 1 2.6038 2.7377 mhonarc::sort_messages
24.8 1.470 1.740 3 0.4900 0.5799 mhonarc::BEGIN
13.0 0.770 0.758 2963 0.0003 0.0003 mhonarc::time2mmddyy
5.42 0.320 0.308 2963 0.0001 0.0001 mhonarc::extract_email_name
5.35 0.316 4.120 1 0.3163 4.1202 mhonarc::scan
2.71 0.160 0.134 6542 0.0000 0.0000 mhonarc::get_time_from_index
2.03 0.120 0.120 1 0.1200 0.1200 mhonarc::get_last_msg_num
0.85 0.050 0.050 2 0.0250 0.0250 AutoLoader::AUTOLOAD
0.85 0.050 0.070 1 0.0500 0.0698 mhonarc::initialize
0.34 0.020 0.060 10 0.0020 0.0060 Getopt::Long::BEGIN
0.17 0.010 0.010 1 0.0100 0.0100 mhonarc::create_routines
0.17 0.010 0.010 1 0.0100 0.0100 mhonarc::mhinit_vars
0.17 0.010 0.010 5 0.0020 0.0020 Exporter::export
0.17 0.010 0.010 6 0.0017 0.0017 vars::import
0.17 0.010 0.010 3 0.0033 0.0033 Getopt::Long::FindOption
Subject Sorting (-scan -subsort): v2.4.0
Total Elapsed Time = 3.839134 Seconds
User Time = 3.699134 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
40.0 1.480 1.689 10 0.1480 0.1689 mhonarc::BEGIN
19.1 0.710 0.695 2963 0.0002 0.0002 mhonarc::time2mmddyy
11.2 0.417 0.625 1 0.4173 0.6246 mhonarc::sort_messages
9.46 0.350 0.335 2963 0.0001 0.0001 mhonarc::extract_email_name
7.84 0.290 1.945 1 0.2904 1.9453 mhonarc::scan
6.49 0.240 0.207 6542 0.0000 0.0000 mhonarc::get_time_from_index
2.97 0.110 0.110 1 0.1100 0.1100 mhonarc::get_last_msg_num
1.89 0.070 0.080 1 0.0700 0.0797 mhonarc::initialize
1.35 0.050 0.050 2 0.0250 0.0250 AutoLoader::AUTOLOAD
0.81 0.030 0.050 10 0.0030 0.0050 Getopt::Long::BEGIN
0.27 0.010 0.010 1 0.0100 0.0100 mhonarc::create_routines
0.00 0.000 -0.000 1 0.0000 - AutoLoader::import
0.00 0.000 -0.000 4 0.0000 - Exporter::export
0.00 0.000 -0.000 1 0.0000 - Getopt::Long::ConfigDefaults
0.00 0.000 -0.000 3 0.0000 - Exporter::import
[NOTE: Apparently Devel::DProf does not list the routine calls of
sort compare routines passed to Perl's sort(), but it does list
any functions called within the compare routine. The culprit in
the slower performance in v2.3.3 (and earlier versions) is
how the comparison function does its job. ]
Once can see that the speed increase is quite significant. This
will be beneficial for users that generate subject indexes.
Author Sorting (-scan -authsort): v2.3.3
Total Elapsed Time = 10.85953 Seconds
User Time = 10.69957 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
56.6 6.058 5.777 70699 0.0001 0.0001 mhonarc::extract_email_name
20.1 2.154 8.278 1 2.1538 8.2777 mhonarc::sort_messages
13.4 1.440 1.680 3 0.4800 0.5599 mhonarc::BEGIN
7.76 0.830 0.818 2963 0.0003 0.0003 mhonarc::time2mmddyy
6.54 0.700 0.595 26294 0.0000 0.0000 mhonarc::get_time_from_index
2.86 0.306 9.650 1 0.3063 9.6502 mhonarc::scan
1.03 0.110 0.110 1 0.1100 0.1100 mhonarc::get_last_msg_num
0.47 0.050 0.050 2 0.0250 0.0250 AutoLoader::AUTOLOAD
0.47 0.050 0.070 1 0.0500 0.0698 mhonarc::initialize
0.19 0.020 0.020 2 0.0100 0.0100 lib::BEGIN
0.19 0.020 0.040 10 0.0020 0.0040 Getopt::Long::BEGIN
0.09 0.010 0.010 1 0.0100 0.0100 mhonarc::create_routines
0.09 0.010 0.010 5 0.0020 0.0020 Exporter::export
0.09 0.010 0.010 1 0.0100 0.0100 mhonarc::mhinit_vars
0.09 0.010 0.010 2 0.0050 0.0050 Time::Local::BEGIN
Author Sorting (-scan -authsort): v2.4.0
Total Elapsed Time = 4.700067 Seconds
User Time = 4.565324 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
31.7 1.450 1.689 10 0.1450 0.1689 mhonarc::BEGIN
17.9 0.818 1.657 1 0.8183 1.6567 mhonarc::sort_messages
16.6 0.760 0.747 2963 0.0003 0.0003 mhonarc::time2mmddyy
15.7 0.720 0.602 26294 0.0000 0.0000 mhonarc::get_time_from_index
11.1 0.510 0.483 5926 0.0001 0.0001 mhonarc::extract_email_name
8.17 0.373 3.023 1 0.3733 3.0233 mhonarc::scan
2.63 0.120 0.120 1 0.1200 0.1200 mhonarc::get_last_msg_num
1.31 0.060 0.080 1 0.0600 0.0797 mhonarc::initialize
1.10 0.050 0.050 2 0.0250 0.0250 AutoLoader::AUTOLOAD
0.66 0.030 0.050 10 0.0030 0.0050 Getopt::Long::BEGIN
0.22 0.010 0.010 1 0.0100 0.0100 mhonarc::OSinit
0.22 0.010 0.010 1 0.0100 0.0100 mhonarc::create_routines
0.22 0.010 0.010 2 0.0050 0.0050 Time::Local::BEGIN
0.00 0.000 -0.000 1 0.0000 - Getopt::Long::ConfigDefaults
0.00 0.000 -0.000 3 0.0000 - Exporter::import
Author sorting speed approvement is quite drastic. The big reduction
is in the number of calls to mhonarc::extract_email_name. In v2.4.0
mhonarc::extract_email_name is called only once per message in
mhonarc::sort_messages (the reason it shows up twice the number of
messages above is that it is called for each message when generating
the -scan output). In v2.3.3, mhonarc::extract_email_name is called
67736 (70699 - 2963) times within mhonarc::sort_messages.
--ewh
P.S. I am unsure what mhonarc::BEGIN is all about. v2.3.3 does
not even defined any BEGIN blocks, and v2.4.0 only has one.