MHonArc memory usage (was Re: Mhonarc question... )

1998-01-20 17:36:37
Someone here at Stanford messed around with HyperMail a while back and said
it was a terrible memory hog - it had to get all the messages it wanted in
an archive into memory, then start processing them. 

Is MHonArc better at dealing with memory usage?  Do you have any info on
how much DRAM I would want for a mailing list with X messages?  Is the
memory usage growth linear?  Is there any way to limit it?  Anyway, you get
the general idea - any info you can give about its resource usage would be
greatly appreciated.

MHonArc is not the nicest when dealing with memory, but there are
some options available to the user to try to minimize memory usage.
(Note: Perl itself has its own memory overhead.)

The default behavior of mhonarc is to load all input into memory before
doing the disk writes (the exception is MIME attachments that are
written to separate files; the attachment data is not kept in memory,
but it is loaded into memory before being written to disk).  To keep
message data from eating up memory, the -savemem option can be used to
force mhonarc to do initial writes of message data to disk.  This should
reduce memory usage, but more disk I/O is required since another pass
over the files is needed to add the navigation links.  Regardless, the
data used for creating the index pages and threads is always loaded into
memory.  An idea of the size of that data can be obtained by looking at
the .mhonarc.db file that is created.
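
As a rough illustration of the -savemem approach, here is a minimal
Python sketch that builds an mhonarc command line and then reports the
size of .mhonarc.db.  The helper names (build_command, db_size, convert)
and the "archive" and "inbox" paths are made up for this example, and it
assumes mhonarc is on the PATH.

```python
import os
import subprocess


def build_command(outdir, inputs, savemem=True):
    """Build an mhonarc command line as an argument list (hypothetical helper)."""
    cmd = ["mhonarc"]
    if savemem:
        # Ask mhonarc to write message data to disk early instead of
        # holding it all in memory.
        cmd.append("-savemem")
    cmd += ["-outdir", outdir]
    cmd += list(inputs)
    return cmd


def db_size(outdir):
    """Return the size in bytes of outdir/.mhonarc.db, or None if it is missing."""
    path = os.path.join(outdir, ".mhonarc.db")
    try:
        return os.path.getsize(path)
    except OSError:
        return None


def convert(outdir, inputs, savemem=True):
    """Run mhonarc, then return the database size as a rough memory indicator."""
    subprocess.run(build_command(outdir, inputs, savemem), check=True)
    return db_size(outdir)
```

Calling convert("archive", ["inbox"]) would run mhonarc once and return
the database size, which gives only a lower bound on the index and
thread data that later runs must load.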

Another technique is to call mhonarc individually, in sequence, for
each message (or a set number of messages) to be added to an archive.
This will reduce memory usage since memory is released when each
program invocation exits.  However, total processing time will increase
due to the overhead of stopping and starting mhonarc.  Also, .mhonarc.db
will still increase in size, so memory usage will still increase over
time as the archive gets larger (but hopefully within acceptable limits).
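
The batching technique could be scripted along these lines.  This is
only a sketch: chunked and add_in_batches are hypothetical helpers, the
batch size is arbitrary, and it assumes each input is a message or
folder file that mhonarc -add accepts as an argument.

```python
import subprocess


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def run_checked(cmd):
    """Default runner: execute the command and raise if it fails."""
    subprocess.run(cmd, check=True)


def add_in_batches(outdir, files, batch_size, runner=run_checked):
    """Invoke mhonarc -add once per batch so memory is released between runs."""
    for batch in chunked(list(files), batch_size):
        runner(["mhonarc", "-add", "-outdir", outdir] + batch)
```

A smaller batch size lowers the peak memory of each run at the cost of
more process start-ups, and every run still has to load .mhonarc.db.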

Some users choose to create multiple archives based upon a well-defined
time period (e.g., monthly), so that no single archive gets too large
and incremental updates are performed in acceptable times.  The mhonarc
mailing list archive is set up in such a manner.
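
One way to organize period-based archives is to derive the output
directory from each message's date.  The sketch below assumes a
hypothetical layout of one directory per month named YYYY-MM under a
common root; it is not a description of how the mhonarc list archive is
actually laid out.

```python
import datetime
import os


def monthly_outdir(root, when):
    """Map a date to a per-month archive directory, e.g. root/1998-01."""
    return os.path.join(root, when.strftime("%Y-%m"))


def group_by_month(root, dated_messages):
    """Group (date, path) pairs by the monthly archive they belong to."""
    groups = {}
    for when, path in dated_messages:
        groups.setdefault(monthly_outdir(root, when), []).append(path)
    return groups
```

Each resulting group can then be passed to its own mhonarc run with the
matching -outdir, so every archive keeps a separate, smaller database.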

Single large archives can be used if the archive is truly an archive
and will not get modified at all, or only rarely.  This way message
threads will not lose continuity, as they can when split across
multiple smaller archives.

This message is being cc'ed to the mhonarc mailing list since users
may want to contribute system configuration and memory stats on
using mhonarc.


  • MHonArc memory usage (was Re: Mhonarc question... ), Earl Hood <=