procmail

Re: Some ideas to improve performance

2004-11-24 07:25:58
Ruud H.G. van Tol wrote:
[...]
That is what you think that you need. I know you don't.

Another few questions/suggestions that come to mind:

1. Is whatever procmail is handing the messages *to* capable of handling more than one at a time? Would an optimized "procmaild" actually be able to hand off the messages any faster? If mhonarc is "a Perl program for converting mail or news messages into HTML archives", is it going to be able to handle multiple simultaneous messages?

2. If ALL of the 10,000 messages are ultimately being handed off to the same receiving program, is procmail really necessary? Could they simply be handed off by formail? Is there another program that could convert the existing file to the UUCP mailbox or MH mail folders required by mhonarc?

3. If only a SUBSET of the 10,000 are to be processed, then perhaps use formail + procmail to sort them into folders as individual message files in an initial pass, and have the "other thing" running in parallel on the resulting files as they're sorted? In other words, have the receiving program run iteratively over the sorted folder. (This might have the additional advantage of being kept running on an ongoing basis: procmail receives incoming messages and places them in an appropriate folder, then the other program is run periodically to process them as needed.)

4. Is the need to sort 10,000 messages like this a recurring requirement, or is this just a one-time thing?
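
To make points 2 and 3 a bit more concrete, here is a rough sketch of the "sort into folders as individual message files" pass. It is not formail + procmail: it swaps in Python's standard mailbox module instead, and it assumes the 10,000 messages sit in a single mbox-style file. The function names and the subject-tag rule are made up for illustration; a real selection rule would depend on the messages.

```python
import mailbox


def sort_mbox_into_mh(mbox_path, mh_root, choose_folder):
    """Copy messages from an mbox file into MH folders under mh_root.

    choose_folder(msg) returns a folder name, or None to skip the message.
    Returns a dict mapping each folder name to the number of messages stored.
    """
    source = mailbox.mbox(mbox_path, create=False)
    dest = mailbox.MH(mh_root, create=True)
    counts = {}
    try:
        for msg in source:
            name = choose_folder(msg)
            if name is None:
                continue  # not part of the subset we care about
            try:
                folder = dest.get_folder(name)
            except mailbox.NoSuchMailboxError:
                folder = dest.add_folder(name)
            folder.add(msg)  # MH stores each message as its own file
            counts[name] = counts.get(name, 0) + 1
    finally:
        source.close()
        dest.close()
    return counts


def by_subject_tag(msg):
    """Hypothetical rule: keep only messages whose Subject carries a list tag."""
    subject = str(msg.get("Subject", ""))
    return "procmail" if "[procmail]" in subject else None
```

Something like sort_mbox_into_mh("all.mbox", "sorted", by_subject_tag) would leave the selected messages as individual files under sorted/procmail, and the receiving program could then be pointed at that folder on whatever schedule suits it.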

I'm in the process of re-training various Bayes tools with a collection of 14,000+ spam and 6,000+ ham, so am sympathetic. However, in my (admittedly limited) experience, procmail hasn't been the bottleneck when calling other programs to do such work.


- Bob




____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail