Richard asked,
| I have many large mailboxes (a few thousand messages in each) and I'd like to
| be able to split them quickly (one-message-per-file style) for subsequent
| searching/processing. I thought to use formail but it turns out that
| splitting with formail and feeding to procmail to do the file writing takes a
| *very* long time.
You didn't say exactly how you're invoking formail and procmail, so perhaps
they can be sped up, but maybe you could use csplit instead?
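For what it's worth, here is a rough sketch of the csplit idea. I'm assuming GNU csplit (the -z option and the '{*}' repeat count are GNU extensions), and split_mbox is just a name I made up for illustration. It also assumes that body lines beginning with "From " have been escaped as ">From ", as they normally are in mbox files; otherwise a message would be cut in two at such a line. Note that csplit numbers the pieces from 00000, not 00001.

```shell
#!/bin/sh
# split_mbox MAILBOX  (hypothetical helper name)
# Writes one file per message: MAILBOX.00000, MAILBOX.00001, ...
# Assumes GNU csplit and an mbox whose body "From " lines are escaped.
split_mbox() {
    # -s  do not print the byte count of each piece
    # -z  drop the empty piece that precedes the first "From " line
    # -n  use five digits in the numeric suffix
    # -f  use "MAILBOX." as the output file prefix
    csplit -s -z -n 5 -f "$1." "$1" '/^From /' '{*}'
}
```

Calling split_mbox "$mailbox" inside the same kind of for loop used in the scripts here would then split each mailbox without starting a process per message.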
What seems to slow things down the most is procmail's locking attempts.
This code:
#!/bin/sh
export mailbox
for mailbox in pattern
do FILENO=00001 formail -ns sh -c 'cat > "$mailbox.$FILENO"' < "$mailbox"
done
despite the invocations of sh and cat, ran much faster than
#!/bin/sh
export mailbox
for mailbox in pattern
do FILENO=00001 formail -ns \
procmail -pm DEFAULT="$PWD/$mailbox."'$FILENO' /dev/null < "$mailbox"
done
(and I'm still not sure why I needed to give a full path for $DEFAULT; I had
expected procmail to take $PWD as the starting point when given the -m option).
But the fastest thing I tried was to use procmail while preventing the locking.
Where .splitrc had this recipe (a plain :0 with no trailing colon, so no local
lockfile is requested),
:0
$mailbox.$FILENO
this ran like the wind in comparison:
#!/bin/sh
export mailbox
for mailbox in pattern
do FILENO=00001 formail -ns procmail -pm ./.splitrc < "$mailbox"
done