I'm on a few mailing lists populated by certain people, whom I'll call
"pundits", who post the same message/article to multiple lists (as
separate messages). I've grown tired of reading their contributions in
duplicate/triplicate and, frankly, want to relegate them to a separate,
lower-priority folder for less frequent review. To do this, I came up
with the following recipe:
:0:
* PUNDIT ?? ^^YES^^
{
  T400=`formail -I '' | tr -c '[:alpha:][:digit:]' '_' |
        tr -s '_' | head -c 400`

  :0
  * !? echo "Message-ID: $T400" | formail -D 40101 $HOME/.pundit.cache
  pundit-mail

  :0E
  /dev/null
}
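The normalization step in the recipe can be tried on its own, outside procmail. The following is only a sketch: body_key is a hypothetical helper name (not part of procmail or formail), and it leaves out the formail -I '' step, so it expects the body alone on standard input. It shows that two copies of a text that differ only in spacing and blank lines collapse to the same key.

```shell
# body_key: hypothetical helper mirroring the recipe's pipeline.
# Every non-alphanumeric byte becomes '_', runs of '_' are squeezed
# to one, and the result is cut to at most 400 bytes.
body_key() {
    tr -c '[:alpha:][:digit:]' '_' | tr -s '_' | head -c 400
}

# Same text, different spacing and blank lines.
key_a=$(printf 'Hello,  world!\nSame text.\n' | body_key)
key_b=$(printf 'Hello, world!\n\nSame text.\n' | body_key)

if [ "$key_a" = "$key_b" ]; then
    echo "same key: $key_a"
else
    echo "different keys"
fi
```

Because newlines are remapped too, the key is a single line, which is what lets it pass as a header value.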
If the message has been determined to be from (or to refer to) a "pundit",
then PUNDIT=YES. In that event, we take roughly the first 400 characters
of the body of the message and deposit them into the variable $T400. Note
that we convert all non-alphanumerics to '_' and then squeeze each run of
'_' down to a single character. The choice of remapping character is
unimportant. Once we have a string that is representative of the
message, we prefix it with Message-ID: and feed that into formail -D
to see if we've seen this message prefix before. If this is the
first occurrence, we deposit the message into pundit-mail, otherwise
it is ditched into /dev/null. The size of the cache (40101) is tuned
to ensure that we cache at least the last 100 messages (proof left
to the reader). Note: we limit the string length to 400 to step around
potential problems with LINEBUF, shell environment variable size limits
and so on. It could likely be set to a somewhat larger value without
problems.
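One possible reading of the cache size, offered as a guess rather than as the intended proof: if each cache entry costs the key plus a single separator byte (an assumption about how formail -D lays out its cache file, not something stated above), then 100 full-length keys need the amount computed below, and 40101 is one byte more than that.

```shell
key_len=400      # longest key the recipe can produce (head -c 400)
terminator=1     # ASSUMPTION: one separator byte per cache entry
entries=100      # number of messages we want remembered

needed=$(( entries * (key_len + terminator) ))
echo "$needed"   # prints 40100
```

Shorter bodies produce shorter keys, so under the same assumption the cache would hold more than 100 entries in practice.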
Comments? Suggested improvements?
____________________________________________________________
procmail mailing list Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail