procmail

Part 2 of: Simple recipe to move uninteresting threads in separate mailbox

2007-01-13 00:30:33
Greetings, everybody.

Since Jan 1st I have been testing the recipe that Sean created to help me
(see archives), in the version below, which is the last one he posted. The
only difference from where we left off is that I have not yet configured my
MUA to create another cache file via formail, because I'm having problems
on that side which are irrelevant here.

Here are some general considerations and a report from the field:

0) To Sean: LinuxJournal gives 100 USD to authors of useful Linux
   tips, and I think you really deserve all of it for this
   recipe. Contact me off list to know how to proceed if you're
   interested. Also, because this is too good to keep from people,
   if you don't want to do it (and explicitly say so on list), I will :-)

1) The recipe works great! Since Jan 1st I have received:

    948 spam messages
   2550 irrelevant messages filed separately by this recipe!!!

   In other words, this recipe has already saved me more than 2.5 times
   as much *time* as all my antispam measures, and it could be even
   better, see below.

2) A couple of general considerations, more for "documentation"
   purposes than anything else, really:

   There is no (easy) undo. If you tag a message as irrelevant and
   then realize it wasn't, you have to stop email processing, open the
   message to get its Message-Id and manually remove that string from
   all the caches used by the recipe. Not a problem for me but beware!

   Obviously, this saves no bandwidth at all. You still have to
   download all the irrelevant messages to the box where procmail runs
   before discarding them. So this recipe will save you *lots* of
   time, but no money if procmail runs at the receiving end of a
   metered connection.
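
   The manual undo described above could be scripted roughly as
   follows. This is only a sketch and not part of Sean's recipe: the
   function name forget_msgid is made up, and it assumes the caches are
   the NUL-terminated lists of Message-IDs that formail -D writes, with
   each ID stored exactly as it appears in the header (angle brackets
   included). Mail processing should still be stopped, or the recipe's
   lockfile held, while it runs.

```shell
#!/bin/sh
# forget_msgid ID CACHE...
# Remove one Message-ID from each cache file given (hypothetical helper).
# Assumption: every cache is a list of NUL-terminated entries, as written
# by "formail -D"; an empty entry (two NULs in a row) is left untouched.
forget_msgid() {
    id=$1
    shift
    for cache in "$@"; do
        [ -f "$cache" ] || continue
        tmp="$cache.tmp.$$"
        # NUL -> newline, drop the line that is exactly the ID, newline -> NUL
        tr '\000' '\n' < "$cache" | grep -vxF -e "$id" | tr '\n' '\000' > "$tmp"
        # Write back in place so ownership and permissions are preserved
        cat "$tmp" > "$cache"
        rm -f "$tmp"
    done
}
```

   Usage would be something like: forget_msgid '<id@example.org>' ignore*.cache
   where the Message-ID is copied from the message that was tagged by mistake.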

3) Improvements: I have no idea how to improve performance. Comments?
   (It creates _no_ performance problems on my PC, I'm just curious.) That
   said, the only real limit here is the fact that too many people
   must or want to run broken email clients which don't add I-R-T or
   References headers.

   So I tag a message from my MUA, procmail sees its Message-ID in
   ignore.mua.cache and stops all the sub-threads linked to that I-R-T
   or References string. But if some "differently able" user replies
   without those headers, and 50 people reply to him, I see that whole
   sub-thread and have to retag it manually. And do it again when that
   person replies again. This is the only thing missing from an
   already excellent recipe. As I understand it, we need another cache of
   subject lines here, right? So that if there is no I-R-T but the subject
   is one that was labeled as irrelevant, the recipe does the same job:
   add the MsgId of this message to the cache and file it away.
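
   One way such a subject cache might look is sketched below. Everything
   in it is an assumption for illustration: the function names
   (normalize_subject, ignore_subject, subject_is_ignored), the file name
   ignore.subjects, and the choice of a plain text file with one
   normalized subject per line instead of a formail cache. It only
   answers "was this subject labeled as irrelevant?"; adding the MsgId to
   the cache and filing the message would still be done by the existing
   recipe, for example from a program condition. I have not tried it
   under procmail itself.

```shell
#!/bin/sh
# Hypothetical helpers for a subject-based ignore list.

# Strip any number of leading "Re:" prefixes and surrounding whitespace,
# then lowercase, so "Re: RE: Foo" and "foo" map to the same key.
normalize_subject() {
    printf '%s\n' "$1" |
        sed -e ':a' -e 's/^[[:space:]]*[Rr][Ee]:[[:space:]]*//' -e 'ta' \
            -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
        tr '[:upper:]' '[:lower:]'
}

# subject_is_ignored SUBJECT FILE: succeed if the normalized subject
# is already listed in FILE (exact, whole-line match).
subject_is_ignored() {
    [ -f "$2" ] || return 1
    key=$(normalize_subject "$1")
    [ -n "$key" ] || return 1
    grep -qxF -e "$key" "$2"
}

# ignore_subject SUBJECT FILE: add the normalized subject to FILE
# unless it is already there.
ignore_subject() {
    key=$(normalize_subject "$1")
    [ -n "$key" ] || return 1
    subject_is_ignored "$1" "$2" || printf '%s\n' "$key" >> "$2"
}
```

   The obvious risk of matching on subjects is false positives: a later,
   unrelated thread that reuses the same subject would be filed away too,
   so the file probably needs pruning from time to time.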

Thanks again to Sean for this great help!

       Marco

#============================================================================
# simple recipe to ignore threads based on prior cache of threads to ignore.
# 20061230, SBS

# get In-Reply-To messageid, check to see if it is in the ignore cache or
# in the mua_ignore cache.  formail stores cache with ascii-z terminations,
# but grep will still match the binary file.
# if we have a match in the MUA id file or current cache, ADD the messageid
# of THIS message to the cache, so that replies to it will also be ignored.

# ensure these are blank, not set to something you might have used them for
# previously
REFS=
REFSNL=

# The bracket expressions in the conditions below hold one space and one tab.
:0
* ^In-Reply-To:.*\/[^ 	].*
{
         # Assign the results to REFS
         REFS=${MATCH}
}

:0
* ^References:.*\/[^ 	].*
{
         # Append the results to REFS
         # no consideration as to whether REFS was null or not.
         REFS="${REFS} ${MATCH}"
}

# by doing this ONLY if REFS contains non-whitespace, we spare
# ourselves the overhead of the pipe chain invocation when it isn't
# needed (i.e. messages with no references).  Arguably, REFS shouldn't
# be set at all if the headers are empty, but this check is cheap to perform
:0
* REFS ?? [^ 	]
{
         # split on spaces and tabs, then keep only <...> tokens
         REFSNL=`echo "$REFS" | tr -s " \t" "\n\n" | \
                 sed -e '/^\([^<].*\|.*[^>]\|\)$/ d'`
}

:0hc:ignore.cache$LOCKEXT
* REFSNL ?? .
* ? grep -qF "$REFSNL" ignore*.cache
| formail -D 40000 ignore.cache

# if the preceding conditions matched, then file this message
# away as irrelevant.
:0A:
irrelevant.threads

#============================================================================
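
To see outside procmail what the REFS to REFSNL step and the cache lookup
do, here is a small standalone sketch. The function names refs_to_lines
and any_ref_cached are invented for this example. The sed expression is a
portable rewrite meant to keep the same lines as the one in the recipe
(it avoids the \| alternation, which not every sed supports), and the
lookup is a variation rather than a copy: it turns the NULs of the cache
into newlines and matches whole lines, which is stricter than the
substring grep used above.

```shell
#!/bin/sh
# refs_to_lines "HEADER VALUES": print one <message-id> per line,
# dropping every token that is not wrapped in angle brackets.
refs_to_lines() {
    printf '%s\n' "$1" | tr -s ' \t' '\n\n' | sed -e '/^<.*>$/!d'
}

# any_ref_cached "ID-LIST" CACHE...: succeed if any ID in the
# newline-separated list is an entry of any of the given caches.
# Assumption: caches hold NUL-terminated entries as written by formail -D.
any_ref_cached() {
    refs=$1
    shift
    [ -n "$refs" ] || return 1
    for cache in "$@"; do
        [ -f "$cache" ] || continue
        # Each line of $refs is treated by grep as a separate fixed pattern
        if tr '\000' '\n' < "$cache" | grep -qxF -e "$refs"; then
            return 0
        fi
    done
    return 1
}
```

For example, any_ref_cached "$(refs_to_lines "$REFS")" ignore*.cache would
succeed when at least one referenced Message-ID is already in a cache.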


-- 
The right way to make everybody love Free Standards and Free Software:
http://digifreedom.net/node/73

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail

  • Part 2 of: Simple recipe to move uninteresting threads in separate mailbox, M. Fioretti <=