procmail

Re: Catch-all list filter

2008-01-03 13:52:01
In message <477D3AD5(_dot_)80005(_at_)dd-b(_dot_)net>,
        David Dyer-Bennet (dd-b(_at_)dd-b(_dot_)net) wrote:
N.J. Mann wrote:
In message <477C179E(_dot_)4050504(_at_)dd-b(_dot_)net>,
    David Dyer-Bennet (dd-b(_at_)dd-b(_dot_)net) wrote:
  
[...]
Anyway, what I'm trying so far is:


    :0
    * ^List-Id: \/.*$
    {
        gname=` echo $MATCH | tr -d '<>' | tr -t '. /' '_' `
        LOG="gname $gname
    "
    

You have already had an action for this recipe, so you need to start a
new recipe before your second action.

          :0
  
        .Auto.$gname/
    }
    

Either an assignment isn't an action line, in which case ".Auto.$gname/" 
is the first action line in that recipe, or else I've already had *two* 
action lines (the two assignments, both of which worked).   So that 
can't be the explanation.

The opening brace is the action.  See the "Recipe action line" section
of the procmailrc(5) manual page.  Assignments don't count and may
appear (almost) anywhere.
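
A minimal sketch of the layout that follows from this, assuming the same
List-Id condition as above (the single tr call is a simplification for
illustration, not the original pipeline).  The brace is the action of
the outer recipe, the assignment sits freely inside the block, and the
delivery gets a recipe of its own:

    :0
    * ^List-Id: \/.*$
    {
        # Assignment: not an action, so it needs no ":0" of its own.
        gname=`echo "$MATCH" | tr -d '<>' | tr './ ' '___'`

        # Delivery is an action, so it starts a new (nested) recipe.
        :0
        .Auto.$gname/
    }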

I haven't tested yet, but I'm perfectly willing to believe you've 
pinpointed the *problem* (and your solution will no doubt work).  But 
the explanation is internally inconsistent, so it doesn't help me 
*understand* yet, and avoid such mistakes in the future.

Sorry about that, does it make sense now?

I'm poking at 
the man pages to see if I can get the details straight in my head, with 
the right names attached (I keep calling things the wrong names thinking 
about these bits, which partly explains my confusion), but I wanted to 
get a response out rather than just leaving your useful suggestion 
sitting there.

Anyway, thank you! 

Your echo/tr/tr system may work, but an all-procmail solution may be
better.  A number of solutions to the "filing list mail" problem have
been posted to this list in the past, so it is worth searching the
archives.  My own solution is as follows:

    :0
    * 9876543210^0 ^List-Id:.*<?\/[a-z0-9_-]+[.]
    * 9876543210^0 ^List-Post:.*<?\/[a-z0-9-]+[@]
    * 9876543210^0 ^Delivered-To:.*<?\/[a-z0-9-]+[@]
    {
        :0
        * MATCH ?? ^^\/[a-z0-9_-]+
        { LISTNAME = $MATCH }

        # special processing for mailman messages (e.g. monthly
        # subscription reminders) removed from here to simplify example

        :0
        $LISTNAME/
    }

This handles all but one of the over fifty mailing lists I receive mail
from.
  

I was unhappy about the cpu cost of forking the multiple external 
programs (though...it's not my cpu :-)).  So looking at some 
alternatives is definitely something I'm interested in.   I was thinking 
of writing an external program to do more precisely what I wanted, but a 
pure-procmail solution that does something good enough may be the way to 
go.

I have found that sometimes you just have to call an external programme,
but I try to do that as infrequently as possible.

The <bignum>^0 notation is scoring, right?  I need to read up more on 
that to figure out what that's accomplishing in your recipe.

Yes, this is using scoring.  The big number means that if the line
matches, the scoring threshold is reached immediately, since the number
is greater than 'plus infinity' (see the MISCELLANEOUS section of the
procmailsc(5) manual page).

By using scoring in this way we get a logical OR in a recipe - procmail
only has logical AND between conditions, so a bit of improvisation is
needed. :-)  My recipe greps the List-Id header (if there is one) first;
if that fails it greps the List-Post header, and if that fails too it
greps the Delivered-To header.
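
Stripped down to the bare idiom, a sketch might look like this (the
header patterns and the folder name are made up for illustration).  Each
condition on its own pushes the score to the maximum, so any one of them
is enough; if none matches, the score stays at 0 and the recipe does not
fire:

    :0
    * 9876543210^0 ^Subject:.*\[example-list\]
    * 9876543210^0 ^Sender:.*owner-example-list
    example-list/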

Why the "[@]" rather than just "@" (second two regexps) (or "[.]" in the 
first regexp)?  I don't see how a character class that matches only a 
single character differs from that character appearing literally?

. (dot) is special in procmail regexps and so needs escaping when you
want to match a real dot.  Here I use the character class delimiters to
escape it - it also makes it stand out so that _I_ notice that is what I
will get.  As for the [@], I could probably just have used an @ on its
own, but, again, it does highlight for me, the human with the poor
memory, what the final character of the match is.
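
For comparison, a sketch of the two spellings side by side.  As far as I
know both match a literal dot, and the choice is purely about
readability (the action is just a placeholder assignment):

    # dot escaped with a character class
    :0
    * ^List-Id:.*<?\/[a-z0-9_-]+[.]
    { LISTNAME = $MATCH }

    # dot escaped with a backslash
    :0
    * ^List-Id:.*<?\/[a-z0-9_-]+\.
    { LISTNAME = $MATCH }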

I actually expected you to ask why there is a question mark after the <
since RFC2919 states that the angle brackets are mandatory.  However, in
the real world there is mailing list software (possibly misconfigured)
which does not enclose the list-id in angle brackets. :-(

And looking at yours has already told me something about why mine didn't 
match as far in as I had expected it to, so that's useful!

I'm actually going to have to put in explicit recipes to *keep* some 
mailing lists from going into special directories -- a number of 
announce lists I want to show up in my normal email, so I'll read them.  
I find this amusing.

You can do that once you have LISTNAME set, but before the delivery
recipe, e.g.

  :0:
  * LISTNAME ?? ^free-pizza-tonight-announce-list$
  $DEFAULT

  :0
  $LISTNAME/

(My DEFAULT mailbox is the system one which is mbox format, hence
locking is required.)

Again I hope the above helps.


Cheers,
       Nick.
-- 

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
