
Re: Filtering Yahoo! groups using $MATCH on subject line

2003-05-27 14:50:23
At 16:20 2003-05-27 -0400, Birl wrote:
I searched the archives for this, but couldn't find a good answer.

Look deeper, Grasshoppah.

I have a few filters already in place, but I want to throw this in a
catchall.

I was looking at MATCH in procmailrc file and it mentions that everything
to the right of the \/ is captured.  Sounds like greedy pattern matching
to me.

Is there any way to stop the MATCH short of eoln?  I'm thinking if I place
(a) extra character(s) after \/, it will stop matching.  Am I correct in
this assumption?

Arguably, you could use the Mailing-List header to accomplish this task, and that won't be complicated by reply prefixes.

Keep in mind that Re: comes in many forms, and thus such response headers (which will be prefixed to the bracketed listname) will cause grief.


* ^(To|Cc):.*@yahoogroups\.com

A Bcc will send you for a tailspin.

# Subjects usually start off with "[groupname] subject title" (without quotes)
 * ^Subject:.[\/]

This is syntactically incorrect, as it doesn't allow for anything between the brackets, and also doesn't allow for anything more than a SINGLE character before them. So this won't match:

        Re: [somelistname] subject text

At a minimum, you'd need:

        ^Subject:.[\/.*]

But that's no good either, since it'll include the closing bracket in the match text (and has other troubles as well).

Without the support for handling the leading Re: (in various forms), this would be fine for digests and initial message submissions only. Additionally, if someone sent a message to the list with:

Subject: [bullcrap] Re: [mailinglistname] original subject

Then the mailing list is unlikely to re-add its tag (since the text is already present on the line), but you'll happily file the message away in "bullcrap".
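
For illustration only, here is a rough, untested sketch of how the capture can be stopped short of the end of the line: MATCH only extends as far as the expression to the right of \/ can match, so a negated character class ends it at the closing bracket. LISTTAG is a hypothetical variable name, and this still has the Re: and spoofed-tag problems described above.

```
# Sketch only: capture the text inside the first [...] on the Subject line.
# LISTTAG is a made-up name; nothing here handles Re: prefixes safely.
:0
* ^Subject:.*\[\/[^]]+
{
        LISTTAG=$MATCH
}
```

With "Subject: Re: [somelistname] subject text", the intent is that LISTTAG ends up as "somelistname", without the closing bracket.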

 $LASTFOLDER=$MATCH

You shouldn't assign it to LASTFOLDER - procmail will set this variable itself when you actually STORE the message.

Try this:

:0
* ^Mailing-List: list \/[-_a-z0-9]+@yahoogroups\.com
{
        :0:
        * MATCH ?? ()\/[^@]+
        $MATCH
}


The pertinent log from my sandbox (see my .sig, you should try developing your filters in a sandbox):

procmail: Assigning "MATCH="
procmail: Matched "mailinglist@yahoogroups.com"
procmail: Matched "mailinglist"
procmail: Match on "()\/[^@]+"
procmail: Locking "mailinglist.lock"
procmail: Assigning "LASTFOLDER=mailinglist"
procmail: Opening "mailinglist"
procmail: Acquiring kernel-lock
procmail: [25150] Tue May 27 14:14:04 2003
procmail: Unlocking "mailinglist.lock"
From sentto-coded-myaddress@returns.groups.yahoo.com  Wed May  7 04:34:38 2003
 Subject: [mailinglist] Digest Number 51
  Folder: mailinglist                                               11679
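
A sandbox can be as small as a separate rcfile with its own mail directory and verbose logging. The following is a minimal sketch, not my actual setup: the file name sandbox.rc, the directory, and the included recipe file are all placeholders you would choose yourself.

```
# sandbox.rc - hypothetical test rcfile; every path here is an assumption.
# The sandbox directory must already exist.
MAILDIR=$HOME/procmail-sandbox
DEFAULT=$MAILDIR/default
LOGFILE=$MAILDIR/sandbox.log
VERBOSE=on

# Pull in the recipes being tested (placeholder file name).
INCLUDERC=$MAILDIR/recipes-under-test.rc

# Feed it a saved message from a shell, for example:
#   procmail $HOME/procmail-sandbox/sandbox.rc < sample-message.txt
```

Nothing is written to your real mailboxes this way, and the verbose log ends up in sandbox.log, where you can read the match details at leisure.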


Note that generic list identification rules have been posted to this list in the past - you can identify the listname component on a great number of lists without having to resort to the subject line. The one I've been using is:


:0
* 9876543210^0 ^(Sender:[ ]*owner-|X-BeenThere:[ ]*|Delivered-To:[ ]*mailing list )\/[-A-Za-z0-9_+]+
* 9876543210^0 ^(List-Post:[ ]*(<mailto:)?|List-Owner:[ ]*(<mailto:)?owner-)\/[-A-Z0-9_+]+
* 9876543210^0 ^Sender:.* List"? <(mailto:)?\/[-A-Z0-9_+]+
{
        LISTNAME=$MATCH
}

:0E
* ^Sender:[     ]*\/[-A-Z0-9_+]+-owner
{
        LISTNAME=`echo $MATCH | sed -e s/-owner//i`
}

# Optional - if the listname isn't blank, emit it to the logfile.
:0
* ! LISTNAME ?? ^^^^
{
        LOG="List: $LISTNAME$NL"
}


The above works suitably on yahoo lists as-is.
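
Once LISTNAME is set, filing by list name could look something like the sketch below. It assumes it is placed after the recipes above and that you want one folder per list, named exactly as the list identifies itself.

```
# Sketch: deliver to a folder named after the detected list.
# Only fires when LISTNAME is non-empty; the trailing colon requests
# a local lockfile.
:0:
* ! LISTNAME ?? ^^^^
$LISTNAME
```

If the rcfile handles many messages through INCLUDERC chains, it may be worth clearing LISTNAME before the detection recipes so a stale value can't leak in.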

The one thing to keep in mind is that offlist replies from members (which retain the subject tag) won't be matched. Then again, if you're relying upon the (To|Cc): check to hold true, purely offlist replies won't match that either.

I typically filter out messages which are addressed to me AND to a known list to which I subscribe, right after storing the messages which are delivered to me THROUGH the list. I have my email client set up to highlight messages which contain my address as a cleartext recipient, so followups which include my address are flagged independent of me having to contend with two copies of the message. (If I'm already participating in a thread, I don't need to be separately copied on it. Doing so often invites other people to reply in a like manner, which results in the continuing thread being flagged for my attention. That is a bother, and it is just one of the many reasons I discourage cc's on lists.)
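
As a rough sketch of that second step (not my literal recipe): assuming it sits after the recipe that stores list-delivered mail, something like the following would catch the extra direct copies. The address me@example.com and the folder name are placeholders; ^TO_ is procmail's built-in macro for matching an address in the destination headers.

```
# Sketch: direct copies of list threads, i.e. mail addressed both to me
# and to a yahoogroups list that did not arrive via the list itself.
# me@example.com and "directcopies" are placeholders.
:0:
* ^TO_me@example\.com
* ^TO_[-_a-z0-9]+@yahoogroups\.com
directcopies
```

Whether you file these copies or send them to /dev/null is a matter of taste.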

---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.


_______________________________________________
procmail mailing list
procmail@lists.RWTH-Aachen.DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail