procmail

Re: Counting recipients?

1999-03-01 09:46:06
On Mon, 01 Mar 1999 15:38:09 +0200, Yossi Gil <yogi(_at_)cs(_dot_)technion(_dot_)ac(_dot_)il> wrote:
I would like to sort my mail based on the number of recipients. The more
recipients on it, the lower its priority. Any tips on how I might go
about doing this?

This is tricky to do right in the general case, but if you can
tolerate a few plus/minus one errors, here's something to get you
started. 

    :0
    * 1^1 ^TO_
    * 1^1 ^TO_[^,]+,\/.*
    {
        :0
        * $ $=^0
        * 1^1 MATCH ?? ,
        { }

        :0
        * $ $=^0
        your-action-here
    }

The basic idea is to score one for each ^TO_ line, and then one more
for each comma in those lines. This sucks in practice because (1)
there is no construct for looping over all ^TO_ lines, so I cheat and
assume that if there are ^TO_ lines with many recipients in them,
there won't be many ^TO_ lines -- the inner condition counts commas
only in the +last+ ^TO_ line with a comma in it. And (2) there is
nothing to protect you from commas in comments.
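
To make the arithmetic concrete, here is a rough Python model of what the recipe above scores. It is only a sketch under assumptions: TO_HEADER is a simplified stand-in for the real ^TO_ macro (which covers more header names), header lines are assumed to be already unfolded, and estimate_recipients is a name made up for illustration.

```python
import re

# Simplified stand-in for procmail's ^TO_ macro (assumption: the real
# macro recognises more header names than these).
TO_HEADER = re.compile(r"^(?:Resent-)?(?:To|Cc|Bcc):", re.IGNORECASE)


def estimate_recipients(header_lines):
    """Approximate the score the recipe assigns to a list of header lines."""
    score = 0
    match = ""  # plays the role of procmail's MATCH variable
    for line in header_lines:
        if not TO_HEADER.match(line):
            continue
        score += 1  # * 1^1 ^TO_
        _before, comma, rest = line.partition(",")
        if comma:
            score += 1  # * 1^1 ^TO_[^,]+,\/.*
            match = rest  # only the last such line is remembered
    score += match.count(",")  # * 1^1 MATCH ?? ,
    return score
```

With two comma-laden lines, say a To: and a Cc: carrying three addresses each, this model gives five rather than six, which is the undercount described in (1).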

Consider this:

 From: era(_at_)iki(_dot_)fi
 To: abuse(_at_)foo(_dot_)org (close down your spam-infested relay, please)
 Cc: abuse(_at_)crl(_dot_)com (yes, I want another bounce from your broken ignorebot, thx)
 Subject: Spam: [g0rb13(_at_)hotmail(_dot_)com: $50M PUR WEAK, PROMMISE!!11!!!!!]

This would get a score of five, because of the commas in the
parentheses, when in reality there are only two recipients. You can
try to come up with a regular expression to skip over comments (the
^TO_ regular expression might be worth studying) and so forth, but it
really is theoretically impossible to do this in Procmail alone:
RFC822 comments can nest, and a regular expression cannot keep track
of nesting depth. (If you know the Chomsky hierarchy of language
types, you know this one. If not, you probably don't want to know.)
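
The missing ingredient is some way of tracking how deep inside parentheses you are. A minimal Python sketch of that bookkeeping follows. strip_comments is a hypothetical helper, not anything Procmail provides, and it deliberately ignores quoted strings (where parentheses and commas are literal), so it is still far from a correct parser.

```python
def strip_comments(value):
    """Drop parenthesised comments, which may nest, from a header value."""
    out = []
    depth = 0        # current comment nesting level
    escaped = False  # True right after a backslash
    for ch in value:
        if escaped:
            # A backslash-escaped character never opens or closes a comment.
            if depth == 0:
                out.append(ch)
            escaped = False
        elif ch == "\\":
            if depth == 0:
                out.append(ch)
            escaped = True
        elif ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
    return "".join(out)
```

Counting commas in the stripped value instead of the raw one removes the error from commas in comments, but quoted display names with commas in them would still be miscounted.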

If you really want to do this right, spend the rest of your days
writing a correct RFC822 parser and then use that to count the number
of recipients. (I think Dan Bernstein has a publicly available
parser, and there's something RFC822-related on CPAN which I haven't
checked out but which might be useful for this.)
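
As one illustration of leaning on an existing parser instead of writing your own, here is a sketch using the email package from Python's standard library. This is a swapped-in tool, not one of the parsers mentioned above. The function name, the choice of headers to count, and the sample addresses (example.org placeholders) are all assumptions made for the example.

```python
from email import message_from_string, policy

# Headers treated as carrying recipients (an assumption; extend as needed).
ADDRESS_HEADERS = ("To", "Cc", "Bcc")

SAMPLE = (
    "From: era@example.org\n"
    "To: abuse@example.org (close down your spam-infested relay, please)\n"
    "Cc: abuse@example.com (yes, I want another bounce, thx)\n"
    "Subject: Spam\n"
    "\n"
    "body\n"
)


def count_recipients(raw_message):
    """Count the addresses in the recipient headers of a raw message."""
    # policy.default makes address headers expose parsed .addresses
    msg = message_from_string(raw_message, policy=policy.default)
    total = 0
    for name in ADDRESS_HEADERS:
        for header in msg.get_all(name, []):
            total += len(header.addresses)
    return total
```

Called on SAMPLE, this should report two recipients, since the commas inside the comments are not treated as separators. A script like this could be run from a Procmail recipe, though how to wire that up is left aside here.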

Hope this helps,

/* era */

Dan B's stuff is at <ftp://koobera.math.uic.edu/www/proto/immhf.html>.
For more on this, see <http://www.iki.fi/~era/procmail/links.html#links>

begin:vcard 
n:Gil;Joseph (Yossi)

Please turn off this annoying misfeature. Thank you.

-- 
.obBotBait: It shouldn't even matter whether    <http://www.iki.fi/~era/>
I am a resident of the state of Washington. <http://members.xoom.com/procmail/>
