:0
* 1^1 ^Received:
{ countRCVD = $= }
Thanks! The existence of that variable wasn't very obvious in the man
pages; it's in procmailsc(5) but only in BUGS in procmailrc(5). :)
You're welcome.
* COUNTReceived ?? ^1$
Perhaps you meant the more canonical
* COUNTReceived ?? ^^1^^
Is one format superior to the other? I don't see the difference,
unless perhaps the ^^ form gets parsed faster. Is it significant,
or just an equivalent alternative?
The ^ and $ imply line start or end (they are interchangeable
in procmail, but by convention we write ^ at the start and $ at
the end). Strictly speaking, each matches a literal newline char.
"^^" anchors to the very start or very end of the text being
examined, so for a one-line value the two forms behave the same.
If I've misstated something, I look forward to correction.
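To make the difference visible, here is a minimal sketch using a
two-line value. The variable names SAMPLE, LINE_MATCH and WHOLE_MATCH
are made up for illustration and do not come from the recipes above.

```procmail
# SAMPLE is a hypothetical two-line value: "1", a newline, then "7".
SAMPLE="1
7"

# Matches: one line of SAMPLE is exactly "1".
:0
* SAMPLE ?? ^1$
{ LINE_MATCH = yes }

# Does not match: SAMPLE taken as a whole is not exactly "1".
:0
* SAMPLE ?? ^^1^^
{ WHOLE_MATCH = yes }
```

For a single-line value such as a header count, both conditions give
the same answer; the ^^ form simply says "the whole thing" explicitly.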
:0 # if it's local mail (including via our mailhost), deliver it
* $ $INFINITY^0 ^Received:.*\<myispname\.com \[566\.684\.
* $ 2^0 ^Message-ID:[$WS]*<[^$WS]+@localhost>$
* -1^2 ^Received:
$DEFAULT
This counts the Received: headers at the same time that it's
conducting the reasonably secure test of a valid Received: line.
If there are too many, it won't consider the mail local.
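As a worked illustration of how those weights add up: per
procmailsc(5), a condition weighted w^x adds w for its first match,
w*x for the second, w*x*x for the third, and so on, and the action
is taken only if the final score is positive. The sketch below is a
simplified copy of the recipe (ISP condition left out, Message-ID
regexp loosened); the header counts in the comments are hypothetical.

```procmail
# Assumed: the Message-ID test matches (+2) and the ISP test does not.
#
#   Received: headers   -1^2 contributes      final score
#   1                   -1                    2 - 1 =  1  (positive: delivered)
#   2                   -1 - 2     = -3       2 - 3 = -1  (not delivered here)
#   3                   -1 - 2 - 4 = -7       2 - 7 = -5  (not delivered here)
:0
* 2^0 ^Message-ID:.*@localhost>
* -1^2 ^Received:
$DEFAULT
```

With 1^1 in place of -1^2, each match adds exactly 1, so the score
is the plain count of Received: headers.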
But if you need to do that count more than once, isn't it faster
to use a result stored in a variable? So for sendmail, maybe
something like:
MYDOMAIN=| hostname | sed "s/`hostname -s`\.//"
I just have a real aversion to piping to two processes on every mail,
for something we can reasonably expect to get within procmail. Surely
the host name, or [127.0.0.1], or "localhost" is stated in the
top Received: header? Even if you do want to run hostname, you
could use MATCH to kill the TLD stuff and avoid sed.
Besides, that syntax for var assignment b0rked procmail 3.22/3.23pre
on an Alpha system I run procmail on. (Known occasional bug.)
If you really want this variable every time, how about feeding it
to procmail via an INCLUDERC?
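Roughly, the two alternatives could look like the sketch below. It
assumes procmail's built-in HOST variable holds a fully qualified
name such as "mail.example.com", which is not guaranteed on every
system; the rcfile name is hypothetical.

```procmail
# Option 1: strip the leading host label with MATCH, no external processes.
# If HOST contains no dot, the condition fails and MYDOMAIN stays unset.
:0
* HOST ?? ^^[^.]+\.\/.+
{ MYDOMAIN = $MATCH }

# Option 2: set the value once in a small rcfile and include it.
# That file would hold a single line such as
#   MYDOMAIN=example.com
INCLUDERC=$HOME/.procmail/mydomain.rc
```

Either way, nothing is forked per message; the \/ token just tells
procmail to put whatever matches to its right into MATCH.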
As to your question about counting Received: headers, we have
done that already above. We just assign the value of the score
to a variable. We might have to play with the choice of the
scoring exponent. I chose "2" because I figured it makes
visceral sense that the more Received: headers there are,
the further away from "clean" we move exponentially. But
if you're going to use the count, well, just write it "1^1"
instead.
Okay. :) Here's my "ATCOUNT" thingee:
Cool! Thanks! But...
:0 # add the subtotals, subtract 4 "gimmes"
* $ $=^0
* -4^0
{ TOO_MANY = $ATCOUNT }
Is the TOO_MANY variable actually useful for anything? Aren't
cases where there are more than two CC recipients *really* common?
Yes, but I use the variable in concert with other tests to
decide if it's spammy. Am I on the To: line? Am I on the
Cc: line(s)? Is the Subject: empty? Is the Message-ID:
putatively valid? (Lots of legit mail has Message-IDs that
violate RFCs, including Microsoft Exchange's format, I believe.
So I don't kill based only on that, but combine it with others
of what I call "indicia" (word of art taken from Supreme Court dicta
discussing the 13th Amendment).) Are there any spaces in the
From: line's text? Etc. These all, taken in various combinations,
comprise what I call a calculus of spammy stuff. I look for
a spammy calculus. I'll admit, though, that a forged Hotmail
or Yahoo address is a dead giveaway. :)
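One way such a combination might be written with scoring is sketched
below. The weights, the address user@example.com and the folder name
spam-suspects are all hypothetical, and the tests are only a subset
of the ones listed; this is not my actual rcfile.

```procmail
# Start at -1 and add 1 per indicium, so the recipe fires only when
# at least two of the four tests hit (final score positive).
:0:
* -1^0
* 1^0 ! ^(To|Cc):.*user@example\.com
* 1^0 ^Subject: *$
* 1^0 ! ^Message-ID: *<[^ ]+@[^ ]+>
* 1^0 TOO_MANY ?? .
spam-suspects
```

No single test decides anything here; TOO_MANY counts only when an
earlier recipe actually set it, that is, when there were more than
four at-signs.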
--
Dallman Ross
"If you find a path with no obstacles, it probably does not lead to
anywhere."
Thoughts of Rev. Sunnan Kubose, from _Zen in the Markets_
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail