procmail

Re: Spammish?

2003-02-15 20:01:29
At 16:05 2003-02-15 -0500, fleet(_at_)teachout(_dot_)org wrote:
The only "body" recipe I have is for Nigerian Scam messages.  It's also
the only scoring recipe I have - and it works well; although I've had to
tweak it twice in the past month.

I personally save body-scanning recipes until the end of my spam checks, since they're potentially so much more costly. Yes, I have to tweak my Nigerian spam recipe every so often as well: different "official" offices, countries, and other keywords.

> Messages with cleartext addressees that don't include you,

This is the only recipe I have with a "white" list.  (If not addressed to
somebody at one of the domains I administer it's spam - this allows quite
a bit of spam to flow through.)

The trick is that regular mailing lists, or legit BCCs to you, will also trip on such a rule. That is why it should be no more than a spammy indicator - only in conjunction with other indicators should such a message be determined to be spam.
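
A minimal sketch of treating this as an indicator rather than a verdict. The domains example.org and example.net are placeholders for your own, the label and weight are arbitrary, and XSPAMSCORE is the accumulator variable from the recipe quoted further down:

```
# Not visibly addressed to any of my domains: note it, don't file it.
:0
* ! ^TO_.*@(example\.org|example\.net)
{
    XSPAMSCORE="${XSPAMSCORE} notaddressed 5;"
}
```

Because nothing is delivered here, list mail and BCCs that trip the test still continue on to the later recipes.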

I'm looking for "webmaster@", "info@", etc. which *could* be valid mail; and

There are some good recipes which look for an abundance of official-type addresses (multiple-domain spams).

My lists recipes are pretty solid.

Good. I categorize lists as "clean" and "dirty". Clean lists allow only posts from legit users (or are outbound-only lists), and can be filtered before the spam filters. Dirty lists, which do get spam posts, are filtered after the spam filters.
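
Roughly, the ordering in the procmailrc would look like the sketch below. The list IDs, folder names, and the spam.rc filename are all made up for illustration, and it assumes spam.rc itself files whatever it decides is spam:

```
# Clean list: only members can post, so file it before any spam checks.
:0:
* ^List-Id:.*<announce\.example\.org>
lists/announce

# Spam checks run here (hypothetical include file).
INCLUDERC=$HOME/.procmail/spam.rc

# Dirty list: open to outside posters, so it is filed only after the checks.
:0:
* ^List-Id:.*<discuss\.example\.org>
lists/discuss
```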

> Messages not from: your domain, but with the messageid containing your
> domain.

One of my more "active" filters.

Keep in mind that there's a lot of legitimate email being sent around by braindead mailers that arrives without a Message-ID, and the receiving mail server will often add one of its own, which then carries your domain. Alone, this isn't a good determinant for spam - but it is a good spammy flag.
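
The quoted test might be written something like this. It is only a sketch: example.org stands in for your domain, and the label and weight are arbitrary:

```
# Message-ID claims my domain, but the From: address does not.
:0
* ^Message-ID:.*@(.*\.)?example\.org>
* ! ^From:.*@(.*\.)?example\.org
{
    XSPAMSCORE="${XSPAMSCORE} msgid-mydomain 5;"
}
```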

Don't think I've ever seen a message with less than three Received:
headers.  (Again, I admit to limited experience.)

If the spammer has a client that is connecting directly to your mail server and injecting the message there, you can end up with a single Received: header.

However, a locally originated message can also have a single Received: header, which is why this is only used as a spammish test.

IME, anything non-local with fewer than three Received: headers has a degree of spamminess.
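
One way to count the headers is with procmail's scoring, sketched here. It starts at 3 and subtracts 1 for every Received: header, so the score is only still positive (and the recipe only matches) when there are fewer than three. The label and weight are arbitrary, and this sketch makes no attempt to exclude locally originated mail:

```
:0
* 3^0
* -1^1 ^Received:
{
    XSPAMSCORE="${XSPAMSCORE} fewreceived 5;"
}
```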

> Messages passing through your backup mx (when you know that it is virtually
> NEVER involved with email).

I don't even know what a "backup MX" is.  As far as that goes, I'm not
sure I have any idea what a primary MX is.

This is significant to those who administer their own mail servers. Basically, a good mail config doesn't rely on your mail server being up 24/7. Ideally it would be, but Bubba's Backhoe Service and other interruptions of network service can mess with your server's availability, so a properly configured domain will have one or more _backup_ mail servers to accept mail whenever the primary can't. The DNS record which identifies incoming mail servers is the "MX" record - Mail eXchanger. The MX with the lowest preference value is the primary; the others are backups.

Some spammers deliberately deliver their email to the backup mail server instead of the primary, just to circumvent DNSBL and other access limits that are often employed on a primary mail server but usually omitted from a secondary (which may not be administered by the same domain, and may be a backup for multiple domains). You wouldn't want someone else making decisions about your mail filtering, I assume?
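
If you know the name your backup MX stamps into its Received: header, a test could look like this sketch. The hostname mx2.example.net is hypothetical, and the exact header layout depends on the backup's mail software:

```
# Message was relayed through the (rarely used) backup MX.
:0
* ^Received:.*by mx2\.example\.net
{
    XSPAMSCORE="${XSPAMSCORE} backupmx 10;"
}
```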

Ok.  Currently I use LOG to record the "rule" that identifies a message as
spam.  Looks something like:

But LOG only emits it to the log file - the rest of the procmailrc doesn't have realtime access to that information.

> :0
> * some spammish test
> {
>          XSPAMSCORE="${XSPAMSCORE} spamtest 23;"
> }

Wouldn't the above need to contain the c flag in order to allow the
message to continue through the filters?

No, as it doesn't _deliver_. It just sets that variable, and the procmailrc continues along to the next recipe, which has access to that variable. If you look at the assignment, you'll note that it tacks the new result onto the end of the previous contents of the same variable.

If I just passed the message down (using :0 c) and continued using the
logging, I would obtain much the same result I think.  It would tell me
which recipes were most "active."

Same idea, but as above, setting a variable doesn't count as delivery, so 'c' is unnecessary. In fact 'c' is undesirable, as you'd end up with TWO copies - and if you had many recipes in a row, each one would be doubling the message count!

But how do I tie this to the message so I can file the spam when it has
completed a trip through the filters (without using formail)?

The VARIABLE.  It remains set as you progress through the procmailrc.
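
As a rough sketch, a final recipe at the end of the spam checks could act on it like this. The "spam" folder name is an assumption, and "two or more indicators" is just one possible threshold. Since every entry appended to XSPAMSCORE ends in a semicolon, the condition only matches when at least two tests fired:

```
:0
* XSPAMSCORE ?? ;.*;
{
    # Record which tests fired, then file the message.
    LOG="spam indicators:${XSPAMSCORE}
"

    :0:
    spam
}
```

Actually adding up the numeric weights would take more work than this. The sketch only counts how many indicators were recorded.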

---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.


_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
