Re: Spammish?

At 16:05 2003-02-15 -0500, fleet(_at_)teachout(_dot_)org wrote:

The only "body" recipe I have is for Nigerian Scam messages.  It's also
the only scoring recipe I have - and it works well; although I've had to
tweak it twice in the past month.

I personally save body scanning recipes until the end of my spam checks,
since they're potentially so much more costly. Yes, I have to tweak my
Nigerian spam recipe every so often as well: different "official" offices,
countries, and other keywords.
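
A rough sketch of what such a late-running body scoring recipe might look
like. The keywords, the weights, and the "nigerian" label are made up for
illustration and are not taken from either of our actual rcfiles:

```procmail
# Hypothetical example: place this near the END of the spam checks,
# since the B flag makes procmail scan the whole message body.
# Starting at -99, at least two of the 50-point tests must match
# before the total goes positive and the block runs.
:0 B
* -99^0
* 50^0 nigeria
* 50^0 (barrister|minister|director)
* 50^0 (million|transfer).*(dollars|usd)
{
        XSPAMSCORE="${XSPAMSCORE} nigerian 50;"
}
```

Tweaking it for new "offices" and countries then means editing the
alternations rather than adding whole new recipes.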

> Messages with cleartext addressees that don't include you,

This is the only recipe I have with a "white" list.  (If not addressed to
somebody at one of the domains I administer it's spam - this allows quite
a bit of spam to flow through.

The trick is that regular mailing lists, or legit bcc's to you, will also
trip on such a rule. Which is why it should be no more than a spammy
indicator - only in conjunction with other indicators would such a message
be determined to be spam.
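
As a sketch of treating this as an indicator rather than a verdict,
something along these lines could work. The domain example.org, the label,
and the number are placeholders, and XSPAMSCORE is the variable discussed
further down in this message:

```procmail
# Hypothetical: example.org stands in for the domains you administer.
# A miss is only recorded as an indicator; nothing is delivered here,
# so list mail and legit bcc's are not thrown away on this test alone.
:0
* ! ^TO_[-a-z0-9_.]+@example\.org
{
        XSPAMSCORE="${XSPAMSCORE} notaddressed 10;"
}
```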

I'm looking for "webmaster@", "info@", etc. which *could* be valid mail; and

there are some good recipes which look for an abundance of official-type
addresses (multiple-domain spams).

My lists recipes are pretty solid.

Good. I categorize lists as "clean" and "dirty". The former allow only
posts from legit users (or are outbound-only lists), and can be filtered
before the spam filters. The latter are lists which have spam posts, and
are filtered after the spam filters.

> Messages not from: your domain, but with the messageid containing your
> domain.

One of my more "active" filters.

Keep in mind that there's a lot of legitimate email being sent around by
braindead mailers that arrives without a Message-ID. Alone, this isn't a
good determinant for spam - but it is a good spammy flag.
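
A minimal sketch of that flag, assuming your MTA adds a Message-ID in your
own domain when a message arrives without one. The domain example.org, the
label, and the number are placeholders:

```procmail
# Hypothetical: example.org stands in for your own domain.
# A Message-ID in your domain on mail that claims a foreign sender can
# mean the header was missing and got added locally on arrival.
:0
* ! ^From:.*@example\.org
* ^Message-ID:.*@example\.org
{
        XSPAMSCORE="${XSPAMSCORE} localmsgid 15;"
}
```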

Don't think I've ever seen a message with less than three Received:
headers.  (Again, I admit to limited experience.)

If the spammer has a client that is connecting directly to your mail
server and injecting the message there, you can end up with a single
Received: header.

However, a locally originated message can also have a single Received:
header, which is why this is only used as a spammish test.

IME, anything non-local with fewer than three Received: headers has a
degree of spamminess.
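
Counting headers can be done with procmail's scoring. The following is
only a sketch: the label and the number are invented, and it does not
exclude locally originated mail, which would need its own test ahead of
this one:

```procmail
# The score starts at 3 and drops by 1 for every Received: header
# found, so the total stays positive (and the block runs) only when
# there are fewer than three of them.
:0
* 3^0
* -1^1 ^Received:
{
        XSPAMSCORE="${XSPAMSCORE} fewreceived 10;"
}
```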

> Messages passing through your backup mx (when you know that it is virtually
> NEVER involved with email).

I don't even know what a "backup MX" is.  As far as that goes, I'm not
sure I have any idea what a primary MX is.

This is significant to those who administer their own mailservers.
Basically, a good mail config doesn't rely on your mail server being up
24/7. Ideally it would be, but Bubba's Backhoe Service and other
interruptions of network service can mess with your server's availability,
so a properly configured domain will have one or more _backup_ mailservers
to accept mail whenever the primary can't. The DNS record which represents
incoming mail servers is the "MX" record - Mail eXchanger.

Some spammers choose to deliberately deliver their email to the backup mail
server instead of the primary mail server, just to circumvent DNSBL and
other access limits often employed on a primary mail server, but which are
usually omitted from a secondary (which may not be administered by the same
domain, and may be a backup for multiple domains). You wouldn't want
someone else making decisions about your mail filtering, I assume?
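
If your backup MX stamps its own Received: header, a test for it might be
sketched like this. The hostname mx2.example.org, the label, and the
number are placeholders, and the exact wording of the header depends on
the software the backup runs:

```procmail
# Hypothetical: mx2.example.org stands in for your backup MX host.
# Only worth flagging if the backup is virtually never involved in
# legitimate delivery to you.
:0
* ^Received:.*by mx2\.example\.org
{
        XSPAMSCORE="${XSPAMSCORE} backupmx 10;"
}
```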

Ok.  Currently I use LOG to record the "rule" that identifies a message as
spam.  Looks something like:

But LOG emits it to the log - the rest of the recipe doesn't have real-time
access to that information.

> :0
> * some spammish test
> {
>          XSPAMSCORE="${XSPAMSCORE} spamtest 23;"
> }

Wouldn't the above need to contain the c flag in order to allow the
message to continue through the filters?

No, as it doesn't _deliver_. Thus, it just sets that variable and the
procmailrc continues along to the next recipe. Which has access to that
variable. If you look at the assignment, you'll note that it tacks the new
result onto the end of the previous contents of the same variable.

If I just passed the message down (using :0 c) and continued using the
logging, I would obtain much the same result I think.  It would tell me
which recipes were most "active."

Same idea, but as above, setting a variable doesn't count as delivery, so
'c' is unnecessary. In fact 'c' is undesirable, as you'd end up with TWO
copies - and if you had many recipes in a row, each one would be doubling
the message count!

But how do I tie this to the message so I can file the spam when it has
completed a trip through the filters (without using formail)?


The VARIABLE.  It remains set as you progress through the procmailrc.
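
For instance, a final recipe after the last spam test could look at the
variable and file the message. This is a sketch only: the folder name
"spam" is an assumption, and it files on ANY recorded indicator, so if you
want several indicators to agree first you would need a stricter condition
on the variable's contents:

```procmail
# Placed after the last spam test. Runs only if at least one test
# appended something to XSPAMSCORE.
:0
* XSPAMSCORE ?? [a-z]
{
        # Record which tests fired, then deliver to the spam folder.
        LOG="spam indicators:${XSPAMSCORE}
"

        :0:
        spam
}
```

No formail is needed for this, since nothing is written into the message
itself.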

---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.


_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail