Re: 1. Useless cat. 2. Counting spams

At 01:59 2002-05-17 +0100, Alan Clifford wrote:

The concept I'm toying with is to let the spammers automatically censor
themselves.  And doing it with IP numbers seems an interesting route for
me to follow up.

If you have access to the named config on your host (or any host), it is quite easy to create your own private (or public) DNSBL. There are procmail filters that check DNSBLs, and of course, you could just configure your MTA (if you're the admin on your mailhost) to utilize your private DNSBL.

I produced one using data available from APNIC, which handily allows me to refuse SMTP connections from all hosts within CN, TW, KR, IN, and ID, regardless of what their claimed hostname is.
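
As a rough illustration of the kind of preprocessing involved, here is a minimal Python sketch that turns allocation records into CIDR blocks for the listed countries. It assumes the input is in the pipe-delimited RIR "delegated" statistics layout (registry|cc|type|start|count|date|status); which APNIC data set was actually used, and how it was fed into named, is not stated above. The function name blocked_networks and the sample records in the usage note are hypothetical.

```python
import ipaddress

# Country codes taken from the paragraph above.
BLOCKED = {"CN", "TW", "KR", "IN", "ID"}

def blocked_networks(lines, countries=BLOCKED):
    """Yield IPv4Network objects for IPv4 records in the given countries.

    Assumes records of the form: registry|cc|type|start|count|date|status
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        fields = line.split("|")
        # Header and summary lines fail one of these checks and are skipped.
        if len(fields) < 7 or fields[2] != "ipv4" or fields[1] not in countries:
            continue
        first = ipaddress.IPv4Address(fields[3])
        last = first + int(fields[4]) - 1
        # The count need not be a power of two, so one record
        # can expand into several CIDR blocks.
        yield from ipaddress.summarize_address_range(first, last)
```

Feeding it a record such as "apnic|CN|ipv4|192.0.2.0|256|20020101|allocated" (a made-up record using a documentation address range) yields the single network 192.0.2.0/24, which could then be written out in whatever zone format your DNSBL server expects.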


This approach has several inherent beauties:

        If you reject the message at the SMTP transaction, you're not
        accepting the traffic associated with what could be a large message.

        By rejecting the message at the SMTP transaction, you're giving the
        sender a clear message that you're not accepting their message, and
        why -- without having to send an autoresponder message (which might
        possibly be sent to the wrong address).

        The DNSBL is by its very nature a distributed database - other mail
        hosts on your network can all make use of it.
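
For anyone unfamiliar with how a DNSBL is consulted: the client reverses the octets of the connecting IP, appends the list's zone, and does an ordinary A lookup; any answer means "listed". A minimal Python sketch of that convention follows. The zone dnsbl.example.org is a placeholder for your own private zone, and the function names are illustrative, not part of any procmail or MTA tooling.

```python
import ipaddress
import socket

def dnsbl_query_name(ip, zone="dnsbl.example.org"):
    """Build the DNS name to look up for an IPv4 address.

    The zone is a hypothetical placeholder for a private DNSBL.
    """
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone

def is_listed(ip, zone="dnsbl.example.org"):
    """Return True if the DNSBL zone has an A record for this address."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No answer (or a resolver failure) is treated as "not listed".
        return False
```

So a connection from 192.0.2.99 results in a lookup of 99.2.0.192.dnsbl.example.org. Because the check is just a DNS query, every mail host that can reach the nameserver can share the same list.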

I have a regular source of spam - all email to the address I use in usenet
is spam so these can provide the data to censor themselves when they send
emails to my real addresses (that have leaked out unfortunately)

Sounds like you're talking about "peppering". Postal mailing list companies would "pepper" their database with bogus addresses which would really deliver to the mailing list company, so they could find out when someone was illegally re-using a database. By applying the same concept to email, you're saying if you receive email at the pepper address, whatever characteristics you can glean from that message - be it the sender, the subject, or the origin host, you'd add it to a db your regular mail account(s) would use to reject mail.

Of course, you need to know what to extract from the peppered message. Blocking an ISP outright because someone forges them in the From: would be bad.

Over the past few days, I have been collecting the addresses and domains
in a database.  As I suspected, there doesn't seem to be much reuse of
email addresses but the domains are more interesting:

So the idea is that orgio.net, with 9 spams, last one on 16th May, puts
itself into the blacklist and stays there until the last spam is, say,
over a month old.
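
The listing rule described here (a domain stays listed until its most recent spam is more than about a month old) could be sketched in Python roughly as follows. The in-memory dict, the function names, and the 31-day window are assumptions made for illustration; the actual database and cutoff are not specified above.

```python
from datetime import date, timedelta

def record_spam(db, domain, seen):
    """Count one more spam for a domain and remember the latest date seen."""
    count, last = db.get(domain, (0, seen))
    db[domain] = (count + 1, max(last, seen))

def is_blacklisted(db, domain, today, max_age=timedelta(days=31)):
    """A domain is listed while its last spam is no older than max_age."""
    entry = db.get(domain)
    if entry is None:
        return False
    _, last = entry
    return today - last <= max_age
```

With orgio.net last seen on 2002-05-16, the domain would stay listed through 2002-06-16 under this window and drop off the day after, unless another spam arrives and resets the clock.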

The problem is with automatically determining the validity of the spam source - it's _trivial_ to forge the envelope sender.

And maybe email addresses are recycled over time as well - I can just wait
and see.

I don't bother to block specific addresses (except as _actual_ user twits), but rather use patterns - friend@ for instance.

Extending this to IP numbers would be interesting.  I'd have to hack these
out of the received headers I guess.  How careful would I have to be?  Is
the earliest header always the one furthest into the email?

Unless it is forged, which isn't particularly uncommon with spam. Also, some mailing lists (which presumably your pepper address wouldn't be _validly_ on), discard the received headers up until the mailing list reception of the message (for a while, the procmail list was doing that, which was extremely annoying as a lot of spam would have been avoided otherwise), so the IP chain will be incomplete - if you ID something as spam and add the apparent origin to a db, you're potentially shooting yourself in the foot.
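
To make the header-order point concrete, here is a small Python sketch that pulls the bracketed IP out of each Received: header, in the order the headers appear (newest first, since each relay prepends its own). It assumes the common "from host (host [a.b.c.d])" layout, which not every MTA follows, and the function name received_ips is made up for this example.

```python
import re
from email import message_from_string

# Matches an IPv4 address in square brackets, e.g. "[192.0.2.10]".
IP_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")

def received_ips(raw_message):
    """Return one entry per Received: header, top of the message first.

    Each entry is the first bracketed IPv4 address found in that header,
    or None if the header has no such address.
    """
    msg = message_from_string(raw_message)
    ips = []
    for header in msg.get_all("Received", []):
        match = IP_IN_BRACKETS.search(header)
        ips.append(match.group(1) if match else None)
    return ips
```

Only the entry written by your own mail host can be trusted; everything further down the list was supplied by the sending side and may be forged or missing, so the last element is the apparent origin at best.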

Also, I'd be interested in comments on which "from" header to use.
Currently, I'm using the formail -t header but have just started another
database this evening using the FROM_ header.  There doesn't seem to be
much point in using the formail without_the_t as the spams seem mostly to
have a Reply-to: header in them.

Most spams I've seen in recent years don't even have a valid address - either the From: is totally bogus (or is a tossaway freemail account), and at best, the Reply-To: is some removal service (which you'd be a fool to actually _send_ anything to in any event), which is usually also on a freemail service and with any luck is closed out within hours of the spam incident.

I personally reject as forged any message from a handful of the big freemail services if the From: header and the Message-id don't correlate in an acceptable fashion. Sure, I toss a certain amount of legit email (people using the From: on regular email passed through their own ISP's mail server), but as yet, I haven't misplaced anything that I was concerned about.
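
One possible reading of "correlate in an acceptable fashion", sketched in Python: for a From: address at a known freemail domain, require that the domain part of the Message-ID be that domain or a subdomain of it. This rule is an assumption made for illustration, not the actual test used; the domain freemail.example and the function name looks_forged are hypothetical.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical freemail domain list; substitute the services you care about.
FREEMAIL = {"freemail.example"}

def looks_forged(raw_message, freemail=FREEMAIL):
    """Flag freemail From: addresses whose Message-ID domain doesn't match."""
    msg = message_from_string(raw_message)
    _, addr = parseaddr(msg.get("From", ""))
    from_domain = addr.rpartition("@")[2].lower()
    if from_domain not in freemail:
        return False  # the check only applies to the listed freemail domains
    msgid = msg.get("Message-ID", "").strip().strip("<>")
    id_domain = msgid.rpartition("@")[2].lower()
    matches = id_domain == from_domain or id_domain.endswith("." + from_domain)
    return not matches
```

As noted above, a rule like this also catches legitimate users who send with their freemail From: through their own ISP's mail server, so it is a trade-off rather than a clean test.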


---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail