Re: 1. Useless cat. 2. Counting spams

2002-05-17 02:03:25
At 01:59 2002-05-17 +0100, Alan Clifford wrote:

The concept I'm toying with is to let the spammers automatically censor
themselves.  And doing it with IP numbers seems an interesting route for
me to follow up.

If you have access to the named config on your host (or any host), it is quite easy to create your own private (or public) DNSBL. There are procmail filters that check DNSBLs, and of course, you could just configure your MTA (if you're the admin on your mailhost) to utilize your private DNSBL.
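For anyone who hasn't looked at how a DNSBL is queried, here is a minimal Python sketch of the usual convention: reverse the IPv4 octets, append the list's zone, and treat any A record as "listed". The zone name `dnsbl.example.org` is a placeholder, not a real list, and the helper names are mine.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the lookup name: IPv4 octets reversed, followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """True if the zone answers with an A record for this address.

    Assumption: the zone follows the common convention of answering
    with an address (often 127.0.0.x) for listed hosts and NXDOMAIN otherwise.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:  # lookup failed, so treat the host as not listed
        return False


# 192.0.2.1 is looked up as 1.2.0.192.dnsbl.example.org
print(dnsbl_query_name("192.0.2.1", "dnsbl.example.org"))
```

On the named side, the private list is then just an ordinary zone holding A records under those reversed names.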

I produced one using data available from APNIC, which handily allows me to refuse SMTP connections from all hosts within CN, TW, KR, IN, and ID, regardless of what their claimed hostname is.
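As a rough illustration of that kind of extraction (not the script I actually used), the sketch below assumes the pipe-delimited "delegated" statistics layout that the registries publish: registry, country code, type, start address, address count, date, status. The sample lines are invented and use documentation address ranges.

```python
import ipaddress


def networks_for_countries(lines, countries):
    """Return the IPv4 networks delegated to the given country codes."""
    wanted = set(countries)
    nets = []
    for line in lines:
        fields = line.strip().split("|")
        # Skip comments, the version line, summary lines and non-IPv4 records.
        if len(fields) < 7 or fields[2] != "ipv4" or fields[1] not in wanted:
            continue
        first = ipaddress.IPv4Address(fields[3])
        last = first + (int(fields[4]) - 1)  # the count field is a number of addresses
        nets.extend(ipaddress.summarize_address_range(first, last))
    return nets


# Invented sample records, for illustration only.
sample = [
    "# comment line",
    "apnic|CN|ipv4|192.0.2.0|256|20020101|allocated",
    "apnic|JP|ipv4|198.51.100.0|256|20020101|allocated",
    "apnic|KR|ipv4|203.0.113.0|128|20020101|assigned",
]
for net in networks_for_countries(sample, {"CN", "KR"}):
    print(net)
```

Each resulting network would then be turned into entries in the DNSBL zone, in whatever form your DNS server wants them.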

This approach has several inherent beauties:

        If you reject the message at the SMTP transaction, you're not
        accepting the traffic associated with what could be a large message.

        By rejecting the message at the SMTP transaction, you're giving the
        sender a clear message that you're not accepting their message, and
        why -- without having to send an autoresponder message (which might
        possibly be sent to the wrong address).

        The DNSBL is by its very nature a distributed database - other mail
        hosts on your network can all make use of it.

I have a regular source of spam - all email to the address I use in usenet
is spam so these can provide the data to censor themselves when they send
emails to my real addresses (that have leaked out unfortunately)

Sounds like you're talking about "peppering". Postal mailing list companies would "pepper" their database with bogus addresses which would really deliver to the mailing list company, so they could find out when someone was illegally re-using a database. By applying the same concept to email, you're saying if you receive email at the pepper address, whatever characteristics you can glean from that message - be it the sender, the subject, or the origin host, you'd add it to a db your regular mail account(s) would use to reject mail.

Of course, you need to know what to extract from the peppered message. Blocking an ISP outright because someone forges them in the From: would be bad.

Over the past few days, I have been collecting the addresses and domains
in a database.  As I suspected, there doesn't seem to be much reuse of
email addresses but the domains are more interesting:


So the idea is that orgio.net, with 9 spams, last one on 16th May, puts
itself into the blacklist and stays there until the last spam is, say,
over a month old.
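The bookkeeping for that kind of self-expiring list could look something like the Python sketch below. The class name, the in-memory dict, and the 30-day window are all assumptions made for illustration. A real setup would persist the data and, per the caveat that follows, would need a trustworthy source for the domain.

```python
from datetime import date, timedelta


class SelfExpiringBlacklist:
    """Domains stay listed until their most recent spam is older than max_age."""

    def __init__(self, max_age_days=30):
        self.max_age = timedelta(days=max_age_days)
        self.entries = {}  # domain -> (spam count, date of last spam)

    def record_spam(self, domain, seen_on):
        key = domain.lower()
        count, last = self.entries.get(key, (0, seen_on))
        self.entries[key] = (count + 1, max(last, seen_on))

    def is_listed(self, domain, today):
        entry = self.entries.get(domain.lower())
        return entry is not None and today - entry[1] <= self.max_age


bl = SelfExpiringBlacklist(max_age_days=30)
for day in range(8, 17):  # nine spams, the last one on 16 May
    bl.record_spam("orgio.net", date(2002, 5, day))
print(bl.is_listed("orgio.net", date(2002, 6, 1)))
print(bl.is_listed("orgio.net", date(2002, 7, 1)))
```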

The problem is with automatically determining the validity of the spam source - it's _trivial_ to forge the envelope sender.

And maybe email addresses are recycled over time as well - I can just wait
and see.

I don't bother to block specific addresses (except as _actual_ user twits), but rather use patterns - friend@ for instance.

Extending this to IP numbers would be interesting.  I'd have to hack these
out of the received headers I guess.  How careful would I have to be?  Is
the earliest header always the one furthest into the email?

Unless it is forged, which isn't particularly uncommon with spam. Also, some mailing lists (which presumably your pepper address wouldn't be _validly_ on) discard the Received headers up until the mailing list's reception of the message, so the IP chain will be incomplete. (For a while, the procmail list was doing that, which was extremely annoying, as a lot of spam would have been avoided otherwise.) If you ID something as spam and add the apparent origin to a db, you're potentially shooting yourself in the foot.
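To make the header-order point concrete: each MTA prepends its own Received: line, so the top-most one is the newest and the only one written by a host you control. The Python sketch below pulls bracketed IPv4 addresses out of each Received: header in that order. The regex is a simplification that assumes the common `[a.b.c.d]` form, and the function names are mine.

```python
import re
from email import message_from_string

# Simplified: matches an IPv4 literal in square brackets, as many MTAs record it.
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def received_ips(raw_message):
    """List of bracketed IPs per Received: header, newest header first."""
    msg = message_from_string(raw_message)
    return [BRACKETED_IPV4.findall(h) for h in msg.get_all("Received", [])]


def handoff_ip(raw_message):
    """IP in the top-most Received: header, i.e. the one your own MTA wrote."""
    hops = received_ips(raw_message)
    return hops[0][0] if hops and hops[0] else None
```

Anything below the header your own server added is only as trustworthy as the hosts that wrote it, which is why `handoff_ip` looks no further down the chain.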

Also, I'd be interested in comments on which "from" header to use.
Currently, I'm using the formail -t header but have just started another
database this evening using the FROM_ header.  There doesn't seem to be
much point in using the formail without_the_t as the spams seem mostly
to have a Reply-to: header in them.

Most spams I've seen in recent years don't even have a valid address: the From: is either totally bogus or a tossaway freemail account, and at best the Reply-To: is some removal service (which you'd be a fool to actually _send_ anything to in any event), which is usually also on a freemail service and with any luck is closed out within hours of the spam incident.

I personally reject as forged any message from a handful of the big freemail services if the From: header and the Message-id don't correlate in an acceptable fashion. Sure, I toss a certain amount of legit email (people using the From: on regular email passed through their own ISP's mail server), but as yet, I haven't misplaced anything that I was concerned about.
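One possible reading of "correlate" is sketched below in Python: for senders claiming a freemail domain, require the Message-ID's right-hand side to be that domain or a subdomain of it. This is an assumed rule for illustration, not my actual recipe, and `freemail.example` is a placeholder domain.

```python
from email.utils import parseaddr

FREEMAIL_DOMAINS = {"freemail.example"}  # hypothetical placeholder list


def domain_of(value):
    """Text after the last '@', lowercased, with angle brackets stripped."""
    return value.rpartition("@")[2].strip("<> \t").lower()


def looks_forged(from_header, message_id):
    """True if From: claims a freemail domain the Message-ID does not match."""
    from_domain = domain_of(parseaddr(from_header)[1])
    if from_domain not in FREEMAIL_DOMAINS:
        return False  # this check only applies to the listed freemail services
    id_domain = domain_of(message_id)
    matches = id_domain == from_domain or id_domain.endswith("." + from_domain)
    return not matches
```

As noted above, a check like this will also catch legitimate users who send with a freemail From: through their own ISP's mail server.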

---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.

