Re: [Asrg] Building a better blacklist

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Dan Oetting wrote:

Why are the performance figures for blacklists so low? I saw someone
post a figure of 80% blocking, while less than 50% of the spam I get in
my unfiltered accounts would have been blocked by SBL+XBL. Why can't
the blocking rate be in the high 90s?


Because, quite simply, a significant amount of spam comes from "mixed
sources", such as ISP mail servers, which also carry legitimate mail.
Unless you're willing to put up with a very high false positive (FP)
rate, DNSBLs are simply not suitable for _that_ segment of spam.

Effective spam fighting requires a hybrid of techniques.  DNSBLs are for
when the IP clearly shows that it's compromised or "malicious" in some
fashion, and accepting _anything_ from it is a bad idea, hence blocking
everything from it results in zero false positives (from that source).
Other filters are suitable when that's too blunt an instrument.
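
For readers unfamiliar with the mechanics, here is a minimal sketch of
an IPv4 DNSBL lookup: the octets of the connecting address are reversed,
prepended to the list's zone, and the resulting name is looked up in
DNS. Any answer means "listed". The zone name dnsbl.example and the
function names are placeholders for illustration, not a real list or
API.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name used to look up an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """Return True if the zone answers for this address.

    `resolve` is injectable so the logic can be exercised without
    touching the network; by default it performs a real DNS lookup.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
    except OSError:  # socket.gaierror (name not found) is a subclass
        return False
    return True


if __name__ == "__main__":
    # dnsbl.example is a hypothetical zone; substitute a list you use.
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example"))
```

Note that the lookup is all-or-nothing per IP address, which is exactly
why it only fits sources from which you want no mail at all.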

Hotmail's webmail servers were one of the biggest individual sources of
spam on the Internet (95%+ advance fee fraud at one point).  Unless
you're willing to nuke all of their email, DNSBLs are the wrong instrument.

[I'll point out that until a few weeks ago, that's exactly what we had
to do - block about 1/3 of all of Hotmail's active webmail servers...
We're still 100% blocking tin.it's mail servers.]

I realize that it's not possible to hit 100% blocking unless every
mailbox were a spamtrap feeding the blacklist. But with a significantly
large trap farm it should be possible to detect almost all spam sources
within a reasonable time.


I work with VERY large spamtraps.  The one I personally operate (6-7
million emails/day) is perhaps less than 1/100th of that.

The intelligence this derives is _exceedingly_ useful in catching the
majority of spam that is amenable to DNSBLs. But the nature of the spam
beast is that _some_ of it isn't amenable to DNSBLs, unless you have a
far higher tolerance for FPs than most.

The other big issue is why don't more ISPs use blacklists?


A lot more ISPs and anti-spam vendors are using DNSBLs than you think.
Most of them simply don't tell you that they are.  In many cases, the
DNSBLs aren't being used for definitive yes/no blocking, they're used in
conjunction with scoring systems.
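
A rough sketch of what "used in conjunction with scoring systems" can
look like: each DNSBL hit adds a weight to a running score instead of
rejecting outright, and the message is only refused once the total
crosses a threshold. The zone names, weights, and threshold below are
invented for illustration; real deployments tune them against their own
mail.

```python
# Hypothetical per-list weights; none of these zones are real.
DNSBL_WEIGHTS = {
    "exploits.dnsbl.example": 3.0,  # compromised-host style list
    "policy.dnsbl.example": 1.5,    # dynamic/residential range list
}
REJECT_THRESHOLD = 5.0  # assumed cut-off, not a recommended value


def spam_score(listed_zones, content_score):
    """Combine DNSBL hits with a score from other (content) filters."""
    dnsbl_score = sum(DNSBL_WEIGHTS.get(zone, 0.0) for zone in listed_zones)
    return dnsbl_score + content_score


def verdict(listed_zones, content_score):
    """Return 'reject' once the combined score reaches the threshold."""
    total = spam_score(listed_zones, content_score)
    return "reject" if total >= REJECT_THRESHOLD else "accept"


if __name__ == "__main__":
    # One DNSBL hit plus a mildly suspicious body: 3.0 + 1.0 = 4.0
    print(verdict({"exploits.dnsbl.example"}, 1.0))
```

With this arrangement a listing on a "mixed source" nudges the score
without being enough, on its own, to lose legitimate mail.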

The main concern I suppose is that they can't afford to lose mail sent
to their customers. To address this, a blacklist system could be
designed to recover automatically when the spam stops.


Many DNSBLs already do this; virtually all of the reputable ones do.
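
Delisting policies differ from list to list, so take the following only
as a sketch of one simple time-based approach, not a description of how
any particular DNSBL works: an entry stays listed while spam keeps
arriving and drops off once nothing has been seen for a fixed period.
The class name and the TTL value are assumptions for illustration.

```python
class ExpiringBlacklist:
    """Listings that lapse automatically once the spam stops."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.last_seen = {}  # ip -> timestamp of most recent spam

    def report_spam(self, ip, now):
        """Record a spam sighting; this lists the IP or renews its listing."""
        self.last_seen[ip] = now

    def is_listed(self, ip, now):
        """True while the most recent sighting is younger than the TTL."""
        seen = self.last_seen.get(ip)
        if seen is None:
            return False
        if now - seen >= self.ttl:
            del self.last_seen[ip]  # expired: recover automatically
            return False
        return True


if __name__ == "__main__":
    bl = ExpiringBlacklist(ttl_seconds=86400)  # assumed one-day TTL
    bl.report_spam("192.0.2.99", now=0)
    print(bl.is_listed("192.0.2.99", now=3600))
    print(bl.is_listed("192.0.2.99", now=90000))
```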

If the mail is rejected with a 4xx response code the non-spam mail from
legitimate ISPs would be delivered (only slightly delayed) once the spam
is cleaned up.


That's presuming that "spam is cleaned up" in a sufficiently timely
fashion to fall within retry limits.  It seldom is.  Indeed, most spam
simply doesn't retry, so issuing a 4xx response the first time you see
a particular spam is a highly useful technique.  It's called greylisting.
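
A minimal sketch of the greylisting idea, under some assumptions of
mine: delivery attempts are keyed on the (client IP, envelope sender,
recipient) triplet, the first attempt gets a temporary 451, and a retry
is accepted only after a minimum delay. The class name, the key, and
the 300-second delay are illustrative choices rather than anything
prescribed above.

```python
class Greylist:
    """Temporarily refuse first-time senders; accept those that retry."""

    def __init__(self, min_delay=300):
        self.min_delay = min_delay  # seconds a sender must wait (assumed)
        self.first_seen = {}        # triplet -> time of first attempt

    def check(self, client_ip, sender, recipient, now):
        """Return an SMTP reply code: 451 (try again later) or 250 (OK)."""
        key = (client_ip, sender, recipient)
        # Remember the first attempt; later calls get that same timestamp.
        first = self.first_seen.setdefault(key, now)
        if now - first < self.min_delay:
            return 451
        return 250


if __name__ == "__main__":
    gl = Greylist()
    args = ("192.0.2.99", "a@example.org", "b@example.net")
    print(gl.check(*args, now=0))    # first sighting
    print(gl.check(*args, now=900))  # a real mail server retrying later
```

A sender that never retries never gets past the first 451, while a real
mail server's queue retry is accepted with only a delay.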

That leads to the question of how to clean up the spam in real time.
Since the spam traps have already captured samples of the spam
emanating from the blocked ISP, it should be easy enough to construct a
profile or signature of the spam that the source ISP could use to
quarantine the remaining spam in the queue.


Some ISPs have done this in the past; however, it's seldom an effective
or generalized enough solution.  Furthermore, most spam _never_ sits in
queues, because it's not coming from mail servers.

The key to blocking a significant amount of spam is identifying whether
the thing sending the email to you really is a mail server.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3-nr1 (Windows 2000)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQCVAwUBRC6rOJ3FmCyJjHfhAQJAvgP7Bnqsx5yluJ+7NBFbghD/T5VB3O9xXDVv
lqYrbzKpgfbT5gnybOOhVHOMw4LB6umB8/yUclfDNse2jeBjzEtgq/9Wsdbw+uz+
1qzCe1b3tE/j6uMwAreAomuXAcoAJPz1TW5lTXw9a84uXdGa2o3bvE06HCBU1tRc
4kO5AnDS5nE=
=x7L+
-----END PGP SIGNATURE-----

_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg