On Apr 1, 2006, at 9:32 AM, Chris Lewis wrote:
Dan Oetting wrote:
Why are the performance figures for blacklists so low? I saw someone
post a figure of 80% blocking, while less than 50% of the spam I get in
my unfiltered accounts would have been blocked by SBL+XBL. Why can't
the blocking rate be in the high 90's?
Because, quite simply, a significant amount of spam comes from "mixed
sources", such as ISP mail servers, so, unless you're willing to put up
with very high FPs, DNSBLs are simply not suitable for _that_ segment
of spam.
I found a combination that would block nearly 100% of the spam I
receive. As I said earlier, the SBL+XBL would block 50% of the spam I
see. The DCC Reputation system would block the other 50%. An
interesting observation is that I see no overlap between the two.
Either every commercial DCC server is protected by the SBL+XBL or my
ISP is blocking on the combined score. Only one spam source that hit
my account had a mixed reputation on the DCC list and was not listed
in the blacklists.
We're still 100% blocking tin.it's mail servers.
That's the one.
The main concern, I suppose, is that they can't afford to lose mail
sent to their customers. To address this, a blacklist system could be
designed to recover automatically when the spam stops.
Many DNSBLs already do this; virtually all of the reputable ones do.
The DCC Reputation system is almost exactly what I had been seeking
for the last four years. It's even got the distributed spamtraps and
a flood distributed update network, except that its cycle time is on
the order of a few days instead of a few hours. And it's a closed
system, only available to the commercially licensed sites.
The DCC Reputation doesn't appear to use any weighting based on the
reliability of the spamtraps. I feel that the field of traps needs to
be very diverse which means that some of the traps are going to get
more noise than others and this should be taken into account when
deciding if a spam threshold has been reached. Also, releasing the
exact hit counts for a source address is dangerous and could lead to
simple search attacks to discover the trap addresses.
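As a rough illustration of the weighting idea above, here is a minimal sketch. The trap names, weights, default weight and threshold are all made up for the example and are not anything the DCC Reputation system is known to do. Each hit counts in proportion to the trap's assumed reliability, and only a yes/no verdict is published, never the hit counts.

```python
from collections import defaultdict

# Hypothetical per-trap reliability weights in [0, 1]; noisy traps count less.
TRAP_WEIGHTS = {"trap-a": 1.0, "trap-b": 0.6, "trap-c": 0.2}
DEFAULT_WEIGHT = 0.5   # assumed weight for a trap with no track record
THRESHOLD = 2.0        # assumed weighted score needed to list a source


def weighted_scores(hits):
    """Sum trap weights per source. `hits` is an iterable of (source_ip, trap_id)."""
    scores = defaultdict(float)
    for source_ip, trap_id in hits:
        scores[source_ip] += TRAP_WEIGHTS.get(trap_id, DEFAULT_WEIGHT)
    return dict(scores)


def is_listed(source_ip, scores, threshold=THRESHOLD):
    """Publish only a yes/no verdict, never the underlying hit counts."""
    return scores.get(source_ip, 0.0) >= threshold
```

Two hits on a reliable trap are enough to list a source here, while a single hit on a noisy trap is not.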
If the mail is rejected with
a 4xx response code the non-spam mail from legitimate ISPs would be
delivered (only slightly delayed) once the spam is cleaned up.
That's presuming that "spam is cleaned up" in a sufficiently timely
fashion to fall within retry limits. It seldom is. Indeed, most spam
simply doesn't retry, so issuing a 4xx response the first time you see
a particular spam is a highly useful technique. It's called
greylisting.
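For readers who have not seen it, a minimal sketch of greylisting follows. It assumes the common form keyed on the (client IP, sender, recipient) triplet. The class name, the delay and the expiry values are illustrative assumptions, not taken from any particular implementation.

```python
import time

MIN_DELAY = 300        # assumed: seconds a new triplet must wait before acceptance
MAX_AGE = 24 * 3600    # assumed: a triplet record expires after a day


class Greylist:
    def __init__(self):
        # (client_ip, sender, recipient) -> time of first delivery attempt
        self.first_seen = {}

    def check(self, client_ip, sender, recipient, now=None):
        """Return an SMTP reply code: 451 to defer, 250 to accept."""
        now = time.time() if now is None else now
        key = (client_ip, sender, recipient)
        seen = self.first_seen.get(key)
        if seen is None or now - seen > MAX_AGE:
            # First attempt (or an expired record): remember it and defer.
            self.first_seen[key] = now
            return 451
        if now - seen < MIN_DELAY:
            # Retried too quickly: keep deferring.
            return 451
        # A retry after the delay suggests a real mail server: accept.
        return 250
```

A sender that never retries is never accepted, which is the whole effect.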
That leads to the question of how to clean up the spam in real time.
Since the spam traps have already captured samples of the spam
emanating from the blocked ISP, it should be easy enough to construct a
profile or signature of the spam that the source ISP could use to
quarantine the remaining spam in the queue.
Some ISPs have done this in the past, however, it's seldom an effective
or generalized enough solution. Furthermore, most spam _never_ sits in
queues, because it's not coming from mail servers.
Who is worried about the non-mail server sources of spam? They will
be listed for as long as they continue to try and deliver spam.
It's the "significant amount of spam comes from 'mixed sources',
such as ISP mail servers" that needs to be considered.
What I am proposing is a fast response (greylist) advisory that
allows recipients to delay the acceptance of mail from the listed
ISPs while signaling the listed ISPs in real time that they have a
problem, letting them clean up the problem at the source. The advisory
would be released when spam is no longer detected, so the mail will
again flow freely from the now clean source. In cases where the source
ISP doesn't clean up in a reasonable time, the recipient ISPs would
fall back on their own filtering.
Unlike the typical blacklist, this advisory list can afford to have a
hair trigger because the penalty of a false listing is at most a
short delay in mail delivery. There is still a tradeoff to balance
this penalty with the advantage of stopping more spam. Whitelists
based on sender and recipient could circumvent the delay and further
minimize the negative impacts.
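The recipient-side decision described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the in-memory advisory dictionary, the whitelist of (sender, recipient) pairs and the four-hour cleanup window are all hypothetical.

```python
CLEANUP_GRACE = 4 * 3600   # assumed: how long a listed ISP gets to clean up


def advisory_decision(source_ip, sender, recipient, advisory, whitelist,
                      now, cleanup_grace=CLEANUP_GRACE):
    """Return 'accept', 'defer' (4xx) or 'filter' (fall back to local filtering).

    `advisory` maps source_ip -> time the source was listed.
    `whitelist` is a set of (sender, recipient) pairs that skip the delay.
    """
    if (sender, recipient) in whitelist:
        return "accept"
    listed_at = advisory.get(source_ip)
    if listed_at is None:
        # Not listed, or the advisory was released once the spam stopped.
        return "accept"
    if now - listed_at > cleanup_grace:
        # The source did not clean up in a reasonable time.
        return "filter"
    return "defer"
```

Releasing the advisory is then just removing the source from the dictionary, after which mail from it is accepted again without delay.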
One issue is how to signal the sending ISP in real time to tell them
exactly what their problem is. [If this was the single ultimate
solution to the spam problem then every ISP would subscribe to the
advisory and the problem would be solved. :^} ] Realistically, an
ISP that is accepting (or quarantining) mail during the advisory
could return a specific response when, for instance, the mail is
detected as bulk by the DCC. The sending ISP, upon receiving a number
of these responses for mail from the same local user, would know that
the user is probably misbehaving and could take appropriate
corrective action.
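On the sending side, the bookkeeping could be as simple as the sketch below. No format for the "specific response" is defined anywhere, so the word "bulk" in the reply text is a purely hypothetical marker, and the class name and limit are assumptions as well.

```python
from collections import Counter

BULK_LIMIT = 5   # assumed: flagged replies per local user before action is taken


class BulkFeedbackTracker:
    def __init__(self, limit=BULK_LIMIT):
        self.limit = limit
        self.counts = Counter()   # local user -> number of bulk-flagged replies

    def record(self, local_user, reply_text):
        """Record one SMTP reply; return True when the user should be reviewed."""
        # Hypothetical marker: the receiving ISP puts "bulk" in its reply text.
        if "bulk" in reply_text.lower():
            self.counts[local_user] += 1
        return self.counts[local_user] >= self.limit
```

What the "appropriate corrective action" is once `record` returns True (rate limiting, quarantining the user's queue, and so on) is left to the sending ISP.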
What is primarily gained by this proposed advisory list is time to
make a better determination on whether to block the listed ISP or not,
and time to build a better signature of the bulk mail. The spam that
would have come out in this time interval can be kept out of the
recipients' mailboxes.
-- Dan Oetting
_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg