Re: [Asrg] A method to eliminate spam

Kee Hinckley wrote:

At 7:21 AM -0500 3/19/03, Daniel Feenberg wrote:

overloaded, even the most recalcitrant owners eventually close them. In
the end (the "Nash equilibrium") many sites subscribe to a black hole
list, nearly all open relays are closed, and there is no need for
universal agreement to get to that end. It may take a while though.

Why do you think it hasn't happened already? Those lists have been around for years. Have open relays significantly decreased?


As a percentage of spam?  Absolutely.

Total? Well, given that spam itself is exponentially increasing, I'm not sure whether we can measure that, especially since open proxy/socks has become the technique du jour. But the numbers I'm going to show below suggest that open relay is nowhere near the problem it once was.

Your message prompted me into doing something I should have done a while ago - wiring individual blacklist effectiveness into our metrics.

And here are the numbers for the past week - these are based on recipient counts, not message counts.

The first table talks exclusively about the results of our spamtrap, and shows relative effectiveness of the blacklists on a "pure spam" feed.

The second table talks exclusively about the results of the mail addressed to our real users.


The individual lists are annotated when they first appear.

Numbers are counts for the corresponding entry, followed by that count as a percentage of total email received.


Blacklist effectiveness spamtrap only:

BOPM            3666774  50.73 (open proxy/socks)
Flonetwork          233   0.00 (Flowgo/dartmail/doubleclick static list)
IP, NOT BL       101140   1.40 (local "hard" manual blacklist,
                                being phased out)
MONKEYPROXY     4579195  63.36 (open proxy/socks)
NTblack          905852  12.53 (local automated proxy/socks/relay [+])
NTmanual         326783   4.52 (manual blacklist, new version)
OBproxies       1459108  20.19 (proxies/socks)
OBrelays         462877   6.40 (relays)
OK                   42   0.00 (whitelist)
OSinputs         836741  11.58 (Osirus relays)
OSproxy          136594   1.89 (Osirus proxies)
OSsocks         1798424  24.88 (Osirus socks)
SBL              562940   7.79 (SpamHaus spamsource BL)
TOTAL           7227413 100.00
TOTAL BLOCK     6063477  83.59 (total would-be blocked by blacklists)


Blacklist effectiveness on real email:
BOPM             100635   5.34
CONTENT           54802   2.91 (non-IP based filters, not used
                                on spamtrap)
Flonetwork         6096   0.32
IP, NOT BL        34946   1.85
MONKEYPROXY      135285   7.17
NTblack           38608   2.05
NTmanual          30370   1.61
OBproxies         46420   2.46
OBrelays          17419   0.92
OK                 5330   0.28
OSinputs          31922   1.69
OSproxy            2121   0.11
OSsocks           54144   2.87
SBL               51825   2.75
TOTAL           1885655 100.00
TOTAL BLOCK      316567  16.79 (total blocked)
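
For anyone wanting to reproduce the percentage column, here is a minimal sketch that assumes each percentage is simply the entry's recipient count divided by the TOTAL row. The counts are copied from the real-email table; the `share` helper and the dictionary layout are illustrative names, not part of the metrics system described here.

```python
# Counts copied from the "real email" table above.
REAL_EMAIL = {
    "BOPM": 100635,
    "OBrelays": 17419,
    "TOTAL BLOCK": 316567,
}
TOTAL = 1885655  # TOTAL row: all recipients in the real-email feed


def share(count, total):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100.0 * count / total, 2)


if __name__ == "__main__":
    # Print name, count and percentage in roughly the table's layout.
    for name, count in REAL_EMAIL.items():
        print(f"{name:<12} {count:>8} {share(count, TOTAL):6.2f}")
```

Note that a single recipient can hit several lists at once, so the per-list rows are not expected to add up to the TOTAL BLOCK row.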

As you can see, relays are quite low. Notice how monkeyproxy and BOPM both trap more than 50% of all inbound spam (to the spamtrap, which is by definition 100% spam - bounces and viruses are already stripped out).

Notice how the blacklists catch 84% of _all_ spam. Pretty darn good actually. But not perfect. That's why we do content-based too.

My guess is that too many people are reluctant to use them. As has been discussed here, black hole lists have a reputation for lack of accountability.

They have a reputation for that, but that's largely false. BOPM, OB* (these two are private lists, but you'd know who it was and how to contact them if you ever hit an OB* blacklist block), MONKEYS[*], OSIRUS and SBL have _excellent_ reputations, and good accountability/contactability.

If automated they have a serious problem with false positives.

This is what the reputation is, but it's pure nonsense. While it is true that "open relay" blacklists have a higher percentage of false positives than the others, the numbers are still _extremely_ low. Secondly, the automated testers are the most accessible ones for fixing false positives. ORDB is probably the very best of the group - instant delist with subsequent retest and relist if necessary.

[We can't use ORDB, because we have to do zone transfers, and ORDB doesn't permit that.]


And I can show that from the above tables.

First a comment on the "OK" entry. Our procedure for a false positive on a blacklist of any kind is to immediately enter a whitelist entry, and queue up a retest with each of the blacklists (where appropriate). It's an automated, almost-single-keystroke process.

[We immediately whitelist, because our DNSBL implementation is by zone-transfer and DNS zone file build. The average latency for a 3rd party delist via these mechanisms can be well in excess of 24 hours.]

Furthermore, many of these whitelist entries are for whole ranges we list in our local blacklist (like 200.148/16 and 200.158/16), where we've just opened up a hole for the _only_ legit mailer in the whole block. [%]
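
The "hole in a blacklisted range" idea can be sketched as a lookup in which whitelist entries are checked before blacklist ranges. This is only an illustration under stated assumptions: the 200.148/16 range comes from the text, but the whitelisted host address, the list layout and the `verdict` function are hypothetical and are not the zone-file mechanism actually used here.

```python
import ipaddress

# Locally blacklisted range mentioned in the text.
BLACKLIST = [ipaddress.ip_network("200.148.0.0/16")]
# Hypothetical address for the one legitimate mailer inside that range.
WHITELIST = [ipaddress.ip_network("200.148.10.25/32")]


def verdict(ip_text):
    """Classify a connecting IP; whitelist entries win over blacklist ranges."""
    ip = ipaddress.ip_address(ip_text)
    if any(ip in net for net in WHITELIST):
        return "OK"      # counted under the "OK" row
    if any(ip in net for net in BLACKLIST):
        return "BLOCK"   # would be rejected by the local blacklist
    return "PASS"        # not listed anywhere we know of


if __name__ == "__main__":
    for addr in ("200.148.10.25", "200.148.77.1", "192.0.2.10"):
        print(addr, verdict(addr))
```

Checking the whitelist first is what lets a single /32 entry punch through a /16 listing without touching the listing itself.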

What we don't have right at the moment is a mechanism for stripping out whitelist entries once the original blacklist entry disappears. I'm working on it, I'm working on it ;-)

So, the "OK" entries are _every_ mail server we've ever whitelisted, despite the fact that the original blacklisting entry has probably long disappeared - so, the "OK" counts are considerably _higher_ than what our blacklists would actually block. Further, many of them are not from third party blacklists, but rather from our local listings. Only 42 for the spamtrap. .28% for the production mail. If I were presently able to remove the whitelist entries for the machines no longer open, the numbers would probably be under .01% for our production systems too.


We get less than 5 false positive reports on average per day.

Spot checks show that at least 95% of all whitelist/retests we've issued have taken effect on the corresponding 3rd party blacklist. Except Monkeys [*].

But again, it's true that open relay blacklists have higher false positive rates. Despite being responsible for perhaps 3-4% of all of our IP-based blocks, somewhat more than half of our IP-based false positives are with open relay blacklists. And most of those are with OBrelays.
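
To put those two figures side by side, here is a rough back-of-the-envelope sketch. It assumes 3.5% as the midpoint of "3-4%" and exactly 50% as a lower bound for "somewhat more than half"; both inputs are illustrative readings of the sentence above, not measured values, and `relative_fp_rate` is a made-up helper name.

```python
def relative_fp_rate(block_share, fp_share):
    """How many times more false positives per block one group of lists
    produces compared with all the other lists combined.

    block_share: fraction of all IP-based blocks due to the group
    fp_share:    fraction of all IP-based false positives due to the group
    """
    group_rate = fp_share / block_share               # FPs per unit of blocks
    others_rate = (1.0 - fp_share) / (1.0 - block_share)
    return group_rate / others_rate


if __name__ == "__main__":
    # Assumed inputs: 3.5% of blocks, 50% of false positives.
    ratio = relative_fp_rate(0.035, 0.50)
    print(f"open relay lists: about {ratio:.0f}x the per-block FP rate")
```

Under those assumptions the open relay lists come out at well over twenty times the per-block false positive rate of the other lists, even though the absolute numbers stay very small.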


Why is that?  Simple:

1) Machines that were open relays are more likely to have been intended to send email than a simple open proxy or socks server, so "legit" users are more likely to hit a blacklist entry. Most open proxy or socks hits are _not_ mail servers and were never intended to be. So nobody notices. Nobody cares either (except the spammer, but they don't notice).

2) Lesser used blacklists have higher FP rates, because fewer legit senders hit them. OBrelays is only used by two sites: us, and its maintainer. Despite being _large_ (OB is > 30 million mail addresses), it's still small compared to the coverage of the other lists, hence the relatively higher FP percentage.

3) Most of the open relay FPs are servers that are no longer open, but the listing didn't have enough BL coverage for the owner to notice. Most of the open proxy/socks hits are servers that are still open.


What does this all mean?

Well, what Joe said - perhaps our "filtering BCP" should _explicitly_ state that all mail filtering systems should be using well known and reputable open relay and open proxy/socks blacklists.

In this way we encourage much greater coverage, so that (a) site owners find out much quicker they have a problem and (b) stale entries are cleaned up much faster. In other words, list accuracy is vastly improved, and broken servers are fixed much faster. Open proxy/socks blacklist usage is already "best practise" with IRC servers. See the BOPM web site.

If manual they cost money. While individuals may have some degree of tolerance for false positives, most companies and ISPs are not so tolerant--all it takes is one bad instance and you're all over the press (college admissions notifications blocked, Mac.com blocking domain renewal emails...).

Look at the above numbers, and remember who we are. Obviously, we're VERY intolerant of false positives. We're doing fine.

[*] I have an issue with MONKEYSPROXY because the criterion for removal isn't "just fix the open socks or proxy and ask for retest" - asking for the retest has other extraneous requirements. In effect, a MONKEYSPROXY entry either means you have an open proxy/socks, OR, you may simply not have been able to formulate a retest request that MONKEYS would accept. We can't do third-party retest requests with MONKEYS, for example.

This does not seem to cause _us_ much trouble in practise (since we whitelist), but if you're high volume like us and not actively whitelisting like us, it may make you think twice about using it, despite how good it is. I'd rather it followed the BOPM or ORDB model here. Still and all, I think we've gotten 5 false positive reports for Monkeys in 3 months.

[+] Automated testing is triggered by at least 3 spam-in-hands in a day hitting our spamtrap, with a one week minimum testing interval and a 3 week no-repeat expiration. An IP is ignored (not tested, not listed) if it is already blacklisted elsewhere. This allows us to automatically detect "new" open relays/proxies/socks hitting the spamtrap and publish blacklist entries to production servers. Experimental. May be decommissioned.
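
The trigger rules in [+] can be read as a small decision function. The sketch below is one interpretation, not the actual implementation: it assumes "3 week no repeat expiration" means a listing lapses once three weeks pass without a new spamtrap hit, and every name in it (`should_test`, `is_expired`, the constants) is hypothetical.

```python
from datetime import datetime, timedelta

MIN_HITS_PER_DAY = 3                          # spam-in-hands needed in one day
MIN_TEST_INTERVAL = timedelta(weeks=1)        # do not retest more often
EXPIRY_WITHOUT_REPEAT = timedelta(weeks=3)    # assumed meaning of "no repeat"


def should_test(hits_today, last_tested, listed_elsewhere, now):
    """Decide whether an IP seen at the spamtrap should be probed now."""
    if listed_elsewhere:
        return False                          # another blacklist covers it
    if hits_today < MIN_HITS_PER_DAY:
        return False                          # not enough evidence yet
    if last_tested is not None and now - last_tested < MIN_TEST_INTERVAL:
        return False                          # tested too recently
    return True


def is_expired(last_hit, now):
    """True when a listing has gone three weeks without a repeat hit."""
    return now - last_hit >= EXPIRY_WITHOUT_REPEAT


if __name__ == "__main__":
    now = datetime(2003, 3, 19, 12, 0)
    print(should_test(3, None, False, now))
    print(is_expired(now - timedelta(weeks=4), now))
```

Passing `now` in explicitly, rather than reading the clock inside the functions, keeps the rules easy to check against fixed dates.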

[%] 1000+ IPs spewing email at us from a /16, and 98%+ of them are already listed as open relays/socks/proxies. The rest of them are behaving as if they are. Sigh.

