Off-topic: mydnsbl (my "too many failures BL") moving from investigation to testing
2005-01-16 16:06:31
This doesn't have to do with SPF, but may be interesting to some folks
here. If you are interested in more info, please reply to me off-list.
The most interesting thing about this project is that I found out that
about 20% of our mail comes from about 2500 IPs, each of which has sent us
9 bad transactions out of its last 10 within the last two hours. Since we
get a LOT of bogus traffic, I have been thinking of some ways we can
harness the power of all that crap for good instead of evil :)
ObSPF: It will be interesting to see which domains these "almost certainly
zombies" use to send from. If they pass SPF even from a zombie's location,
that probably means the domain should be blacklisted along with the IP. If
they fail SPF, even more reason to block that IP (especially if they forge
lots of different domains).
This is the latest draft of "mydnsbl," which is a personal project I've
been working on. It's been about half work time and half personal time. As
of now, the software seems to work OK, and the docs are pretty complete
(for testing purposes anyway). I will soon be moving on to the more
daunting task of trying to test my new DNSBL with actual user mail, if I
can convince management that it's safe.
I'm posting here in case any of you are interested in doing something like
this at your site, and also to get feedback, comments on things I might
have missed, hints on how to phase it in and test it, etc.
(I cut this back a bit for posting... the full explanation of the project
is at <http://www.livejournal.com/users/gconnor/121193.html>)
"mydnsbl" is a script that reads syslog activity from a mail server, and
creates a DNSBL based on the "bad" activity. The idea is that I want to
keep track of the last 10 transactions from each IP, and if 9 of the last
10 transactions were bad (user unknown, for example), then that IP should go
on a local DNSBL
for something like 2 hours.
"Bad" activity in this case is considered to be user unknown, mailing to
internal-only recipients, known spam, known virus, and basically anything
that results in a failed transaction at your mailer for reasons not your
fault :) This bad activity is offset by "good or neutral" activity, such as
delivered OK, possible spam but not sure, and anything that results in an
actual delivery.
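
As a rough illustration of the bad-versus-good split described above, here is a minimal Python sketch. The result labels are hypothetical stand-ins (the real log strings are not shown in this post), and the fallback bucket borrows the "wtf" name that mycollect uses for results it does not recognize.

```python
# Hypothetical result labels; the actual log strings are not given in the post.
BAD = {"user_unknown", "internal_only_recipient", "known_spam", "known_virus"}
OK = {"delivered", "possible_spam"}


def classify(result: str) -> str:
    """Map a raw transaction result to 'bad', 'ok', or 'wtf' (unrecognized)."""
    if result in BAD:
        return "bad"
    if result in OK:
        return "ok"
    return "wtf"
```

Anything that ends in an actual delivery lands in "ok", so only failures that are the sender's fault count against an IP.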
Currently there are two pieces of the puzzle working.
http://www.nekodojo.org/~gconnor/mydnsbl/myscanner
The "myscanner" script tails a logfile where my mailservers send their
syslog output. It takes multiple lines with the same transaction ID and
puts the pieces back together, so that the output contains one line per
transaction, telling the IP and the result. This is good for mailservers
that output activity "as it happens" rather than one line per transaction.
If you can convince your mailer to output (IP,result) on one line, you
probably don't need myscanner.
Currently "myscanner" only understands Barracuda output, but a similar
framework could be used to make sendmail logs into transactional output. It
is currently highly dependent on our specific output though. (There are
similar programs or perl modules out there that produce summarized output
for Sendmail and maybe others.)
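
To show the general shape of the reassembly step, here is a small Python sketch. It is not the myscanner code and it does not parse Barracuda logs: it assumes a made-up format of three whitespace-separated fields per line, "<id> CONNECT <ip>" followed later by "<id> RESULT <result>".

```python
from typing import Dict, Iterable, Iterator, Tuple


def reassemble(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield one (ip, result) pair per completed transaction.

    Assumed (hypothetical) input format, one event per line:
        <txn_id> CONNECT <ip>
        <txn_id> RESULT <result>
    """
    pending: Dict[str, str] = {}  # transaction ID -> client IP seen so far
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip lines that do not fit the assumed format
        txn_id, kind, value = parts
        if kind == "CONNECT":
            pending[txn_id] = value
        elif kind == "RESULT" and txn_id in pending:
            # Transaction is complete: emit it and forget the ID.
            yield pending.pop(txn_id), value
```

Interleaved transactions are handled because each partial record is held under its own ID until the matching result line arrives. A real scanner would also want to expire IDs that never get a result.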
(cut most details of this piece, see
<http://www.livejournal.com/users/gconnor/121193.html> for full version)
Note that if your mail program already reports the result (Sent OK, Unknown
user, spam, unknown domain, etc) on the same line as the IP, you probably
don't need myscanner; just alter mycollect to interpret the log format of
your mailer.
http://www.nekodojo.org/~gconnor/mydnsbl/mycollect
mycollect keeps track of every IP seen so far in a hash, and with each IP
it keeps the result of the most recent 10 transactions, where "result" is
either bad, ok or wtf. If 9 of the last 10 transactions are "bad" then the
IP is added to the internal "blocked list". The current blocked list and
its expire times are kept in memory, and dumped out to a disk file every 5
minutes. The output is just a list of IP addresses, so it will work with
rbldnsd but I will eventually add a preamble and some formatting.
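
The bookkeeping described above can be sketched in Python roughly as follows. This is an illustration, not the mycollect source: the class and method names are invented, and two details are assumptions the post does not settle (an IP with fewer than 10 transactions can still be blocked once it has 9 bad ones, and each further bad transaction renews the 2-hour expiry).

```python
import os
import time
from collections import defaultdict, deque

WINDOW = 10                  # transactions remembered per IP
THRESHOLD = 9                # "bad" results in the window that trigger a block
BLOCK_SECONDS = 2 * 60 * 60  # block lifetime of about 2 hours


class Collector:
    """Hypothetical stand-in for the per-IP tracking described in the post."""

    def __init__(self):
        # ip -> the most recent WINDOW results ("bad", "ok" or "wtf")
        self.history = defaultdict(lambda: deque(maxlen=WINDOW))
        self.blocked = {}  # ip -> expiry time (seconds since the epoch)

    def record(self, ip, result, now=None):
        now = time.time() if now is None else now
        window = self.history[ip]
        window.append(result)  # the oldest entry falls off automatically
        if sum(1 for r in window if r == "bad") >= THRESHOLD:
            # Assumption: a listed IP has its expiry renewed on each new hit.
            self.blocked[ip] = now + BLOCK_SECONDS

    def active_blocks(self, now=None):
        now = time.time() if now is None else now
        # Drop expired entries, then return the rest sorted as strings.
        self.blocked = {ip: exp for ip, exp in self.blocked.items() if exp > now}
        return sorted(self.blocked)

    def dump(self, path, now=None):
        # Write a plain list of IPs, one per line, then swap it into place
        # so a reader never sees a half-written file.
        tmp = path + ".tmp"
        with open(tmp, "w") as fh:
            for ip in self.active_blocks(now):
                fh.write(ip + "\n")
        os.replace(tmp, path)
```

Calling dump() from a timer every 5 minutes would match the schedule mentioned above.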
mycollect also keeps detailed statistics, which was its main job during the
investigation phase. I wanted to get detailed info about how many IPs would
be blocked, and how many messages that made it through would have been
blocked.
Statistics are reported to STDERR every 100,000 transactions (if you
like) or when the program receives a USR1 signal. Output looks like:
# kill -USR1 %1
#
From: Jan 16 00:46:07 To: Jan 16 02:48:48
total = 300000 (100%) (rbl = 65, ok = 1, bad = 32)
would block = 40027 (13%) (rbl = 0, ok = 0, bad = 12)
cache size = 18887, blocks size = 2609
usertime=329.18, systime=10.11, SZ:RSS=4940:4487
... (later) ...
From: Jan 16 00:46:07 To: Jan 16 06:53:16
total = 1000000 (100%) (rbl = 63, ok = 1, bad = 35)
would block = 176401 (17%) (rbl = 0, ok = 0, bad = 16)
cache size = 19346, blocks size = 2535
usertime=1374.61, systime=33.5, SZ:RSS=5344:4898
This indicates that after 300,000 transactions (about 2 hrs), 2609 IPs
were on the blocked list, and about 40,000 of those transactions (13%)
would have been avoided if the DNSBL had actually been in use. Later, after
1M transactions, we have 2535 entries on the BL, and would have blocked
about 176,000 transactions (17%).
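
As a quick check of the percentages quoted from the two reports, the arithmetic can be reproduced in a few lines of Python. The function name is made up for this example.

```python
def block_rate(would_block: int, total: int) -> float:
    """Percentage of all transactions that the blocked list would have stopped."""
    return 100.0 * would_block / total


# Figures copied from the two statistics reports above.
for would_block, total in [(40027, 300000), (176401, 1000000)]:
    rate = block_rate(would_block, total)
    print(f"{would_block} of {total} = {rate:.1f}% (whole percent: {int(rate)}%)")
```

This gives 13.3% and 17.6%, so the 13% and 17% shown in the reports are consistent with the percentage being truncated to a whole number rather than rounded.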
It looks like most of the mail that would have been blocked would have
resulted in "User unknown" or would have been caught by other tests anyway,
but the real test will be to compare the messages that would have gotten
through (ok) but are now stopped, to see if the current "ok" number drops
significantly.
Any feedback is appreciated. Right now it is pretty customized to my
environment, but should be pretty easy to adapt to other types of input. If
you feel like playing with it and running your own output through it,
please feel free. I would be interested to see what kind of numbers you
come up with for your site :)
Thanks for taking the time to read!
gregc
--
Greg Connor <gconnor(_at_)nekodojo(_dot_)org>