Re: [Asrg] NXDOMAIN cache behavior, was draft-levine-iprangepub-01

On 1/5/2011 8:07 PM, Steve Atkins wrote:


On Jan 5, 2011, at 4:00 PM, Chris Lewis wrote:

Rsync is essentially the de facto standard for bulk DNSBL transfer, and as you say it's 
"not awful".  So, we don't seem to have a significant difficulty with that.


AIUI, rsync is reasonably expensive to offer (in terms of CPU load), and it 
adds significant latency to listing and delisting (as you say below). Do you 
have any numbers on system load vs users vs update rate conveniently to hand?

AIUI, rsync really isn't that bad on large zone files (with one caveat, see below), unless compression is turned on (CPU), and even then, you can support several hundred downloaders (mixed compression and non-compression) on reasonably modest hardware. The rsyncd can also force non-compression if deemed necessary.

Of course bandwidth can also be an issue. I wouldn't want to do the above with anything less than a 20Mb/s Internet link...

Rsync behaves quite badly, however, on large zone files if you're updating random locations in the zone. Eg: if you're doing time-based expiration, you damn well better sort the file in entry "first seen" timestamp order, so the bulk part of the adds and deletes are contiguous. Don't change, say, existing TXT entries - once the entry goes in, it shouldn't change until it disappears.
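
The sort-order advice above can be sketched roughly as follows. This is a minimal illustration, not how any particular DNSBL builds its zone: `Entry`, `render_zone`, the `max_age` expiry rule and the one-IP-per-line output are all assumptions made for the example.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    ip: str
    first_seen: int  # Unix timestamp of the first detection (hypothetical field)


def render_zone(entries, now, max_age):
    """Return the zone as a list of lines, oldest listing first.

    Expired entries drop off the head of the file and new entries are
    appended at the tail, so the unchanged middle stays one contiguous run.
    """
    live = [e for e in entries if now - e.first_seen < max_age]
    # Sort by first-seen time; the IP is only a tie-breaker for stable output.
    live.sort(key=lambda e: (e.first_seen, e.ip))
    return [e.ip for e in live]
```

With this ordering, one rebuild that expires the oldest entry and adds a fresh one only touches the two ends of the file, which is the case a delta-transfer tool handles best. Sorting by IP instead would scatter the same changes through the file.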

The performance numbers I have are from a strangeish case, so, you'll want to get representative numbers from a more straightforward one. Like Spamhaus.

I don't think that looking at an alternate WAN distribution protocol is 
something we *have* to do, but if we're looking at changing behaviour at the 
MTA location anyway, now's a good time to think about whether we do want to do 
that or not, and what the tradeoffs might be. We're a research group, are we 
not? :)

No. Other than, maybe, zone file formats. This can encourage the development of common ancillary tools to manipulate the zone files before internal publication.

For example, as a performance improvement, we download almost all of our DNSBLs and merge them into one zone, with locally assigned return codes for each of the originating DNSBLs. One query, and you can get many A records back, each of which is uniquely identifiable (by the A value) as to which DNSBL it came from, so you can score them independently (which we do).
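
A rough sketch of that kind of merge, under stated assumptions: the list names (`list_a`, `list_b`), the 127.0.0.x return codes, the weights, and both function names are invented for illustration and are not the actual merge code or any DNSBL's real codes.

```python
def merge_zones(zones, codes):
    """Merge several DNSBLs into one lookup table.

    zones: {list_name: iterable of listed IPs}
    codes: {list_name: locally assigned A value for that list}
    Returns {ip: set of A values}, one A value per originating list.
    """
    merged = {}
    for name, ips in zones.items():
        for ip in ips:
            merged.setdefault(ip, set()).add(codes[name])
    return merged


def score(merged, ip, weights):
    """Sum the per-list weights for every A value returned for this IP."""
    return sum(weights[a] for a in merged.get(ip, ()))
```

A single lookup in `merged` stands in for the single DNS query: it returns every A value for the IP, and each value identifies the list it came from, so the lists can be weighted independently.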

It would have been nice if the DNSBL file formats were better standardized. The merge code is a bit smelly. It'll work on (most) rbldnsd-compatible zones, but everything else gives the coder (me) heartburn ;-)

Even if we were to do something as simplistic as chop IPv6 queries at the /64, 
given that the number of spammers and bots doesn't magically go up simply 
because there's more bits to hide in, the caching problem appears to be not that 
much worse than it already is with IPv4.


The number of spammers may not go up, but the number of addresses they use may. 
Or may not, it depends on their behaviour. But I'm not betting that it won't.

Given the experience and impressions I've acquired - at least in the botspam space - IP hopping is actually quite rare in IPv4. In the cases I've seen, it's actually hard to tell whether they're due to load-balanced multiple exit NATs, or real and explicit IP hopping by the bot. As it's been explained to me, "real" IP hopping is usually just about impossible (on ISPs) with more modern hardware that enforces which IP you've authenticated to on _that_ wire, and you lose connectivity altogether if you change your IP (and this is even _within_ the subnet your local ISP switch is operating in).

If each IPv6 user is getting a, say, /64 for their allocation, but the ISP hardware capabilities don't change, they can easily change their IP within the /64, but not outside of it. Hence, if we fix the DNSBL cutpoint at the same level, they can't hop far enough to get out from under the listing...
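
To make the cutpoint idea concrete, here is a minimal sketch using Python's standard `ipaddress` module. The function name `listing_key` and the choice to leave IPv4 addresses untouched are assumptions for the example, and /64 is just the value discussed above, not a settled figure.

```python
import ipaddress


def listing_key(addr, v6_prefix=64):
    """Return the key a DNSBL would list or look up for this address.

    IPv4 addresses are kept as single IPs; IPv6 addresses are truncated
    to the enclosing /v6_prefix network.
    """
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        return str(ip)
    # strict=False lets the host bits be set; they are masked off for us.
    net = ipaddress.ip_network(f"{ip}/{v6_prefix}", strict=False)
    return str(net)
```

Two addresses inside the same /64 map to the same key, so a bot that rotates its low 64 bits stays under one listing, while an address in a neighbouring /64 gets a different key.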

This doesn't necessitate that the cutpoint is 100% applicable, only that it's the majority.

Mind you, if IPv6 is so plentiful, snow-shoers will likely be able to get far more IP blocks than they can now, and get harder to block.

I'm really beginning to think that it's still quite premature to do much in this space. We have to wait and see how the environment evolves before we can predict what will work. And as you say, it's probably still quite some time off before it matters.

It's nice to see public experiments in IPv6 DNSBLs, but, I really have to wonder about the usefulness of it at the present time. Of the hundreds of DNSBLs that exist, only a dozen or two block that much on their own. Couple that with the dearth of IPv6 MTAs, and I can't help thinking that such DNSBLs are essentially useless except as proof-of-concept.

1) Some mechanism for CBL/XBL single-IP DNSBLs to remain useful (eg: hardcoded 
/64 truncation or some mechanism like John's) for Internet query from small 
sites.

I think this bit is really tricky, and has lots of possibilities for 
over-thinking the problem. Blocking by /64 seems both a good idea, and probably 
good enough for small sites.

It strikes me that it may well be advantageous to pick an arbitrary cut-point and encourage even the medium to large sites to continue using it at that level. You'd be building an additional (but small) factor of possible FPs into it, giving the hosters more incentive to clean up the bad IPs faster, because they may well have someone else yelling at them.

Consider the following thought experiment: if the XBL rounded off every entry to a /31 (in IPv4), every listing has a (actually quite small) possibility of blocking an innocent IP that someone will be impacted by. I think providers would have a disproportionately higher incentive to fix the bad IP. But the real FP impact would be extremely low.

2) Zone download (Rsync or perhaps something better) becoming more prevalent.

Yup. I think zone download is going to become more prevalent (to the extent 
that I think even rsync is going to get too painful to offer in some cases, and 
something that looks more like IXFR is worth another look).

I thought IXFR was abandoned because it was too expensive. Or perhaps that support was pretty poor. AXFR is obviously out of the question.

While the tradeoff volumes for query versus zone downloads/incrementals may 
well shift, it will just about be never advantageous for small sites doing a 
few dozen emails per day to take a whole zone of something as big as the XBL.


No, it wouldn't. (Then again, there's no such thing as a site doing a few dozen 
emails a day, I don't believe.


Sure there are servers that small.  Mine - it does even less ;-)

The spam alone is going to be a couple of orders of magnitude more than that :) 
).


Spammers haven't found it yet.  Open relay probers have tho.

  Besides, in many cases, that introduces latency delays.


It only introduces latency delays because rsync is not a good protocol for it

Rsync's batch-ish nature isn't quite why I mentioned latency. The latency is zone rebuild time at the DNSBL server. With a zone file as large as the XBL, say, even if you did some sort of incremental update directly off a core DBMS in realish-time, I think you'd run into performance problems.

I've been doing experiments aimed towards UDP-based DNSBL adds, and even that starts to get nasty at the levels I think major DNSBLs would see.

I know of systems that can get "bad IPs" from detectors thru to moderately large numbers of clients within a handful of seconds. But I don't know how big the system is (number of clients or magnitude of bad IPs), nor am I confident of its stability in a heterogeneous client environment with multiple code authors. These are commercial for-pay environments, which obviously have much smaller audiences than the good "free" ones.

_______________________________________________
Asrg mailing list
Asrg(_at_)irtf(_dot_)org
http://www.irtf.org/mailman/listinfo/asrg