ietf-asrg

Re: [Asrg] NXDOMAIN cache behavior, was draft-levine-iprangepub-01

2011-01-07 01:19:38
On 1/5/2011 8:07 PM, Steve Atkins wrote:

On Jan 5, 2011, at 4:00 PM, Chris Lewis wrote:

Rsync is essentially the de facto standard for bulk DNSBL transfer, and as you say it's 
"not awful".  So, we don't seem to have a significant difficulty with that.

AIUI, rsync is reasonably expensive to offer (in terms of CPU load), and it 
adds significant latency to listing and delisting (as you say below). Do you 
have any numbers on system load vs users vs update rate conveniently to hand?

AIUI, rsync really isn't that bad on large zone files (with one caveat, see below), unless compression is turned on (CPU), and even then, you can support several hundred downloaders (mixed compression and non-compression) on reasonably modest hardware. The rsyncd can also force non-compression if deemed necessary.

Of course bandwidth can also be an issue. I wouldn't want to do the above with anything less than a 20 Mb/s Internet link...

Rsync behaves quite badly, however, on large zone files if you're updating random locations in the zone. Eg: if you're doing time-based expiration, you damn well better sort the file in entry "first seen" timestamp order, so the bulk part of the adds and deletes are contiguous. Don't change, say, existing TXT entries - once the entry goes in, it shouldn't change until it disappears.
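
A minimal sketch of the ordering described above, assuming a hypothetical in-memory Entry record and an illustrative one-line-per-IP layout (neither is taken from any particular DNSBL's zone format). Sorting on the first-seen timestamp means time-based expiry only trims lines from the top of the file and new listings only append at the bottom, so the unchanged middle stays byte-identical between builds.

```python
from typing import Iterable, List, NamedTuple


class Entry(NamedTuple):
    ip: str
    first_seen: int  # Unix timestamp of the first listing (hypothetical field)
    txt: str         # TXT payload; never rewritten once the entry exists


def render_zone(entries: Iterable[Entry]) -> List[str]:
    """Return zone lines ordered oldest-first by first-seen time."""
    ordered = sorted(entries, key=lambda e: (e.first_seen, e.ip))
    # Illustrative line layout only, not a real zone file syntax.
    return [f"{e.ip} {e.txt}" for e in ordered]


def expire(entries: Iterable[Entry], cutoff: int) -> List[Entry]:
    """Drop entries first seen before the cutoff timestamp."""
    return [e for e in entries if e.first_seen >= cutoff]
```

With this ordering, an expiry pass followed by a batch of new listings changes only the head and tail of the rendered file, which is the contiguous pattern the paragraph above recommends.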

The performance numbers I have are from a strangeish case, so, you'll want to get representative numbers from a more straightforward one. Like Spamhaus.

I don't think that looking at an alternate WAN distribution protocol is 
something we *have* to do, but if we're looking at changing behaviour at the 
MTA location anyway, now's a good time to think about whether we do want to do 
that or not, and what the tradeoffs might be. We're a research group, are we 
not? :)

No. Other than, maybe, zone file formats. This can encourage the development of common ancillary tools to manipulate the zone files before internal publication.

For example, as a performance improvement, we download almost all of our DNSBLs and merge them into one zone, with locally assigned return codes for each of the originating DNSBLs. One query, and you can get many A records back, each of which is uniquely identifiable (by the A value) as to which DNSBL it came from, so you can score them independently (which we do).
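
A rough sketch of that kind of merge, assuming each source list has already been parsed into a plain set of IP strings. The list names, the local A values, and the weights are all made up for illustration; this is not the merge code referred to in this message.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def merge_zones(sources: Dict[str, Tuple[str, Iterable[str]]]) -> Dict[str, List[str]]:
    """Merge several lists into one mapping of IP -> local A values.

    sources maps a list name to (locally assigned A value, listed IPs).
    """
    merged = defaultdict(set)
    for _name, (a_value, ips) in sources.items():
        for ip in ips:
            merged[ip].add(a_value)
    return {ip: sorted(values) for ip, values in merged.items()}


def score(a_values: Iterable[str], weights: Dict[str, int]) -> int:
    """Score one query result by summing a per-list weight for each A value."""
    return sum(weights.get(a, 0) for a in a_values)
```

A single lookup in the merged data then returns one A value per originating list, and each value can be weighted independently.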

It would have been nice if the DNSBL file formats were better standardized. The merge code is a bit smelly. It'll work on (most) rbldnsd-compatible zones, but everything else gives the coder (me) heartburn ;-)

Even if we were to do something as simplistic as chop IPv6 queries at the /64, 
given that the number of spammers and bots doesn't magically go up simply 
because there's more bits to hide in, the caching problem appears to not be that 
much worse than it already is with IPv4.

The number of spammers may not go up, but the number of addresses they use may. 
Or may not, it depends on their behaviour. But I'm not betting that it won't.

Given the experience and impressions I've acquired - at least in the botspam space, IP hopping is actually quite rare in IPv4. The cases I've seen are actually hard to tell whether they're due to load balanced multiple exit NATs, or real and explicit IP hopping by the bot. As it's been explained to me, "real" IP hopping is usually just about impossible (on ISPs) with more modern hardware that enforces which IP you've authenticated to on _that_ wire, and you lose connectivity altogether if you change your IP (and this is even _within_ the subnet your local ISP switch is operating in).

If each IPv6 user is getting a, say, /64 for their allocation, but the ISP hardware capabilities don't change, they can easily change their IP within the /64, but not outside of it. Hence, if we fix the DNSBL cutpoint at the same level, they can't hop far enough to get out from under the listing...
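
A small sketch of a fixed /64 cutpoint on the query side, using Python's standard ipaddress module. The zone name dnsbl.example and the choice to query with only the 16 nibbles of the /64 are assumptions for illustration; the thread does not settle on a query format.

```python
import ipaddress


def truncate_v6(addr: str, prefix_len: int = 64) -> ipaddress.IPv6Address:
    """Zero every bit of the address beyond the cutpoint."""
    net = ipaddress.IPv6Network(f"{addr}/{prefix_len}", strict=False)
    return net.network_address


def dnsbl_qname(addr: str, zone: str, prefix_len: int = 64) -> str:
    """Build a reversed-nibble query name covering only the prefix.

    prefix_len is assumed to be a multiple of 4 (one hex nibble per label).
    """
    nibbles = truncate_v6(addr, prefix_len).exploded.replace(":", "")
    kept = nibbles[: prefix_len // 4]
    return ".".join(reversed(kept)) + "." + zone
```

Any two addresses inside the same /64 produce the same query name, so changing the low 64 bits does not move a host out from under the listing.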

This doesn't necessitate that the cutpoint is 100% applicable, only that it's the majority.

Mind you, if IPv6 is so plentiful, snow-shoers will likely be able to get far more IP blocks than they can now, and get harder to block.

I'm really beginning to think that it's still quite premature to do much in this space. We have to wait and see how the environment evolves before we can predict what will work. And as you say, it's probably still quite some time off before it matters.

It's nice to see public experiments in IPv6 DNSBLs, but, I really have to wonder about the usefulness of it at the present time. Of the hundreds of DNSBLs that exist, only a dozen or two block that much on their own. Couple that with the dearth of IPv6 MTAs, I can't help thinking that such DNSBLs are essentially useless except as proof-of-concept.

1) Some mechanism for CBL/XBL single-IP DNSBLs to remain useful (eg: hardcoded 
/64 truncation or some mechanism like John's) for Internet query from small 
sites.

I think this bit is really tricky, and has lots of possibilities for 
over-thinking the problem. Blocking by /64 seems both a good idea, and probably 
good enough for small sites.

It strikes me that it may well be advantageous to pick an arbitrary cut-point and encourage even the medium to large sites to continue using it at that level. You'd be building an additional (but small) factor of possible FPs into it, giving the hosters more incentive to clean up the bad IPs faster, because they may well have someone else yelling at them.

Consider the following thought experiment: if the XBL rounded off every entry to a /31 (in IPv4), every listing has an (actually quite small) possibility of blocking an innocent IP that someone will be impacted by. I think providers would have a disproportionately higher incentive to fix the bad IP. But the real FP impact would be extremely low.
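
To put a number on the thought experiment, here is a sketch that counts how many extra addresses a rounded listing would cover. The sample addresses are documentation addresses, not real listings, and the function names are hypothetical.

```python
import ipaddress
from typing import Iterable, Set


def round_to_prefix(ips: Iterable[str], prefix_len: int = 31) -> Set[ipaddress.IPv4Network]:
    """Round each listed IPv4 address down to its covering network."""
    return {ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False) for ip in ips}


def collateral(ips: Iterable[str], prefix_len: int = 31) -> int:
    """Count addresses covered by the rounded listings but not listed themselves."""
    ips = list(ips)
    listed = {ipaddress.ip_address(ip) for ip in ips}
    covered = sum(net.num_addresses for net in round_to_prefix(ips, prefix_len))
    return covered - len(listed)
```

At /31 each listing covers exactly two addresses, so a listing can catch at most one neighbour, and none at all when both halves of the pair are already listed.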

2) Zone download (Rsync or perhaps something better) becoming more prevalent.

Yup. I think zone download is going to become more prevalent (to the extent 
that I think even rsync is going to get too painful to offer in some cases, and 
something that looks more like IXFR is worth another look).

I thought IXFR was abandoned because it was too expensive. Or perhaps that support was pretty poor. AXFR is obviously out of the question.

While the tradeoff volumes for query versus zone downloads/incrementals may 
well shift, it will just about never be advantageous for small sites doing a 
few dozen emails per day to take a whole zone of something as big as the XBL.

No, it wouldn't. (Then again, there's no such thing as a site doing a few dozen 
emails a day, I don't believe.

Sure there are servers that small.  Mine - it does even less ;-)

The spam alone is going to be a couple of orders of magnitude more than that :) 
).

Spammers haven't found it yet.  Open relay probers have tho.

  Besides, in many cases, that introduces latency delays.

It only introduces latency delays because rsync is not a good protocol for it.

Rsync's batch-ish nature isn't quite why I mentioned latency. The latency is zone rebuild time at the DNSBL server. With a zone file as large as the XBL, say, even if you did some sort of incremental update directly off a core DBMS in realish-time, I think you'd run into performance problems.

I've been doing experiments aimed towards UDP-based DNSBL adds, and even that starts to get nasty at the levels I think major DNSBLs would see.

I know of systems that can get "bad IPs" from detectors thru to moderately large numbers of clients within a handful of seconds. But I don't know how big the system is (neither the number of clients nor the magnitude of bad IPs), nor am I confident of its stability in a heterogeneous client environment with multiple code authors. These are commercial for-pay environments, which obviously have much smaller audiences than the good "free" ones.
_______________________________________________
Asrg mailing list
Asrg(_at_)irtf(_dot_)org
http://www.irtf.org/mailman/listinfo/asrg
