Re: [Asrg] A slightly revised entirely incompatible approach for IPv6 DNSBLs

2010-12-17 23:24:21
I did some copy editing to make it easier to follow.

R's,
John

--- snip ---

Here's a rough draft of a new, improved idea I have for doing DNSBLs and
DNSWLs for IPv6.  (It should work for v4, too, if anyone cares.)

This is basically an implementation of B-trees using the DNS as the
storage device.

Assumptions:

* Handle arbitrary mixtures of CIDRs and individual IPs.

* Must work well with existing DNS servers and caches.  The number of
  distinct queries should be limited, both to limit cache-to-server
  traffic and to limit the number of cache entries used.  I assume that
  client-to-cache queries are cheap, while cache-to-server queries are
  much more expensive.

* Doesn't assume senders will be well behaved.  In particular, bad guys
  may use a new IP for every message, hopping around within each /64 (or
  whatever) in which a zombie lives, and there will be many zombies active
  at the same time.

* Doesn't assume MTAs remember query history from one lookup to the
  next, so the necessary info should be available in the DNS cache.
  (If they do, that doesn't hurt.)

* Should be possible to make it work with DNSSEC, but doesn't depend
  on DNSSEC.

* No attempt to be compatible with existing DNSBLs.  I expect it will
  mostly be served from something like rbldnsd, although it would also
  be possible to have an external program create the zone files and
  serve them from something like BIND.

Each DNS record in this scheme is a blob of bytes containing multiple
self-describing logical entries, each of which is a CIDR range, which
might be a single IP if the CIDR is a /128.  In each blob, the CIDRs
are in address order from lowest to highest.  The name of each record
is the address of the entry that implicitly points to it, i.e., the
entry just below it in the parent blob.  In this context, there is no
advantage to nibble-reversed naming, so the name is just 32 ASCII
characters: the address as a hex number.  One blob is the tree root,
which is named 00000000000000000000000000000000.
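
For concreteness, here is a minimal sketch (Python; the function name is
mine, not part of the proposal) of deriving a blob name from an address
under this naming scheme:

    import ipaddress

    def blob_name(address):
        # 32 hex characters, the address as a hex number, no nibble reversal
        return format(int(ipaddress.IPv6Address(address)), "032x")

    blob_name("::")                      # the root: 32 zeros
    blob_name("2001:1234:5678:9123::")   # 20011234567891230000000000000000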

The obvious way to represent a blob is as a TXT record, with all the
strings concatenated.  The more entries you can fit in a blob, the
better this will work, so although it would be possible for each
logical entry to be a separate string in the TXT record, I don't
propose to do that.

Each blob contains a set of CIDRs that are in the DNSxL.  For blobs
that are not leaves (there's a per-blob flag, described below), the
entries in the blob also partition the address space: the CIDRs that
fall between two adjacent entries in the current blob live in a
sub-blob, and each sub-blob is named by the lower of the two entries.

To minimize edge cases, the root blob always contains the lowest and
highest entries.

So to do the search:

1. Make the root blob the current blob and fetch it.

2. Look inside the current blob for a CIDR entry that contains your
   target IP address.  If you find one, stop; the BL contains the IP.

3. If your IP is lower than the first entry in the current blob, or
   higher than the last entry, stop; the BL does not contain the IP.

4. If this is a leaf blob, stop; the BL does not contain the IP.

5. Find the entry in the current blob that is just below your IP.  Use
   the address in that entry as the name of a new blob, fetch that
   blob, and make it the current one.

6. Go to step 2.

It should be evident that this is analogous to a binary tree search,
except that each node has a lot more than two descendants.
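
Here is a minimal sketch of that loop in Python.  The fetch_blob helper
is mine, not part of the proposal: assume it queries the TXT record for
the given name, concatenates the strings in the answer, and returns the
leaf flag plus the blob's CIDRs decoded into ascending address order.

    import ipaddress

    ROOT = "0" * 32

    def lookup(target, fetch_blob):
        ip = ipaddress.IPv6Address(target)
        name = ROOT                                  # step 1: start at the root
        while True:
            is_leaf, cidrs = fetch_blob(name)        # sorted list of IPv6Network
            for net in cidrs:                        # step 2: containing CIDR?
                if ip in net:
                    return True
            if ip < cidrs[0].network_address or ip > cidrs[-1].network_address:
                return False                         # step 3: outside this blob
            if is_leaf:
                return False                         # step 4: nowhere left to look
            # step 5: descend into the sub-blob named by the entry just below ip
            below = max(n for n in cidrs if n.network_address < ip)
            name = format(int(below.network_address), "032x")
            # step 6: loop around with the new current blob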

Details of blob representation:

The first byte of each blob is a flag byte:

   L P P P P P P P

The L bit is set if this is a leaf.  The PPPPPPP is the prefix size.
If all of the CIDR addresses in the blob have the same initial bits as
the name of the blob, which is fairly likely down toward the leaves,
this is the number of common initial bits.  The prefix bits are not
stored in the blob's entries, but are logically prefixed to each CIDR
address in the blob.

After that is some number of CIDRs:

   X S S S S S S S
   Address

X is reserved for now, perhaps allowing multiple kinds of entries.

SSSSSSS is 0-127, the CIDR size minus one.

The Address is C-P bits long, rounded up to a whole number of bytes,
where C is the CIDR size (SSSSSSS plus one) and P is the prefix size.
That is, it omits the prefix bits, which are obtained from the name of
the blob, and the suffix bits beyond the CIDR size.

For example, say an entry is 2001:1234:5678:9123::/64, and the prefix
size is 16.  Then the entry would be (in hex)

3f 12 34 56 78 91 23

The 3f is the CIDR size minus one (64 - 1 = 63), the 2001 is omitted
since that's the prefix, and the rest is the address.
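
A minimal sketch of that entry encoding in Python (the function name is
mine; where the stored bits don't fill a whole byte, this right-aligns
them in the last byte, a detail the draft doesn't pin down):

    import ipaddress

    def encode_entry(cidr, prefix_bits):
        net = ipaddress.IPv6Network(cidr)
        size = net.prefixlen                    # the CIDR size, 1-128
        head = (size - 1) & 0x7f                # X bit 0, SSSSSSS = size - 1
        addr_bits = size - prefix_bits          # bits actually stored
        nbytes = (addr_bits + 7) // 8           # rounded up to a byte
        addr = int(net.network_address)
        # drop the prefix bits (shared with the blob name) and the suffix bits
        body = (addr >> (128 - prefix_bits - addr_bits)) & ((1 << addr_bits) - 1)
        return bytes([head]) + body.to_bytes(nbytes, "big")

    encode_entry("2001:1234:5678:9123::/64", 16).hex()   # '3f123456789123'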

Each blob is as big as will fit in a DNS answer.  If you don't believe
in EDNS0, the limit is about 450 bytes.  If you do believe in EDNS0,
it's whatever size you think clients ask for, probably in the 1K to 2K
range.

Estimated performance:

I'm guessing that with EDNS0 you can probably get about 100 entries
per blob.  That means a two-level tree can hold 10,000 entries, a
three-level tree a million entries, and a four-level tree 100 million
CIDRs, which would need a million blobs.  Large v4 DNSBLs like the CBL
have about seven million entries now, so this should be adequate.

The number of queries for any particular lookup is the number of
levels, which is unlikely to be more than four.  The cache behavior
obviously depends on both the layout of the entries and the query
pattern, but this avoids some obvious worst cases.  If a /64 is either
entirely listed, not listed at all, or just has a single /128, all
queries for addresses in that /64 will refetch the same four records.

Topics left to nail down:

* Efficiently turning a list of CIDRs into blobs.  I think I have a
  pretty good handle on this: estimate the depth of the tree (rounding
  up), then fill the tree walking down and up from the root.  A rough
  sketch of one way to do it appears after this list.

* Update management.  I think that you set the TTL to something
  reasonable, 15 minutes or an hour or whatever, and whenever you
  change the data, you leave the old entries around for one or two
  TTLs, perhaps changing their TTLs to something smaller to discourage
  caches from keeping stale data.

* Compatibility stuff, e.g., fake A and TXT record values to pass back
  to clients that expect them, probably stored in _arecord and
  _txtrecord, perhaps with a $ in the _txtrecord that expands to the
  current IP.

* Compression schemes.  There might be better ways to do prefix
  compression, e.g., a per-entry field that says how many bits are the
  same as the previous entry.  Within the blobs, entries could be bit-
  rather than byte-aligned, although I expect that would be a lot more
  work for minimal extra compression.  There may be clever tricks to
  allocate records to blobs to maximize the size of the prefix.  If an
  entire DNSWL or a blob consists entirely of /128's, it might be worth
  special-casing that.
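
For the first item above, here is one rough way the construction could
go (Python; my own sketch under my own assumptions, not a worked-out
method from the draft): pick up to a fixed number of evenly spaced
entries for the current blob, then push the CIDRs in each gap down into
the sub-blob named by the entry just below them.

    import ipaddress

    def build(cidrs, fanout=100):
        # returns {blob_name: (is_leaf, entries)}; assumes distinct base addresses
        nets = sorted((ipaddress.IPv6Network(c) for c in cidrs),
                      key=lambda n: int(n.network_address))
        blobs = {}

        def fill(name, chunk):
            if len(chunk) <= fanout:            # everything fits: a leaf blob
                blobs[name] = (True, chunk)
                return
            step = (len(chunk) + fanout - 1) // fanout
            idx = list(range(0, len(chunk), step))
            if idx[-1] != len(chunk) - 1:
                idx.append(len(chunk) - 1)      # keep the highest entry here too
            blobs[name] = (False, [chunk[i] for i in idx])
            for a, b in zip(idx, idx[1:] + [len(chunk)]):
                gap = chunk[a + 1:b]            # entries strictly between two picks
                if gap:
                    fill(format(int(chunk[a].network_address), "032x"), gap)

        fill("0" * 32, nets)
        return blobs

This uses a fixed per-blob entry count instead of the real byte limit
and ignores the prefix compression entirely; a real builder would need
to account for both, and for the down-and-up filling described above.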