On Mon, Mar 04, 2013 at 03:46:14PM +0000, Martijn Grooten wrote:
> Is the reason different sources use different ways to express
> information the fact that there is no suitable protocol? Or is it a
> mere consequence of the fact that sources have different things they
> are willing and able to share?
That's a pair of great questions, and I can see reasons to answer "yes"
to both.
On the one hand: there's no standardized way to do this (beyond DNSBLs
and RHSBLs, which we've piggybacked on DNS). On the other hand, you're
right, different people are making different statements about different
entities -- IP addresses, domains, web pages, email addresses, etc. --
so *if* there existed some standardized way to express this, it would
have to let them say the same things they're saying now...because otherwise
they'd probably have no reason to use it.
So I dunno.
> Perhaps you can come up with examples of where such a protocol would be
> useful?
Sure. Let me show these using some pseudocode, just to illustrate the
concept. Let's presume that example.org is asking questions of example.com.
Question:
query-proto-version = 1.0
query-to = blah.example.com
query-time = Mon Mar 4 16:06:37 UTC 2013
object type = ipv4
object value = 192.168.0.3
object query = spam source?
Answer:
answer-proto-version = 1.1
answer-from = blah.example.com
answer-time = Mon Mar 4 16:06:38 UTC 2013
answer-valid-time = Fri Mar 1 13:05:00 UTC 2013
answer-expiration-time = Fri Mar 8 13:05:00 UTC 2013
answer = yes
This is the equivalent of a DNSBL check -- except that the answer
also contains two more items. It includes an "answer-valid-time",
which could be "the time that we started giving out this answer",
and "answer-expiration-time", which could be the time that this
answer is scheduled to expire. Thus the former could mean "we listed
this IP address at 1:05 PM last Friday, because that's when our sensors
told us to" and the latter could mean "unless we see a reason to
extend the listing, we're going to drop it at 1:05 PM this Friday".
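To make the two timestamps concrete, here is a minimal Python sketch of
how a client might decide whether an answer is still usable. The Answer
class and its field names are hypothetical, chosen only to mirror the
pseudocode above; treating the expiration time as exclusive is also an
assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Answer:
    """Hypothetical in-memory form of the pseudocode answer above."""
    answer_from: str
    valid_time: datetime       # when the source started giving this answer
    expiration_time: datetime  # when the source plans to stop giving it
    answer: str                # "yes" or "no"

    def is_current(self, now: datetime) -> bool:
        # Usable from valid_time up to, but not including, expiration_time.
        return self.valid_time <= now < self.expiration_time


listing = Answer(
    answer_from="blah.example.com",
    valid_time=datetime(2013, 3, 1, 13, 5, tzinfo=timezone.utc),
    expiration_time=datetime(2013, 3, 8, 13, 5, tzinfo=timezone.utc),
    answer="yes",
)

# The query in the example was made between the two timestamps.
query_time = datetime(2013, 3, 4, 16, 6, 37, tzinfo=timezone.utc)
print(listing.is_current(query_time))
```

A client could cache the answer until expiration_time and skip repeat
queries in the meantime, which a plain DNSBL lookup can only approximate
with the DNS TTL.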
Question:
query-proto-version = 1.0
query-to = blah.example.com
query-time = Mon Mar 4 16:06:37 UTC 2013
object type = URL
object value = http://example.net/some/page.html
object query = infected with malware?
Answer:
answer-proto-version = 1.1
answer-from = blah.example.com
answer-time = Mon Mar 4 16:06:38 UTC 2013
answer-valid-time = Fri Mar 1 14:05:00 UTC 2013
answer-expiration-time = Fri Mar 8 14:05:00 UTC 2013
answer = no
This is a very similar Q/A: in this case the answer is negative,
but it also has an expiration time. (Let's presume that example.com
crawls sites at weekly intervals, so there is no reason for this
answer to change until the next crawl is done. The requestor might
be okay with this answer, or it might want a more recent one -- in
which case it will need to ask someone else.)
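The requestor's side of that decision could look something like the
following Python sketch. The function name and the max_age parameter are
hypothetical; the only inputs taken from the example are the two
timestamps.

```python
from datetime import datetime, timedelta, timezone


def fresh_enough(valid_time: datetime, now: datetime, max_age: timedelta) -> bool:
    """Return True if the answer was established no longer than max_age ago.

    Hypothetical requester-side policy; max_age is the requester's choice.
    """
    return now - valid_time <= max_age


valid_time = datetime(2013, 3, 1, 14, 5, tzinfo=timezone.utc)
now = datetime(2013, 3, 4, 16, 6, 38, tzinfo=timezone.utc)

# A requester happy with week-old crawl data accepts the answer...
print(fresh_enough(valid_time, now, timedelta(days=7)))
# ...while one that wants data from the last 24 hours asks someone else.
print(fresh_enough(valid_time, now, timedelta(days=1)))
```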
Question:
query-proto-version = 1.0
query-to = blah.example.com
query-time = Mon Mar 4 16:06:37 UTC 2013
object type = ASN
object value = 123456789
object query = hijacked?
Answer:
answer-proto-version = 1.1
answer-from = blah.example.com
answer-time = Mon Mar 4 16:06:38 UTC 2013
answer-valid-time = Fri Mar 1 14:10:00 UTC 2013
answer-expiration-time = Fri Apr 5 14:10:00 UTC 2013
answer = yes
answer-additional = http://example.com/hijacks/123456789
Also very similar. I posited a much longer expiration time because
this is probably not going to be a quickly-remediated problem. I've
also shown an addition to the answer, which in this case is just a URL
where something consumable by humans might be found.
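For what it's worth, a text format like this is easy to consume. Below
is a rough Python sketch of a parser, assuming every field is written as
"key = value", that timestamps always look like the ones above, and that
they are always UTC. None of that is specified anywhere; the function
names are made up for illustration.

```python
from datetime import datetime, timezone

# Field names whose values are timestamps in the style used above.
TIME_FIELDS = {"answer-time", "answer-valid-time", "answer-expiration-time"}


def parse_time(text: str) -> datetime:
    # Matches e.g. "Fri Apr 5 14:10:00 UTC 2013"; assumes English day and
    # month names and that every timestamp is in UTC.
    parsed = datetime.strptime(text, "%a %b %d %H:%M:%S UTC %Y")
    return parsed.replace(tzinfo=timezone.utc)


def parse_message(text: str) -> dict:
    """Parse 'key = value' lines into a dict (hypothetical wire format)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition(" = ")
        if not sep:
            raise ValueError(f"malformed line: {line!r}")
        fields[key] = parse_time(value) if key in TIME_FIELDS else value
    return fields


reply = parse_message("""
answer-proto-version = 1.1
answer-from = blah.example.com
answer-time = Mon Mar 4 16:06:38 UTC 2013
answer-valid-time = Fri Mar 1 14:10:00 UTC 2013
answer-expiration-time = Fri Apr 5 14:10:00 UTC 2013
answer = yes
answer-additional = http://example.com/hijacks/123456789
""")
print(reply["answer"], reply["answer-additional"])
# Length of the listing window for the hijacked-ASN example.
print(reply["answer-expiration-time"] - reply["answer-valid-time"])
```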
To expand on those just a little bit: "object type" could probably
encompass things like:
IPv4/IPv6 addresses
networks (by handle?) (by CIDR?)
ASNs
domains, subdomains, hosts
URLs
email addresses
"object query" could include the examples above and, obviously, much
more, but should exclude those things that we already have ways to find
out: e.g., this should not be a way to query for a DNS A record, because
that's just kinda silly.
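Whatever the final list of object types turns out to be, a server would
presumably want to sanity-check the value against the declared type
before answering. A minimal Python sketch, covering only a few of the
types above; the type names ("ipv4", "ipv6", "cidr", "asn") and the
function itself are hypothetical:

```python
import ipaddress


def validate_object(object_type: str, value: str) -> bool:
    """Return True if value is plausibly well-formed for object_type."""
    try:
        if object_type == "ipv4":
            ipaddress.IPv4Address(value)
        elif object_type == "ipv6":
            ipaddress.IPv6Address(value)
        elif object_type == "cidr":
            # strict=True rejects networks with host bits set.
            ipaddress.ip_network(value, strict=True)
        elif object_type == "asn":
            # AS numbers are 32-bit unsigned integers.
            return (value.isascii() and value.isdigit()
                    and int(value) <= 0xFFFFFFFF)
        else:
            return False  # unknown type: reject rather than guess
    except ValueError:
        return False
    return True


print(validate_object("ipv4", "192.168.0.3"))
print(validate_object("asn", "123456789"))
print(validate_object("cidr", "192.168.0.3/24"))
```

Domains, URLs, and email addresses are left out on purpose: validating
those properly is a bigger job than a sketch should pretend to do.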
There are at least two ways to go with this: one would be to
make it concise and use UDP. Another would be to make it verbose
and use TCP. (Insert long discussion here about performance tradeoffs.)
I'm not sure that this is worth getting into unless the high-level
idea flies: if we don't actually need a standard format and a standard
protocol that uses it, then those tradeoffs don't matter.
---rsk