Re: [Asrg] Development of an object assessment format/protocol

On Mon, Mar 04, 2013 at 03:46:14PM +0000, Martijn Grooten wrote:

> Is the reason different sources use different ways to express
> information the fact that there is no suitable protocol? Or is it a
> mere consequence of the fact that sources have different things they
> are willing and able to share?


That's a pair of great questions, and I can see reasons to answer "yes"
to both.

On the one hand: there's no standardized way to do this (beyond DNSBLs
and RHSBLs, which we've piggybacked on DNS).  On the other hand, you're
right, different people are making different statements about different
entities -- IP addresses, domains, web pages, email addresses, etc. --
so *if* there existed some standardized way to express this, it would
have to let them say the same things they're saying now...because otherwise
they'd probably have no reason to use it.

So I dunno.

> Perhaps you can come up with examples of where such a protocol would be
> useful?


Sure.  Let me show these using some pseudocode, just to illustrate the
concept.  Let's presume that example.org is asking questions of example.com.

        Question:
                query-proto-version = 1.0
                query-to = blah.example.com
                query-time = Mon Mar  4 16:06:37 UTC 2013
                object type = ipv4
                object value = 192.168.0.3
                object query = spam source?
        Answer:
                answer-proto-version = 1.1
                answer-from = blah.example.com
                answer-time = Mon Mar  4 16:06:38 UTC 2013
                answer-valid-time = Fri Mar  1 13:05:00 UTC 2013
                answer-expiration-time = Fri Mar  8 13:05:00 UTC 2013
                answer = yes

This is the equivalent of a DNSBL check -- except that the answer
also contains two more items.   It includes an "answer-valid-time",
which could be "the time that we started giving out this answer",
and "answer-expiration-time", which could be the time that this
answer is scheduled to expire.  Thus the former could mean "we listed
this IP address at 1:05 PM last Friday, because that's when our sensors
told us to" and the latter could mean "unless we see a reason to
extend the listing, we're going to drop it at 1:05 PM this Friday".
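
To make the two timestamps concrete, here is a minimal Python sketch of
the first answer. It is only an illustration: the class, the field names
and the half-open validity window are assumptions, not part of any
agreed format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Timestamp layout used in the examples, e.g. "Mon Mar  4 16:06:37 UTC 2013".
# %a and %b assume an English/C locale.
TIME_FORMAT = "%a %b %d %H:%M:%S UTC %Y"


def parse_time(text: str) -> datetime:
    """Parse an example-style timestamp into a timezone-aware UTC datetime."""
    return datetime.strptime(text, TIME_FORMAT).replace(tzinfo=timezone.utc)


@dataclass(frozen=True)
class Answer:
    # Field names are hypothetical; they mirror the pseudocode keys.
    answer_from: str
    answer_time: datetime
    valid_time: datetime
    expiration_time: datetime
    answer: str

    def is_current(self, now: datetime) -> bool:
        """True while 'now' falls in [valid_time, expiration_time)."""
        return self.valid_time <= now < self.expiration_time


listing = Answer(
    answer_from="blah.example.com",
    answer_time=parse_time("Mon Mar  4 16:06:38 UTC 2013"),
    valid_time=parse_time("Fri Mar  1 13:05:00 UTC 2013"),
    expiration_time=parse_time("Fri Mar  8 13:05:00 UTC 2013"),
    answer="yes",
)
```

A consumer that cached this answer could call listing.is_current(now)
before reusing it, and ask again once the expiration time has passed.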

        Question:
                query-proto-version = 1.0
                query-to = blah.example.com
                query-time = Mon Mar  4 16:06:37 UTC 2013
                object type = URL
                object value = http://example.net/some/page.html
                object query = infected with malware?
        Answer:
                answer-proto-version = 1.1
                answer-from = blah.example.com
                answer-time = Mon Mar  4 16:06:38 UTC 2013
                answer-valid-time = Fri Mar  1 14:05:00 UTC 2013
                answer-expiration-time = Fri Mar  8 14:05:00 UTC 2013
                answer = no

This is a very similar Q/A: in this case the answer is negative,
but it also has an expiration time. (Let's presume that example.com
is crawling sites at weekly intervals, so this answer has no reason
to change until the next crawl is done. The requestor might be okay
with this answer, or it might want a more recent one -- in which
case it will need to ask someone else.)
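
The requestor's side of that decision could be sketched like this in
Python. The function name and the max_age policy are hypothetical; the
point is only that "recent enough" is a local choice made by whoever
asks.

```python
from datetime import datetime, timedelta, timezone


def fresh_enough(valid_time: datetime, now: datetime, max_age: timedelta) -> bool:
    """True if the answer was established no longer than max_age ago."""
    return now - valid_time <= max_age


# Values taken from the URL example above.
valid_time = datetime(2013, 3, 1, 14, 5, 0, tzinfo=timezone.utc)
now = datetime(2013, 3, 4, 16, 6, 38, tzinfo=timezone.utc)

# A requestor happy with weekly crawls accepts the answer...
accept_weekly = fresh_enough(valid_time, now, timedelta(days=7))
# ...while one that wants data from the last 24 hours must ask elsewhere.
accept_daily = fresh_enough(valid_time, now, timedelta(days=1))
```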

        Question:
                query-proto-version = 1.0
                query-to = blah.example.com
                query-time = Mon Mar  4 16:06:37 UTC 2013
                object type = ASN
                object value = 123456789
                object query = hijacked?
        Answer:
                answer-proto-version = 1.1
                answer-from = blah.example.com
                answer-time = Mon Mar  4 16:06:38 UTC 2013
                answer-valid-time = Fri Mar  1 14:10:00 UTC 2013
                answer-expiration-time = Fri Apr  5 14:10:00 UTC 2013
                answer = yes
                answer-additional = http://example.com/hijacks/123456789

Also very similar.  I posited a much longer expiration time because
this is probably not going to be a quickly-remediated problem.  I've
also shown an addition to the answer, which in this case is just a URL
where something consumable by humans might be found.

To expand on those just a little bit: "object type" could probably
encompass things like:

        IPv4/IPv6 addresses
        networks (by handle?) (by CIDR?)
        ASNs
        domains, subdomains, hosts
        URLs
        email addresses

"object query" could include the examples above and, obviously, much
more, but should exclude those things that we already have ways to find
out, e.g., this should not be a way to query for a DNS A record, because
that's just kinda silly.
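
As a rough sketch of what checking an "object value" against its
"object type" might look like, here is some Python using only the
standard library. The type names are assumptions loosely based on the
list above, and only a few types are covered; domains and email
addresses are left out because validating them properly is a topic of
its own.

```python
import ipaddress
from urllib.parse import urlsplit

MAX_ASN = 4294967295  # largest 32-bit AS number


def validate_object(object_type: str, value: str) -> bool:
    """Return True if value is well-formed for the given (hypothetical) type."""
    try:
        if object_type == "ipv4":
            ipaddress.IPv4Address(value)
            return True
        if object_type == "ipv6":
            ipaddress.IPv6Address(value)
            return True
        if object_type == "network":  # CIDR notation, host bits must be zero
            ipaddress.ip_network(value, strict=True)
            return True
    except ValueError:
        return False
    if object_type == "ASN":
        return value.isascii() and value.isdigit() and int(value) <= MAX_ASN
    if object_type == "URL":
        parts = urlsplit(value)
        return parts.scheme in ("http", "https") and bool(parts.netloc)
    return False  # unknown or unsupported type
```

A server could run a check like this before answering, and reject
malformed questions instead of guessing what was meant.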

There are at least two ways to go with this: one would be to
make it concise and use UDP.  Another would be to make it verbose
and use TCP. (Insert long discussion here about performance tradeoffs.)
I'm not sure that this is worth getting into unless the high-level
idea flies: if we don't actually need a standard format and a standard
protocol that uses it, then those tradeoffs don't matter.
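
Purely to give a feel for the size difference between the two styles,
here is a Python sketch that encodes the first question both ways. The
binary layout and the numeric codes are invented for this illustration;
nothing in this thread defines a wire format.

```python
import ipaddress
import struct
from datetime import datetime, timezone

# Hypothetical numeric codes; no registry of these exists.
OBJECT_TYPES = {"ipv4": 1}
QUERIES = {"spam source?": 1}

# Network byte order, no padding:
# version major, version minor, object type, query, query time, IPv4 address
COMPACT = struct.Struct("!BBBBI4s")


def pack_question(version, object_type, query, when, address):
    """Encode an IPv4 question into the fixed-size binary layout above."""
    major, minor = version
    return COMPACT.pack(
        major,
        minor,
        OBJECT_TYPES[object_type],
        QUERIES[query],
        int(when.timestamp()),
        ipaddress.IPv4Address(address).packed,
    )


when = datetime(2013, 3, 4, 16, 6, 37, tzinfo=timezone.utc)
compact = pack_question((1, 0), "ipv4", "spam source?", when, "192.168.0.3")

# The same question in the verbose key/value style of the examples.
verbose = "\n".join([
    "query-proto-version = 1.0",
    "query-to = blah.example.com",
    "query-time = Mon Mar  4 16:06:37 UTC 2013",
    "object type = ipv4",
    "object value = 192.168.0.3",
    "object query = spam source?",
]).encode("ascii")
```

The compact form is 12 bytes and fits trivially in a single UDP
datagram; the verbose form is much larger but readable and easy to
extend. That is the tradeoff in miniature, not a recommendation either
way.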

---rsk
