Re: [Asrg] IPv6 Reputation / DNSBL

Hello Jan,

From: asrg-bounces(_at_)irtf(_dot_)org 
[mailto:asrg-bounces(_at_)irtf(_dot_)org] On Behalf Of
Jan Griebsch
Sent: Wednesday, January 12, 2011 12:55 PM

Hi,

having read through the discussion threads here and on
spamassassin-users@apache regarding John Levine's DNSBL implementation
proposal, I have a related question. I am posting separately, since my
question is more general and I don't want to pollute those threads.

Motivation for IP-based reputation today is broader than SMTP traffic.
It is used as a resource-efficient first estimate wherever service
providers decide whether to let users use their resources (create
accounts etc.) or whether to take responsibility for users' content.

Even in the mail context, it seems short-sighted to me to consider only
SMTP traffic. For example, a lot of the private-user-to-private-user mail
traffic is generated by web-/freemailers. If their account creation
process is overrun (because there is no functioning IPv6 reputation
solution and CAPTCHA is broken anyway), outbound spam through their IPs
will increase accordingly, and the DNSWL approach discussed in the
aforementioned threads will tend to become meaningless for
private-to-private email.

This is a problem that depends on how the web-/freemailers set up their
systems. As long as the traffic stays within that company's mail servers,
I think they could use some in-house tool to filter out probably abusive
users. As soon as they send it to others, their reputation is on the line
and they should act to defend it accordingly (by preventing abusive users
from signing up, blocking email, filtering outbound mail, etc.).


If you accept the above premise, a more general approach
to the problem could be subdivided into 3 parts:
1) Which entities (/128s, /64s, variable prefixes...) do we assign
reputation to?

If you ask me, we should do it for multiple ranges. Let me explain:
- Up to 2 spam messages per [time] per /128 could be "acceptable"; with
more, block that /128. For a good reputation, a /128 should send at least
XX messages per [time] without X being marked as spam.
- Up to 10 spam messages per [time] per /64 could be "acceptable"; with
more, block that /64. For a good reputation, a /64 should send at least XX
messages per [time] without X being marked as spam.
- Up to 50 spam messages per [time] per /32 could be "acceptable"; with
more, block that /32. For a good reputation, a /32 should send at least XXX
messages per [time] without XX being marked as spam.
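The blocking half of the thresholds above could be sketched roughly as
follows. This is only an illustration: the names SPAM_LIMITS, SpamCounter,
record_spam and blocked_prefixes are made up for the example, the limits
are the example numbers from the list, the [time] window is not modelled,
and the "good reputation" side (XX messages without X spam) is left out
because those values are still placeholders.

```python
import ipaddress
from collections import Counter

# Assumed limits from the list above: prefix length -> spam messages
# tolerated per time window before that prefix is blocked.
SPAM_LIMITS = {128: 2, 64: 10, 32: 50}


def covering_prefixes(address):
    """Return the /128, /64 and /32 networks that contain the address."""
    ip = ipaddress.IPv6Address(address)
    return [
        ipaddress.IPv6Network(f"{ip}/{plen}", strict=False)
        for plen in SPAM_LIMITS
    ]


class SpamCounter:
    """Counts spam per covering prefix (one time window, never expired)."""

    def __init__(self):
        self.counts = Counter()

    def record_spam(self, address):
        # One spam message counts against every covering prefix.
        for net in covering_prefixes(address):
            self.counts[net] += 1

    def blocked_prefixes(self, address):
        """Return the covering prefixes whose limit has been exceeded."""
        return [
            net
            for net in covering_prefixes(address)
            if self.counts[net] > SPAM_LIMITS[net.prefixlen]
        ]


counter = SpamCounter()
for _ in range(3):
    counter.record_spam("2001:db8:0:1::1")
print(counter.blocked_prefixes("2001:db8:0:1::1"))
```

A real implementation would also need to expire counts per time window,
which this sketch does not attempt.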

2) What are suitable backend data structures/implementations
    for efficient lookup/updates/caching for the assumed (number of)
    entities?

PowerDNS with a MySQL backend would probably work for DNS-based lookups
(it does caching, it is efficient, it is easy to update, lookups are easy,
and adding capacity is easy).

3) What are suitable mechanisms/protocols for querying and
    updating/distributing the entity-reputation data?

Querying: DNS or some HTTP/HTTPS form is probably the easiest solution
(with DNS you could cache answers more easily).
Distributing: Depending on the solution, I'd look at AXFR/zone files/MySQL
master-slave replication/CSV files. I think we'll first have to decide what
is good for 1) and 2) and let 3) depend on that answer.
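For the DNS querying option, here is a small sketch of how a query name
could be built. It uses the nibble-reversed form that existing DNSxLs use
for IPv6 addresses. The zone "dnsbl.example" and the function name are
placeholders, and querying a prefix by sending only its leading nibbles is
purely an assumption for this example, not an agreed convention.

```python
import ipaddress


def dnsbl_query_name(address, zone="dnsbl.example", prefixlen=128):
    """Build a nibble-reversed DNS name for an IPv6 address or prefix.

    prefixlen must be a multiple of 4. Using fewer than 32 nibbles to
    ask about a whole prefix is a hypothetical convention.
    """
    if prefixlen % 4 != 0 or not 0 < prefixlen <= 128:
        raise ValueError("prefixlen must be a multiple of 4 in 4..128")
    ip = ipaddress.IPv6Address(address)
    # exploded gives all 32 hex digits; drop the colons to get nibbles.
    nibbles = ip.exploded.replace(":", "")[: prefixlen // 4]
    return ".".join(reversed(nibbles)) + "." + zone


# Full /128 lookup name and a /64-level name for the same host.
print(dnsbl_query_name("2001:db8::1"))
print(dnsbl_query_name("2001:db8::1", prefixlen=64))
```

The resulting name would then be resolved with an ordinary A or TXT
query, so answers are cached by resolvers like any other DNS record.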


In my mind, the central, totally unresolved part is 1).
If we can come up with a reasonable (see below) solution for 1), then 2)
should be solvable as well. 3), as discussed in Levine's threads, I see in
this context as a negotiation/convention issue (with tricky technical
problems). Not to diminish it, but I want to make my point clear: if 1)
is not solved, 2) and 3) are moot, at least for blacklisting.

on 1): criteria for reasonable reputation entities
* the number of entities must be limited to something much smaller than
  2^64
   <- for computational cost
   <- for meaning; in the end, "reputation" means judging the history of
      behavior of a real-world communication/interaction partner
* corollary: creating entities must incur some type of cost
   <- e.g. simply rotating your host's /64 suffix is free
   <- e.g. money for registering a domain
   <- e.g. effort for an attacker to take over a host

That is why I'd look at reputation based on multiple ranges to find the
best action to take. If an ISP has paying clients who are hurting others
with their bad reputation, it is more likely to say goodbye to the clients
that are hurting its business. Or do you have a better idea for this?



on 1): a more concrete question

Can we manage to deduce (for traffic seen)

* the internal partitioning of ISPs' (e.g.) /32s into subnets

Difficult if the ISP isn't cooperating/updating data sources.

* classify each subnet into static vs. dynamic IP address assignment

Depends on the ISP, and that makes it difficult to use, I think.

* for static subnets, the end-customer prefix (length)?

Difficult if the ISP isn't cooperating/updating data sources.


Then, I think, we'd have gone a long way toward solving 1). We would
assign reputation to prefixes that are tied to hosts/end-customers and
could deny dialup/dyn. IP communication as desired (today via e.g. PBL).
2) would then amount to solving the LCP (longest common prefix) problem
for a reasonable (order of 10^8) number of prefixes. A lot of academic
work exists for that, and routers do it, too.

With a good DNS-based solution this shouldn't be a really big problem, I
think.
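The prefix lookup discussed above (in routing it is usually called
longest-prefix match) can be sketched in a few lines. This is a naive
illustration with made-up names (PrefixTable, add, lookup) and made-up
reputation labels. It probes a dictionary once per distinct prefix length,
which is easy to follow but is not what routers do; a table on the order
of 10^8 prefixes would more likely use a trie or similar structure.

```python
import ipaddress


class PrefixTable:
    """Maps IPv6 prefixes to reputation values, longest match wins."""

    def __init__(self):
        self._entries = {}  # IPv6Network -> reputation value

    def add(self, prefix, reputation):
        self._entries[ipaddress.IPv6Network(prefix)] = reputation

    def lookup(self, address):
        """Return (prefix, reputation) for the longest match, or None."""
        ip = ipaddress.IPv6Address(address)
        # Try the longest stored prefix lengths first.
        lengths = sorted({net.prefixlen for net in self._entries}, reverse=True)
        for plen in lengths:
            candidate = ipaddress.IPv6Network(f"{ip}/{plen}", strict=False)
            if candidate in self._entries:
                return candidate, self._entries[candidate]
        return None


table = PrefixTable()
table.add("2001:db8::/32", "isp")
table.add("2001:db8:1::/48", "static-customer")
table.add("2001:db8:1:2::/64", "dynamic-pool")
print(table.lookup("2001:db8:1:2::5"))
```

An address outside every stored prefix simply returns None, which a
caller could treat as "no reputation known".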

Regards, Mark


Any ideas ?

Regards,
--jan
_______________________________________________
Asrg mailing list
Asrg(_at_)irtf(_dot_)org
http://www.irtf.org/mailman/listinfo/asrg

