Hi,
having read through the discussion threads here and on
spamassassin-users@apache regarding John Levine's DNSBL implementation
proposal, I have a related question. I am posting it separately, since my
question is more general and I don't want to pollute those threads.
Motivation for IP-based reputation today is broader than SMTP traffic.
It is used as a resource-efficient first estimate
wherever service providers decide on letting users use their resources
(create accounts etc.) or on taking responsibility for users' content.
Even in the mail context, it seems short-sighted to me to consider only
SMTP traffic. For example, a lot of the private-user-to-private-user mail
traffic is generated by web-/freemailers. If their account creation process
is overrun (because there is no functioning IPv6 reputation solution, and
CAPTCHA is broken anyway), outbound spam through their IPs will increase
accordingly, and the DNSWL approach discussed in the aforementioned threads
will tend to become meaningless for private-to-private email.
If you accept the above premise, a more general approach
to the problem could be subdivided into 3 parts:
1) Which entities (/128s, /64s, variable prefixes, ...) do we assign
reputation to ?
2) What are suitable backend data structures/implementations
for efficient lookup/updates/caching for the assumed (number of)
entities ?
3) What are suitable mechanisms/protocols for querying and
updating/distributing the entity-reputation data ?
In my mind, the central, totally unresolved part is 1).
If we can come up with a reasonable (see below) solution for 1), then 2)
should be solvable as well. In this context, I see 3), as discussed in
Levine's threads, as a negotiation/convention issue (with tricky technical
problems). This is not to diminish it, but I want to make my point clear:
if 1) is not solved, 2) and 3) are moot, at least for blacklisting.
on 1): criteria for reasonable reputation entities
* the number of entities must be limited to something much smaller than
2^64
<- for computational cost
<- for meaning; in the end, "reputation" means judging the history of
behavior of a real-world communication/interaction partner
* corollary: creating entities must incur some type of cost
<- counter-example: simply rotating your host's 64-bit address suffix is free
<- e.g. money for registering a domain
<- e.g. effort for an attacker to take over a host
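To illustrate the first criterion, here is a minimal sketch of why per-/128 bookkeeping buys nothing once a host rotates its suffix. It assumes Python's standard ipaddress module; the /64 aggregation length, the helper name reputation_key and the addresses (taken from the 2001:db8::/32 documentation range) are hypothetical choices for illustration, not a proposal for the right entity size.

```python
import ipaddress
from collections import Counter


def reputation_key(addr: str, prefix_len: int = 64):
    """Collapse an IPv6 address to its enclosing prefix.

    prefix_len=64 is an assumption for this sketch, not a recommendation.
    """
    # strict=False lets us pass a host address and get its network back.
    return ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)


# Made-up traffic: two addresses share a /64, the third is in another /64.
seen = [
    "2001:db8:1:2::1",
    "2001:db8:1:2:aaaa:bbbb:cccc:dddd",
    "2001:db8:1:3::1",
]

# Counting per /128 yields one entity per address seen ...
events_per_128 = Counter(ipaddress.ip_address(a) for a in seen)
# ... while counting per /64 merges the rotated suffixes into one entity.
events_per_64 = Counter(reputation_key(a) for a in seen)

print(len(events_per_128), "entities at /128")
print(len(events_per_64), "entities at /64")
```

Whether /64 is the right granularity is exactly the open question in 1); the sketch only shows that the key function is the single place where that decision would live.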
on 1): a more concrete question
Can we manage to deduce (for traffic seen)
* the internal partitioning of ISPs' (e.g.) /32s into subnets
* the classification of each subnet as static vs. dynamic IP address
assignment
* for static subnets, the end-customer prefix (length)
?
Then, I think, we'd have gone a long way towards solving 1). We would assign
reputation to prefixes that are tied to hosts/end-customers, and could deny
dialup/dynamic-IP communication as desired (done today via e.g. the PBL).
2) would then amount to solving the longest-prefix-match (LPM) problem for a
reasonable (order of 10^8) number of prefixes. A lot of academic work
exists on that, and routers do it, too.
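For concreteness, the prefix lookup in 2) could be sketched roughly as below: a plain binary trie that returns the most specific stored prefix covering an address, which is the longest-prefix match routers perform. This is only an unoptimized illustration under my own assumptions; the class name PrefixTable, the labels and the prefixes (from the 2001:db8::/32 documentation range) are made up, and a real backend for 10^8 prefixes would need a compressed structure.

```python
import ipaddress


class PrefixTable:
    """Binary trie over IPv6 prefix bits; each node is [child0, child1, entry]."""

    def __init__(self):
        self.root = [None, None, None]

    def insert(self, prefix: str, value) -> None:
        net = ipaddress.IPv6Network(prefix)
        bits = int(net.network_address)
        node = self.root
        # Walk/create one trie level per prefix bit, most significant first.
        for i in range(net.prefixlen):
            bit = (bits >> (127 - i)) & 1
            if node[bit] is None:
                node[bit] = [None, None, None]
            node = node[bit]
        node[2] = (net, value)

    def lookup(self, addr: str):
        """Return (prefix, value) of the longest matching prefix, or None."""
        bits = int(ipaddress.IPv6Address(addr))
        node = self.root
        best = node[2]  # covers a stored ::/0, if any
        for i in range(128):
            node = node[(bits >> (127 - i)) & 1]
            if node is None:
                break
            if node[2] is not None:
                best = node[2]  # deeper node means a more specific prefix
        return best


table = PrefixTable()
table.insert("2001:db8::/32", "isp-default")          # hypothetical labels
table.insert("2001:db8:1::/48", "static-customer")
table.insert("2001:db8:ffff::/48", "dynamic-pool")

print(table.lookup("2001:db8:1:2::1"))   # matched by the /48
print(table.lookup("2001:db8:2::1"))     # falls back to the /32
print(table.lookup("2001:dead::1"))      # no covering prefix stored
```

The lookup cost is bounded by the address length (128 steps) regardless of how many prefixes are stored, which is why I consider 2) tractable once 1) tells us which prefixes to store.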
Any ideas ?
Regards,
--jan
_______________________________________________
Asrg mailing list
Asrg(_at_)irtf(_dot_)org
http://www.irtf.org/mailman/listinfo/asrg