
Re: [Asrg] Collecting IP reputation data from many people

2010-10-21 00:06:02
> I'd like your thoughts on collecting reputation data (% spam vs.
> non-spam originating at every IP) from everyone willing to submit it.
> [...]

You know, I really hate to be a parade-rainer.  I also don't like to
tell anyone not to try experiments.  But I see this as a basically
hopeless task in view of the botnet problem: as soon as it becomes
widespread enough to be noticed, the signal in its input data will be
swamped by gaming attempts.
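
To make the swamping concern concrete, here is a minimal sketch in Python.
It assumes the simplest possible aggregation (one vote per submitted
report, no weighting of submitters); the function name, the address, and
the report counts are all made up for illustration and are not taken from
the proposal.

```python
from collections import defaultdict

def spam_fraction(reports):
    """Map each IP to the fraction of its reports that were marked spam.

    reports is an iterable of (ip, is_spam) pairs. Every report counts
    equally, which is the assumption that makes gaming easy.
    """
    counts = defaultdict(lambda: [0, 0])  # ip -> [spam_reports, total_reports]
    for ip, is_spam in reports:
        counts[ip][0] += int(is_spam)
        counts[ip][1] += 1
    return {ip: spam / total for ip, (spam, total) in counts.items()}

# Hypothetical: 90 honest reporters all flag one address as a spam source.
honest = [("192.0.2.1", True)] * 90
# Hypothetical: a botnet submits 910 forged "not spam" reports for it.
forged = [("192.0.2.1", False)] * 910

before = spam_fraction(honest)["192.0.2.1"]
after = spam_fraction(honest + forged)["192.0.2.1"]
```

With these made-up numbers the address goes from looking like a pure spam
source to looking mostly clean, without a single honest report changing.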

You are clearly aware of that basic problem, though, so you may be able
to come up with something.  Indeed, I hope you do; anything capable of
resisting such gaming while still doing any kind of crowdsourcing is
interesting for that alone.

I would warn you against inferring from successful preliminary tests
that it would be equally successful if widespread; there are zillions
of techniques that work fine provided few enough people do them that
spammers don't notice or don't think they're worth bothering to
circumvent.  (I depend on a few myself.)

So, sure, go ahead and try it.  Even if it's a failure in the form you
now envision, something useful may end up coming out of it.  Research
is like that.

> So do you think it's worth my effort?

Dunno.  I don't think I'd put effort into it myself, but that doesn't
mean much in view of how large my to-do list of things I'd enjoy much
more is.  You sound fired up, though, and I suspect something valuable
may well come out of it even if it's not what you intended when you
started.  So I'll go with "probably" here.

> How would you improve this?

I have basically no idea, and wouldn't until the experiment had
progressed significantly.

> Do you think it could be useful enough that you'd be willing to click
> a button to send me data occasionally?

Yes, except that (a) my MUA doesn't involve "clicking" "buttons", (b) I
doubt human-generated data from me would be of much use to you (most
spam aimed at me never makes it past my SMTP listener), and (c)
automated data from me would probably be even worse because I block a
nontrivial amount of ham because it comes from actors whose level of
abuse I am not willing to tolerate even to the point of accepting their
ham (e.g., Google, Yahoo) or it is constructed in sufficiently broken
ways that I consider rejecting it a public service even if it's not
spam (e.g., marked as ISO-8859-1 but containing octets in the 0x80-0x9f
range, or with a Message-Id: that does not conform to the syntax).
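
For reference, the octet test mentioned above could look roughly like the
following Python sketch.  The function name is hypothetical and this is
not the actual filter; it only checks a body that has already been
identified as labelled ISO-8859-1, and it does not attempt the Message-Id:
syntax check at all.

```python
def has_c1_octets(body: bytes) -> bool:
    """Return True if any octet of body lies in the range 0x80-0x9f.

    In ISO-8859-1 those code points are C1 control characters, not
    printable text, so their presence suggests the charset label is wrong.
    """
    return any(0x80 <= octet <= 0x9F for octet in body)

# 0xe9 is a printable ISO-8859-1 character (e with acute accent).
clean = has_c1_octets(b"caf\xe9")
# 0x93 and 0x94 fall inside the 0x80-0x9f range.
broken = has_c1_octets(b"\x93quoted\x94")
```

Here `clean` comes out False and `broken` comes out True.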

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse(_at_)rodents-montreal(_dot_)org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
_______________________________________________
Asrg mailing list
Asrg(_at_)irtf(_dot_)org
http://www.irtf.org/mailman/listinfo/asrg