Here's a brief update from the small but highly motivated A&C
subgroup.
There are multiple experiments underway, at various stages of
completion. A&C members are looking at where spam comes from,
comparing specific characterization/filtering technologies, and in
general trying to substitute empirical data for opinion on almost
anything related to spam.
Two recent experiments have now borne fruit. One's interesting
enough, though not exactly earth-shattering; the other is much more
portentous.
1) We undertook an experiment to address the
seemingly commonplace belief that "different
people get different spam." For purposes of
this experiment, we analyzed two large
samples of spam (several thousand relatively recent
messages, gathered independently), as well as
multiple smaller samples.
Result: Within nominal limits, everyone gets
pretty much the same spam. While there *may*
be such a thing as "targeted" spam (which
might result from, say, subbing to a
particular list or ordering a particular
product), the volume of that spam is dwarfed
by the "shotgun" spam that everyone gets.
2) We undertook an experiment to address the
equally commonplace belief that "spam is
volatile," and that spam and spammer tactics
change rapidly. This experiment was based on
analysis of about 2,500 spams accumulated
over a period of 2.5 years.
Result: "Glacial" maybe, but "volatile"??
The closest we've been able to come to
identifying "volatility" is a kind of
"punctuated equilibrium" model. Spam does
change over time, but *very* slowly.
We hope to present the first set of results
from this experiment at the MIT spam
conference in January. We're also looking at
conducting a follow-on experiment, and, if
the results cooperate, envision a submission
to CACM early next year.
3) Though not an experiment _per se_, we're
currently trying to knock down testable
theories about why any two addresses that are
*seemingly* nearly identical in terms of
visibility get gobsmackingly different amounts
of spam. Several obvious possibilities were
refuted very quickly, while others have not yet
been refuted. (On behalf of A&C, I'd like to
thank Scott Nelson for recently taking the time
to help us refute one candidate theory.)
(This latter effort may not sound like much,
but unless we can eliminate that "pre-existing"
variance, or at least learn enough about it to
factor it out, any sort of fruitful/meaningful
real-time volumetric study is probably
impossible.)
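
For anyone who wants to try the comparison in (1) on their own
mailboxes, here is a minimal sketch of one way to measure how much
two spam samples overlap. This is not the method A&C used (the
details of that analysis aren't given here); the normalization
rules, the function names fingerprint() and overlap(), and the
choice of Jaccard similarity are all assumptions made purely for
illustration.

```python
import hashlib
import re


def fingerprint(body: str) -> str:
    """Hash a crudely normalized message body.

    The normalization is an assumption for illustration: lowercase,
    drop digits (prices, tracking IDs), and collapse whitespace.
    """
    text = body.lower()
    text = re.sub(r"\d+", "", text)
    text = re.sub(r"\s+", " ", text).strip()
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def overlap(sample_a, sample_b) -> float:
    """Jaccard similarity of the two samples' fingerprint sets."""
    a = {fingerprint(m) for m in sample_a}
    b = {fingerprint(m) for m in sample_b}
    if not a and not b:
        return 0.0  # two empty samples: report no overlap
    return len(a & b) / len(a | b)


# Hypothetical toy samples, not real data.
alice = ["Buy CHEAP pills now 12345", "Refinance   your mortgage"]
bob = ["buy cheap pills now 99", "refinance your mortgage",
       "You won the lottery"]
print(overlap(alice, bob))
```

A score near 1.0 would mean the two recipients are seeing largely
the same messages; a score near 0.0 would mean they aren't. Exact
hashing like this misses spam that has been randomized per
recipient, so treat a low score as inconclusive rather than as
evidence of "targeted" spam.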
(We now return to regularly scheduled ASRG content.)
_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg