Re: [Asrg] Opt-out lists and legislation
2003-03-11 10:04:15
At 2:49 AM -0500 3/11/03, Valdis(_dot_)Kletnieks(_at_)vt(_dot_)edu wrote:
> On Mon, 10 Mar 2003 18:39:57 EST, Kee Hinckley said:
>> I currently have a sample database of 22,000 confirmed spam messages
>> sent to roughly 200 real email accounts.
>> 40% blocked by the country restriction.
>> 4% blocked due to obvious viruses.
>> 14% blocked due to system blacklist.
>> <1% blocked by user blacklists.
>> There's less than three percent overlap between those factors. The
> Actually, there's a hidden assumption here that means that there's
> a lot MORE than 3% overlap. Your 14% system blacklist refers to a
> blacklist that was tailored thinking "and this list doesn't include
> anything from .XY because we country-restrict them already".
Yes and no. Country restrictions are per-user, so the blacklist
needs to be multi-national. On the other hand, it's generated
primarily based on spotting spam that is getting through from fixed
sources, and that comes from user feedback. So there is a bias
towards countries that people usually receive email from.
> What percent of mail was tagged with the country restriction but *NOT*
> tagged as spam by users? (For instance, it would be quite easy to flag
Initially the false positive rate starts a little high until the user
tunes their filters by specifying which countries they regularly get
email from. They can also approve a sender even if we are blocking
the sender system wide--user rules win.
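In rough pseudo-Python, the precedence I'm describing looks something like this (all names here are invented for illustration; this is not our actual code):

```python
SYSTEM_BLACKLIST = {"spammer@163.net"}  # hypothetical system-wide entry

def classify(sender, country, user):
    """Return 'deliver' or 'hold' for one incoming message.
    Per-user rules win over system-wide ones."""
    if sender in user["whitelist"]:
        return "deliver"            # user approval beats even a system blacklist
    if sender in user["blacklist"]:
        return "hold"
    if country not in user["allowed_countries"]:
        return "hold"               # per-user country restriction
    if sender in SYSTEM_BLACKLIST:
        return "hold"
    return "deliver"

user = {
    "whitelist": {"friend@163.net"},
    "blacklist": set(),
    "allowed_countries": {"US", "CA"},  # tuned by the user over time
}
```

So a user who has approved a sender at a blacklisted Chinese ISP still gets that mail, while untuned country restrictions account for the early false positives.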
> all mail from .CN as spam - and although my users would probably tag back
> 100% of the spam from .CN, they'd not tag 100% of the mail from .CN, as
They'd okay mail from China, and then possibly have to okay senders
from some Chinese ISPs on a per-user basis. (E.g. I'm not sure, but
I suspect that 163.net is on our blacklist, even though they are a
legit ISP.)
> Is the "user blacklist" number the percentage caught by pre-established
> user filters, or is that saying that your other checks were 99% effective
> in identifying spam and only 1% got through to users for them to report?
Although users can pre-establish a blacklist, they tend not to.
Instead we let them blacklist a sender at the time they report a
false negative (spam got through). 1% is the number of subsequent
messages blocked by those blacklists.
I've just spent a week or so changing the interface to that part of
the system. The previous interface pretty much set up blocking the
sender as the default when you reported spam. Now it's a secondary
choice, because the fact of the matter is that blacklisting the
sending address almost never works. The next email from the spammer
uses a different address. So there's no point in filling your
blacklist with fake addresses. (Or is it "might have existed but
don't exist now" addresses :-). Instead we want to reserve
blacklists for blocking email from real people (or domains) that you
*really* don't want. So the primary action now is to report the
problem so that we can look at it and figure out why we didn't block
it. We've also added "unsubscribe" detection. If the message looks
clean, but you think it's spam, we'll try unsubscribing you from it.
A lot of the non-technical folks can't tell a commercial list that
they accidentally subscribed to, from a spam message pretending to be
a list. If we can get them off the list, we've done everyone a
favor. We'll track responses to those of course--if they keep
getting mail from the mailing list then we can put them on the
blacklist.
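A sketch of that reporting flow, again with invented names (the real interface is just a URL click in the message header):

```python
def handle_spam_report(msg, user, looks_clean):
    """A false-negative report. Filing for review is now the primary
    action; blacklisting the sender is only a secondary choice, since
    spammers rotate addresses."""
    actions = ["file_for_review"]       # so we can see why we didn't block it
    if looks_clean:
        # Probably a commercial list the user joined by accident:
        # try unsubscribing them rather than blacklisting.
        actions.append("attempt_unsubscribe")
        user["pending_unsubscribes"].add(msg["list_id"])
    return actions

def on_later_message(msg, user):
    """If a list keeps mailing after an unsubscribe attempt,
    escalate it to the user's blacklist."""
    if msg["list_id"] in user["pending_unsubscribes"]:
        user["blacklist"].add(msg["list_id"])
```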
> Do you have any guesstimates of how much *unreported* spam got through
> to the 200 accounts?
This gets back to the problem I mentioned earlier on the list. You
can't trust users to check the email sitting in their blocked queue.
Because we are in (public) beta, our customers have been pretty good
about trying to report everything, and we try and make reporting real
easy (click on a URL in the message header). My (non-scientific)
observation is that if they let all mail go through to their MUA, and
filter there, they are better at reporting false negatives (spam in
their inbox) than false positives (good stuff in their spam box). If
they leave the spam on the server then we get the false positive
report automatically because they can either "send" or "approve" the
message. "Sending" it doesn't notify us, and it doesn't change their
rules. "Approving" whitelists the sender and sends all messages
currently held that are from that sender. *That* we get notification
of.
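The send/approve distinction, sketched in the same invented pseudo-Python: only "approve" changes the user's rules, and only "approve" generates the false-positive signal back to us.

```python
def send(msg, held):
    """Release one held message; no rule change, no notification."""
    held.remove(msg)
    return [msg]

def approve(sender, user, held):
    """Whitelist the sender, then release every held message from
    that sender. This is the action we get notified about."""
    user["whitelist"].add(sender)
    released = [m for m in held if m["from"] == sender]
    for m in released:
        held.remove(m)
    return released
```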
Basically, I feel pretty good about our current stats because we know
most of the beta testers and they know that they are providing us
with useful information in exchange for free spam blocking. As we
get into real users I expect that we'll see the accuracy drop off,
especially after they've used the system for a while.
--
Kee Hinckley
http://www.puremessaging.com/ Junk-Free Email Filtering
http://commons.somewhere.com/buzz/ Writings on Technology and Society
I'm not sure which upsets me more: that people are so unwilling to accept
responsibility for their own actions, or that they are so eager to regulate
everyone else's.
_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg