
Re: [Asrg] Opt-out lists and legislation

2003-03-11 10:04:15
At 2:49 AM -0500 3/11/03, Valdis.Kletnieks@vt.edu wrote:
On Mon, 10 Mar 2003 18:39:57 EST, Kee Hinckley said:

 I currently have a sample database of 22,000 confirmed spam messages
 sent to roughly 200 real email accounts.

 40% blocked by the country restriction.
   4% blocked due to obvious viruses.
 14% blocked due to system blacklist.
 <1% blocked by user blacklists.

 There's less than three percent overlap between those factors.  The

Actually, there's a hidden assumption here that means that there's
a lot MORE than 3% overlap. Your 14% system blacklist refers to a
blacklist that was tailored thinking "and this list doesn't include
anything from .XY because we country-restrict them already".

Yes and no. Country restrictions are per-user, so the blacklist needs to be multi-national. On the other hand, it's generated primarily by spotting spam that gets through from fixed sources, and that comes from user feedback. So there is a bias toward countries that people usually receive email from.
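The overlap figure from the stats above can be computed directly from the per-message tags. Here's a minimal sketch with hypothetical message IDs standing in for the real 22,000-message sample (the category names and counts are made up for illustration):

```python
from collections import Counter

# Hypothetical message IDs tagged by each blocking category.
country   = {1, 2, 3, 4}   # country restriction
viruses   = {5}            # obvious viruses
system_bl = {4, 6, 7}      # system blacklist
user_bl   = {7}            # user blacklists

blocked = country | viruses | system_bl | user_bl
hits = Counter()
for category in (country, viruses, system_bl, user_bl):
    hits.update(category)

# A message "overlaps" when more than one check would have caught it.
overlap = {msg for msg, n in hits.items() if n > 1}
overlap_pct = 100.0 * len(overlap) / len(blocked)
```

With real data, a low `overlap_pct` supports the claim that the categories are catching largely disjoint sets of spam.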

What percent of mail was tagged with the country restriction but *NOT*
tagged as spam by users? (For instance, it would be quite easy to flag

Initially the false-positive rate is a little high, until the user tunes their filters by specifying which countries they regularly get email from. They can also approve a sender even if we are blocking that sender system-wide--user rules win.
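That precedence ("user rules win") can be sketched as a simple ordered check. This is an illustration of the ordering described above, not the real API; the function name and fields are assumptions:

```python
from types import SimpleNamespace

def check_message(sender, sender_country, user, system_blacklist):
    # Hypothetical check ordering: per-user rules beat system-wide rules.
    if sender in user.approved:              # approving a sender overrides everything
        return "deliver"
    if sender in user.blacklist:
        return "hold"
    if sender in system_blacklist:           # shared, multi-national blacklist
        return "hold"
    if sender_country not in user.countries: # per-user country restriction
        return "hold"
    return "deliver"

# A user who approved one 163.net correspondent and only accepts US mail.
user = SimpleNamespace(approved={"friend@163.net"}, blacklist=set(),
                       countries={"US"})
```

Note that an approved sender is delivered even if their ISP is on the system blacklist, which is the 163.net case mentioned below.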

all mail from .CN as spam - and although my users would probably tag back
100% of the spam from .CN, they'd not tag 100% of the mail from .CN, as

They'd okay mail from China, and then possibly have to okay senders from some Chinese ISPs on a per-user basis. (E.g., I'm not sure, but I suspect that 163.net is on our blacklist, even though it is a legitimate ISP.)

Is the "user blacklist" number the percentage caught by pre-established
user filters, or is that saying that your other checks were 99% effective
in identifying spam and only 1% got through to users for them to report?

Although users can pre-establish a blacklist, they tend not to. Instead we let them blacklist a sender at the time they report a false negative (spam that got through). The 1% is the number of subsequent messages blocked by those blacklists.

I've just spent a week or so changing the interface to that part of the system. The previous interface pretty much made blocking the sender the default when you reported spam. Now it's a secondary choice, because the fact of the matter is that blacklisting the sending address almost never works. The next email from the spammer uses a different address, so there's no point in filling your blacklist with fake addresses. (Or is it "might have existed but don't exist now" addresses? :-) Instead we want to reserve blacklists for blocking email from real people (or domains) that you *really* don't want. So the primary action now is to report the problem, so that we can look at it and figure out why we didn't block it.

We've also added "unsubscribe" detection. If the message looks clean, but you think it's spam, we'll try unsubscribing you from it. A lot of non-technical folks can't tell a commercial list they accidentally subscribed to from a spam message pretending to be a list. If we can get them off the list, we've done everyone a favor. We'll track responses to those, of course--if they keep getting mail from the mailing list, then we can put it on the blacklist.
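The revised report flow described above can be sketched as follows. All names here are hypothetical; this just captures the logic: reporting is the primary action, blacklisting is opt-in, clean-looking messages trigger an unsubscribe attempt, and lists that keep sending after that get blacklisted:

```python
from types import SimpleNamespace

def handle_report(msg, user, blacklist_sender=False):
    actions = ["forward_to_staff"]            # primary: figure out why it slipped through
    if blacklist_sender:                      # secondary: rarely useful, addresses rotate
        user.blacklist.add(msg["sender"])
        actions.append("blacklist_sender")
    if msg.get("looks_clean"):                # probably a real list, not spam
        user.pending_unsubs.add(msg["list_id"])
        actions.append("attempt_unsubscribe")
    return actions

def on_later_message(msg, user, system_blacklist):
    # Track unsubscribe attempts: still sending means onto the blacklist.
    if msg.get("list_id") in user.pending_unsubs:
        system_blacklist.add(msg["list_id"])

user = SimpleNamespace(blacklist=set(), pending_unsubs=set())
system_bl = set()
```

The key design point is that the default action generates information for the operators rather than a throwaway blacklist entry.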

Do you have any guesstimates of how much *unreported* spam got through
to the 200 accounts?

This gets back to the problem I mentioned earlier on the list. You can't trust users to check the email sitting in their blocked queue.

Because we are in (public) beta, our customers have been pretty good about trying to report everything, and we try to make reporting really easy (click on a URL in the message header). My (non-scientific) observation is that if they let all mail go through to their MUA and filter there, they are better at reporting false negatives (spam in their inbox) than false positives (good mail in their spam box). If they leave the spam on the server, then we get the false-positive report automatically, because they can either "send" or "approve" a held message. "Sending" it doesn't notify us, and it doesn't change their rules. "Approving" whitelists the sender and sends all currently held messages from that sender. *That* we get notification of.
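The "send"/"approve" distinction can be sketched like this (function and field names are illustrative, not the real interface):

```python
from types import SimpleNamespace

def release(action, msg, user):
    """Two release paths for a held message:
    'send' releases one message quietly; 'approve' whitelists the
    sender, releases everything held from them, and notifies us."""
    if action == "send":
        user.held.remove(msg)
        return [msg], False                   # no rule change, no notification
    if action == "approve":
        user.approved.add(msg["sender"])
        released = [m for m in user.held if m["sender"] == msg["sender"]]
        user.held = [m for m in user.held if m["sender"] != msg["sender"]]
        return released, True                 # counts as a false-positive report

m1 = {"sender": "list@shop.example", "id": 1}
m2 = {"sender": "list@shop.example", "id": 2}
user = SimpleNamespace(held=[m1, m2], approved=set())
```

The notification flag on the "approve" path is what makes false-positive reporting automatic for users who leave spam on the server.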

Basically, I feel pretty good about our current stats because we know most of the beta testers and they know that they are providing us with useful information in exchange for free spam blocking. As we get into real users I expect that we'll see the accuracy drop off, especially after they've used the system for a while.
--
Kee Hinckley
http://www.puremessaging.com/        Junk-Free Email Filtering
http://commons.somewhere.com/buzz/   Writings on Technology and Society

I'm not sure which upsets me more: that people are so unwilling to accept
responsibility for their own actions, or that they are so eager to regulate
everyone else's.
_______________________________________________
Asrg mailing list
Asrg@ietf.org
https://www1.ietf.org/mailman/listinfo/asrg