Re: [Asrg] Comments on draft-church-dnsbl-harmful-01.txt

2006-04-03 10:04:22

John Levine wrote:
>> To stay on topic, do you accept that with your definition, the only
>> authority which can reliably decide consent (and therefore spamminess)
>> is the receiver?
>
> Not really.  I get spam complaints all the time for mail from lists
> that I know perfectly well that they signed up for and confirmed.
> "Oh, I don't want that any more."  For the ones that aren't totally
> redacted, my setup turns them into unsubs so they don't get any more
> mail for that particular list, but I don't think it's fair to count
> mail as spam if it depends on reading the recipient's mind in
> real-time.

Similarly, we get complaints from receivers that we blocked certain
emails that we _know_ are spam.  Like the individual who insisted that
it was necessary for him to receive the "nortel.com abuse team" messages
(aka Bagle).  Or the people who complain about us blocking important
email from their bank... that are in reality phishes.  Or the people
who've fallen for Nigerian money laundering, "change for a fraudulent
money order" scams.

[As far as we know, we managed to catch the only one of the latter here
just before the victim wrote the cheque.]

In fact, receiver adjudication would likely _lower_ the measured FP rate.
On balance, I believe it's more likely that a user doesn't recognize
something he really did ask for (through some twisty chain of branding,
lowering the FP rate) than "recognize" something he didn't ask for
(raising the FP rate).

In other words, receiver adjudication would bias the evaluation
_towards_ DNSBLs (lower FPs), not away (higher FPs).  Which, if you must
accept bias (and are evaluating in a conservative approach), is the
_wrong_ direction.

Thus, receiver adjudication has an error rate of its own and, as I'll
mention below, one that is likely _higher_ than that of the measures
we're trying to test.

I'll point out that in most environments running multiple techniques,
the FN rates of individual techniques are not relevant, except in
evaluating the incremental implementation cost of a technique (cost vs.
benefits).  So what if SBL-XBL only catches, say, 80% of our spam flow?
Our other techniques fill in that gap, and those other techniques put
together can't reach even 80% on their own.
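
To make the "incremental" point concrete, here is a minimal sketch.
The technique names and message IDs are made up for illustration and
are not data from our stream.  It shows how one might compute the
combined catch rate of a filter stack and the spam that only one
technique catches.

```python
def combined_catch_rate(caught_by, total_spam):
    """Fraction of all spam caught by at least one technique."""
    caught = set().union(*caught_by.values())
    return len(caught) / total_spam


def incremental_catch(caught_by, name):
    """Spam IDs caught by `name` that no other technique catches."""
    others = set().union(
        *(ids for other, ids in caught_by.items() if other != name)
    )
    return caught_by[name] - others


# Hypothetical example: 10 spam messages, numbered 0..9.
caught_by = {
    "dnsbl": {0, 1, 2, 3, 4, 5, 6, 7},   # 80% on its own
    "content": {5, 6, 7, 8},
    "heuristics": {8, 9},
}

print(combined_catch_rate(caught_by, 10))
print(sorted(incremental_catch(caught_by, "dnsbl")))
```

In this toy data the DNSBL misses 20% on its own, the stack as a whole
misses nothing, and the other two techniques together cover only half
the spam.  The per-technique FN rate says little; the incremental set
is what you weigh against the cost of running the technique.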

> Testing spam filters is a really, really hard problem, for all the
> reasons that people have mentioned, basically that you can't test
> without perturbing the system you're testing, and you can't capture
> enough state to rerun the same test more than once.  So you have to
> estimate based on complaint rates from bounces and the like.

Exactly.

Except in samples that are too small to be statistically significant,
it's not possible to use "receiver adjudication" in an effective
fashion.  Not because we can't do that, but because the error rate in
"receiver adjudication" is going to be at least an order of magnitude
higher than some of the methods we're trying to test, and in fact,
likely to be strongly biased in _favour_ of the filtering technique
(for reasons outlined above).

Most of the time the receiver is correct, no question.  Which is why we
don't deny more than an infinitesimal fraction of FP/FN complaints.

But human nature being what it is, when faced with hundreds of emails in
a folder to assess, a minuscule error rate of 1% isn't unreasonable to
expect.

Yet, we already know that the CBL FP rate on our stream is approximately
2 orders of magnitude _smaller_ (and we whitelisted them, so there's no
ongoing issues), and the SBL FP rate is usually not measurable, because
we don't see any for days at a time (out of 300,000-600,000 filtered per
day).
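
One rough way to put a number on "not measurable" is sketched below.
It assumes, hypothetically, that each filtered message is an
independent trial and that any FP would actually be noticed and
reported; neither is guaranteed in practice.  Under those assumptions,
seeing zero FPs in n messages bounds the FP rate from above at a chosen
confidence level (the familiar "rule of three" approximates the 95%
bound as 3/n).  The function name and the 95% default are illustrative
choices.

```python
import math


def zero_event_upper_bound(n, confidence=0.95):
    """Upper bound on an event rate after 0 events in n independent trials.

    Solves (1 - p)**n = 1 - confidence for p.
    """
    # expm1 keeps precision when the result is very close to zero.
    return -math.expm1(math.log(1.0 - confidence) / n)


# One day at the low end of the volume quoted above.
bound = zero_event_upper_bound(300_000)
print(f"{bound:.2e}")
```

A single FP-free day at 300,000 messages already puts the bound near
one in 100,000, and each further FP-free day pushes it lower.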

You can't measure the accuracy of something to .01% if the instrument
you're measuring it with is only 1% accurate.  Further, we'd find a 1%
FP rate on our filtering _totally_ unacceptable (that's 3000+/day). I'd
be looking for another job in another field if it was remotely that bad.
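
As a back-of-the-envelope check on the arithmetic above, here is a
small sketch.  It assumes, purely for illustration, that the
adjudicator mislabels any message (wanted or spam) with the same
probability; the real errors are unlikely to be that symmetric.  The
function name and the sample rates are hypothetical.

```python
from fractions import Fraction


def observed_fp_rate(true_fp_rate, adjudicator_error):
    """FP rate an imperfect adjudicator would report for filtered mail.

    Assumption: every message is mislabeled with the same probability.
    """
    real_fps_recognized = true_fp_rate * (1 - adjudicator_error)
    spam_called_wanted = (1 - true_fp_rate) * adjudicator_error
    return real_fps_recognized + spam_called_wanted


true_rate = Fraction(1, 10_000)   # 0.01% true FP rate
error = Fraction(1, 100)          # 1% adjudication error

observed = observed_fp_rate(true_rate, error)
print(float(observed))

# FPs per day at a 1% FP rate on 300,000 filtered messages.
daily_fps_at_one_percent = 300_000 * Fraction(1, 100)
print(daily_fps_at_one_percent)
```

With these numbers the adjudicator would report an FP rate of about
1.01%, almost all of it adjudication error rather than filter error,
so the 0.01% being measured is lost in the noise.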
