
RE: [Asrg] Re: Spam send/receive ratio

2004-05-22 19:03:20
No what I was saying is that using the statistics of mail volume to try and
pinpoint a spammer is a bogus test.

  When used alone.  I never said it should be used alone.  In fact, I
said the exact opposite.

I never said you did. Alone it's useless; combined with other "triggers" it's
still useless.
All I said is that no matter how it is used, "Mail Volume" is unreliable to
the point that it should not be considered.

Perhaps you could quieten this critic by explaining how Mail Volume can be
used to *reliably* detect spam.
I have tried my best to explain why it would fail. Surely a counterpoint is
required at this stage.

Anti-spam systems that chuck out the baby with the bathwater should not be
considered acceptable (IMHO).

As a mail user I would prefer to skim through a few spam mails KNOWING I
haven't lost anything important than get no spam and miss out on a
vital business/personal message.

You include another example: V*gra.

If you were a medical student discussing some of the finer points of V*gra
with a colleague, would you not wish for your mail to go through? Or should
the word be censored from this moment forward?


Chris



-----Original Message-----
From: asrg-admin(_at_)ietf(_dot_)org
[mailto:asrg-admin(_at_)ietf(_dot_)org] On Behalf Of Alan DeKok
Sent: Saturday, 22 May 2004 10:09 PM
To: asrg(_at_)ietf(_dot_)org
Subject: Re: [Asrg] Re: Spam send/receive ratio


"Chris" <asrg(_at_)rebel(_dot_)com(_dot_)au> wrote:
No what I was saying is that using the statistics of mail volume to try and
pinpoint a spammer is a bogus test.

  When used alone.  I never said it should be used alone.  In fact, I
said the exact opposite.

Now your good intentioned MTA says "hey I got mail from xyz, I'll check his
volumes", and it comes up with "normal volume = 20, last hour's volume =
5000". "Ooops" the MTA says "Spammer alert" and tosses my email.

  Then it's not "my good intentioned MTA".  It's an MTA you invented
by explicitly ignoring my statements about how MTAs should deal with
high-volume spam.  I hate it when people read my messages and
conclude that I believe the exact opposite of what I've said.

  Can you please explain why you're arguing that I believe high-volume
to be a near-perfect indicator of spam?  I just can't understand how
you come to that decision.

Perhaps it also looked for other statistically telltale clues such
as an embedded URL and/or perhaps the words '/special.*price/' and
combined with the sudden high volume it decides 'spam!'.

  If an MTA decides that a message is spam, that's its prerogative.

  In this (badly defined, hypothetical) case, I could agree that the
"spam" determination is probably not the best thing to do.

My argument is simple. Mail Volumes can *in no way* be used to indicate
spam. It is simply flawed logic.

  By the same argument, the word "v**gr*" cannot be used to indicate
spam, because normal people use it in normal messages.

  The reality is different.  Keywords are strongly correlated with
"spamminess" of a message.  But they're not perfectly correlated.
Similarly, suddenly receiving a high volume of mail from a host which
usually sends low-volume traffic is strongly correlated with spam from
zombied machines, or a new account by a spammer, etc.  See the ASRG
archives for ISP admins' discussion of this exact scenario.
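
  As a rough illustration of the point above (a sketch only, not
anything proposed in this thread), several imperfect signals can be
combined into one score instead of acting on any single one.  The
likelihood ratios, the 10x spike factor, and the function names below
are all hypothetical assumptions chosen for illustration; the sketch
also assumes the signals are independent, which real mail traffic will
not fully satisfy.

```python
import math
import re

# Hypothetical likelihood ratios, P(signal | spam) / P(signal | not spam).
# These values are illustrative assumptions, not measurements.
LIKELIHOOD_RATIOS = {
    "volume_spike": 8.0,
    "spam_keyword": 5.0,
    "embedded_url": 2.0,
}


def extract_signals(body, normal_per_hour, last_hour, spike_factor=10.0):
    """Return the names of the signals that fire for one message."""
    signals = []
    # Sudden high volume from a host that usually sends little mail.
    if last_hour > spike_factor * max(normal_per_hour, 1):
        signals.append("volume_spike")
    # Keyword pattern taken from the example quoted earlier in the thread.
    if re.search(r"special.*price", body, re.IGNORECASE):
        signals.append("spam_keyword")
    if re.search(r"https?://", body):
        signals.append("embedded_url")
    return signals


def spam_probability(signals, prior=0.5):
    """Combine signals naive-Bayes style (assumes independence)."""
    log_odds = math.log(prior / (1.0 - prior))
    for name in signals:
        log_odds += math.log(LIKELIHOOD_RATIOS[name])
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    plain = extract_signals("Lunch tomorrow?", 20, 5000)
    promo = extract_signals(
        "Special offer: best price at http://example.com", 20, 5000
    )
    print(plain, spam_probability(plain))
    print(promo, spam_probability(promo))
```

  Under these assumed numbers, a volume spike on its own raises the
score but stays below a cautious cutoff such as 0.95, while the same
spike together with a keyword match and a URL exceeds it.  Where the
cutoff sits is a policy choice about how many false positives are
tolerable.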

Again to return to the Tattoo 'statistic'. Imagine a Judge in a murder trial
telling the jury that statistically more murderers have tattoos, and as the
defendant has a tattoo he is more likely to be the murderer.

  The first statement may be correct.  The second does not follow from
the first.  You are arguing from flawed logic.  I tried to explain why
in my previous message, and it looks like I failed to communicate.

  Please read books by Marilyn vos Savant on logic, or John Allen
Paulos.

  I suggest:

http://www.amazon.com/exec/obidos/tg/detail/-/0679726012/102-9321111-1748922?v=glance
http://www.alibris.com/search/search.cfm?qwork=5268032&matches=21&qsort=r

The statistic may be absolutely correct,

  There is no statistic in your argument.  There is simply flawed
logic.  Correlation does not imply causation.

  Alan DeKok.

_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg

