RE: [Asrg] Re: Spam send/receive ratio


HOW is a receiving MTA going to know this?


  It sees the mail coming from the list, so any concerns about volume
are associated with the list, and not with the original submitter.


Perhaps there is confusion here over the meaning of "list". I am talking
about a mailing list of my customers. I am not talking about a list such as
this ASRG list. On my mailing list I am the submitter. I sent the mail.
But it makes no difference either way.


  You're misunderstanding statistics, correlation, causation,
overlapping sets, and inverse problems.  You're not alone.  These are
difficult problems for anyone.


No, what I was saying is that using mail-volume statistics to try to
pinpoint a spammer is a bogus test.
Statistics can be, and are, misused all the time. Mail volume is just such a
statistic.

My example was of my mailing list, so I will continue...

Say I send out twenty emails a day, the usual number of service emails to
my clients.

I develop a new piece of software, so I decide to pen a special invitation to
my clients and tell them they can get the software at a special
invitation-only price of $x. I include the URL of the order form for
convenience.

Now your well-intentioned MTA says, "Hey, I got mail from xyz, I'll check his
volumes," and it comes up with "normal volume = 20, last hour's volume =
5000". "Oops," the MTA says, "spammer alert," and tosses my email. Perhaps it
also looked for other statistically telltale clues, such as an embedded URL
and/or perhaps the words '/special.*price/', and combined with the sudden
high volume it decides 'spam!'.
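
The rule described above can be sketched in a few lines of Python. This is
only an illustration of the false positive, not any real MTA's logic: the
function name, the spike_factor of 10, and the example.com URL are all
assumptions made up for the sketch.

```python
import re

# Hypothetical content clues; the second pattern is the one from the text.
URL = re.compile(r"https?://\S+")
SPECIAL_PRICE = re.compile(r"special.*price", re.IGNORECASE)


def naive_spam_verdict(normal_volume, recent_volume, body, spike_factor=10):
    """Return True when the naive volume-plus-clues rule would say 'spam'.

    spike_factor is an assumed threshold: how many times the normal
    volume counts as a "sudden high volume".
    """
    volume_spike = recent_volume > spike_factor * normal_volume
    has_url = URL.search(body) is not None
    has_pitch = SPECIAL_PRICE.search(body) is not None
    return volume_spike and (has_url or has_pitch)


invitation = (
    "As a valued client you can get the new software at a special "
    "invitation-only price. Order here: http://example.com/order"
)
routine = "Your service ticket has been closed."

# The legitimate invitation is flagged; the routine mail is not.
invitation_flagged = naive_spam_verdict(20, 5000, invitation)
routine_flagged = naive_spam_verdict(20, 20, routine)
print(invitation_flagged, routine_flagged)
```

Nothing in the inputs tells the rule that every recipient is an existing
client, which is the point of the example.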

The MTA has made a bad decision, one the recipient may not appreciate,
especially when he eventually gets to my site and is forced to pay full
price because the mail didn't get through.

My argument is simple. Mail volumes can *in no way* be used to indicate
spam. It is simply flawed logic.

Again, to return to the tattoo 'statistic': imagine a judge in a murder trial
telling the jury that statistically more murderers have tattoos, and that as
the defendant has a tattoo he is more likely to be the murderer.

The statistic may be absolutely correct, BUT the inference cannot be used in
the decision-making process. To do so is seriously flawed.

A judge who said that could and should be disbarred for incompetence.

A simple Turing test would stop the zombified machine from automatically
responding "yes" to the validation email.


  Please describe such a test that would work in practice.

  Here, "work" means not only be practical as a turing test, but also
be something that the end user would respond to.


The end user is not relevant to this discussion. The only purpose of this
test is for the sender's ISP to confirm that *HIS* client in fact intended
to send out such volumes of mail, remembering that we are talking about an
unusual volume.

If I were to write such a volume check it would go like this...

Stats would be kept of the average volume of mail sent by each client.

As a client starts sending mail, a counter would be incremented. After a
period of time this count would be tested against the average for that
client for that time period. Should the count exceed the previous averages,
the mail would start to spool.

After another period of time the counts would again be tested. IF the number
of sends has dropped, the spooled mail would be sent and it would be assumed
it was a simple spike, perhaps birthday invitations.

IF the count continued to exceed expectations, spooling would continue and
an e-mail or other form of contact would be dispatched to the client.

This email would contain a simple Turing test, something like numbers
embedded in an image, or perhaps a stronger test. The recipient would have
to respond with the correct answer. This is to prevent a zombified machine
from intercepting the validation email and responding to it.

If the response was correct the spooled mail would be sent.
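
The steps above could be sketched roughly as follows, in Python. This is a
minimal sketch under stated assumptions, not a finished design: the
ClientMonitor class, its field names, and the threshold_factor margin are
all hypothetical, and the challenge itself (the image with embedded numbers)
is reduced to a boolean "answered correctly" flag. Updating the stored
average over time is left out.

```python
from dataclasses import dataclass, field


@dataclass
class ClientMonitor:
    """Per-client volume check at the sending ISP (illustrative only)."""
    average_per_period: float        # historical sends per period
    threshold_factor: float = 2.0    # assumed margin above the average
    count: int = 0                   # sends seen in the current period
    spooling: bool = False
    challenged: bool = False
    spool: list = field(default_factory=list)
    delivered: list = field(default_factory=list)

    def submit(self, message):
        """Count the send, then either spool it or pass it on."""
        self.count += 1
        (self.spool if self.spooling else self.delivered).append(message)

    def end_of_period(self):
        """Test this period's count against the average, then reset it."""
        over = self.count > self.threshold_factor * self.average_per_period
        self.count = 0
        if not self.spooling:
            self.spooling = over      # first excess: start spooling
        elif over:
            self.challenged = True    # still high: contact the client
        else:
            self.release()            # volume dropped: a simple spike

    def answer_challenge(self, correct):
        """Release the spool only on a correct answer to the test."""
        if self.challenged and correct:
            self.release()

    def release(self):
        self.delivered.extend(self.spool)
        self.spool.clear()
        self.spooling = False
        self.challenged = False


# A sustained burst: spooled, challenged, then released on a correct answer.
burst = ClientMonitor(average_per_period=20)
for i in range(5000):
    burst.submit(f"invitation {i}")
burst.end_of_period()
for i in range(5000):
    burst.submit(f"invitation {5000 + i}")
burst.end_of_period()
held = len(burst.spool)
burst.answer_challenge(correct=True)

# A short spike: spooled for one period, then released automatically.
spike = ClientMonitor(average_per_period=20)
for i in range(100):
    spike.submit(f"birthday {i}")
spike.end_of_period()
for i in range(5):
    spike.submit(f"service {i}")
spike.end_of_period()
```

Note that in this sketch the mail sent during the first period has already
gone out before spooling begins, which follows the description above.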

This is not designed to prevent a spammer. It is designed to prevent
someone's account being misused by a third party.

Chris

-----Original Message-----
From: asrg-admin(_at_)ietf(_dot_)org
[mailto:asrg-admin(_at_)ietf(_dot_)org] On Behalf Of Alan DeKok
Sent: Saturday, 22 May 2004 12:45 AM
To: asrg(_at_)ietf(_dot_)org
Subject: Re: [Asrg] Re: Spam send/receive ratio


"Chris" <asrg(_at_)rebel(_dot_)com(_dot_)au> wrote:

  The *list* re-sends the message to tens of thousands of recipients.
You don't.  Since each recipient has opted-in to the list, this isn't
a problem.


HOW is a receiving MTA going to know this?


  It sees the mail coming from the list, so any concerns about volume
are associated with the list, and not with the original submitter.

  That was your original point, I believe.  That you would be marked a
spammer because of the list exploding your traffic to tens of
thousands of people.  But it's not you, it's the list.

It just sees the volume discrepancy and says 'Spammer'.


  Dumb MTA's may closely monitor the list, due to the volume of
traffic coming from it.  However, there are ways for the MTA to tell
that the traffic is NOT spam:

  - users at the MTA's domain send messages to the list
  - so the MTA communicates with the list's MX's
  - lists are long-lived, as opposed to spamvertized addresses
  - little spam comes from lists

Huh. So if I have a tattoo I am *more likely* to be a criminal then.


  No.  The relationship is the inverse of what you said, which is an
important distinction: Criminals are more likely to have tattoos than
the average person.

  The distinction is critical to a proper understanding of the problem.

Most criminals have hair. Should that be listed as a distinguishing feature?

  Most non-criminals have hair, too.  Therefore it's not a
distinguishing feature of criminals.

  This shouldn't be rocket science.

Ever heard of white collar crime... that's the stuff that costs billions of
dollars and ruins country economies. Nary a tattoo in sight.


  You're misunderstanding statistics, correlation, causation,
overlapping sets, and inverse problems.  You're not alone.  These are
difficult problems for anyone.

  Criminals are more likely to have tattoos than the average person,
but not all criminals have tattoos, and not all tattooed people are
criminals.
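
  The difference between the two directions of the relationship can be
checked with a little arithmetic. The figures below are entirely
hypothetical, chosen only to make the calculation concrete; they are not
real crime or tattoo statistics.

```python
# Hypothetical figures, for illustration only.
p_criminal = 0.01                 # assumed share of criminals in the population
p_tattoo_given_criminal = 0.50    # assumed: criminals are often tattooed
p_tattoo_given_other = 0.20       # assumed: many non-criminals are too

# Total probability of having a tattoo.
p_tattoo = (p_tattoo_given_criminal * p_criminal
            + p_tattoo_given_other * (1 - p_criminal))

# Bayes' rule: probability of being a criminal, given a tattoo.
p_criminal_given_tattoo = p_tattoo_given_criminal * p_criminal / p_tattoo

print(f"P(tattoo | criminal) = {p_tattoo_given_criminal:.3f}")
print(f"P(criminal | tattoo) = {p_criminal_given_tattoo:.3f}")
```

  With these assumed numbers, a tattoo raises the probability above the 1%
base rate, yet the overwhelming majority of tattooed people are still not
criminals. The evidence shifts the odds; it does not settle the question.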

I stick to very blinkered. with no apology.


  Then I'll apologize for pointing out the flaws in your reasoning.

No doubt volume is an indicator that SHOULD be used, but only at the
injection point, not at the receiving point. The receiver has no way of
knowing that volume may mean spam.


  The receiver doesn't have visibility into the injection point.  The
receiver sees only what it receives.

  And large volumes of traffic from someone the receiver has never
communicated with before are a strong indicator that the traffic may be
spam.  At the minimum, the receiver should treat the traffic with more
suspicion than low-volume traffic, or traffic from well-known sources.

But the ISP who suddenly sees an upsurge in mail originating from a client
could spool that mail and confirm with the client that such volume is in
fact intended.


  And the recipient can do the same thing.

A simple Turing test would stop the zombified machine from automatically
responding "yes" to the validation email.


  Please describe such a test that would work in practice.

  Here, "work" means not only be practical as a turing test, but also
be something that the end user would respond to.

  Alan DeKok.

_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg


