Re: [Asrg] Unique innovations made to anti-spam system

2006-01-24 03:19:58
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

In article
<cb84d2fe0601231942y293d98bejd7077b6c9c04df98(_at_)mail(_dot_)gmail(_dot_)com>,
Michael Kaplan <michaelkaplanasrg(_at_)gmail(_dot_)com> writes

   On 1/23/06, Richard Clayton <richard(_at_)highwayman(_dot_)com> wrote:
    
      I've never tried solving CAPTCHAs at speed, so I couldn't predict
      how fast I could do them for hours on end. But it looks to me that
      the cost is definitely going to be in fractions of a cent/solution.
    
   Try solving a few of the Microsoft CAPTCHAs.  An experienced person 
   should take about 3 seconds.  Working nonstop 12 hours a day would 
   get you 14,400 solved CAPTCHAs.  

Whether it is 1 second or 3 is to some extent in the noise compared with
the other assumptions.

   I'll use my figure of 80 million CAPTCHA solved in order to deliver
   one million spam.

hmm... I did try to explain that 4 million might be wiser :(

   That means that every day the spammer is employing 5,556 workers
   using 5,556 computers that use electricity and may need air
   conditioning.

http://laptop.media.mit.edu/

   And the third world owner of this business needs a cut and you'll
   need security guards so that the computers won't get stolen and...
    
   However you crunch the numbers this is a major expense.

I agree it is an expense. It's just that you think it is 20 x what I do.

Unfortunately for the scheme design, that factor of 20 moves the cost
into an area where spammers could continue to operate efficiently.
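
As a rough check on the two sets of figures, the headcounts can be worked
out as below. This is only a sketch: it takes the quoted 3 seconds per
solution and 12-hour day as given, assumes the one million delivered spam
are all sent in a single day, and uses illustrative names throughout.

```python
import math

# Inputs taken from the quoted text; treated here as assumptions.
SECONDS_PER_SOLUTION = 3      # quoted estimate for an experienced solver
HOURS_PER_DAY = 12            # quoted length of the working day

# CAPTCHAs one worker can solve in a day: 12 * 3600 / 3
solutions_per_worker = HOURS_PER_DAY * 3600 // SECONDS_PER_SOLUTION


def workers_needed(captchas_per_day: int) -> int:
    """Whole workers required to solve the given number of CAPTCHAs in a day."""
    return math.ceil(captchas_per_day / solutions_per_worker)


kaplan_workers = workers_needed(80_000_000)   # the 80 million figure
clayton_workers = workers_needed(4_000_000)   # the 4 million figure

print(solutions_per_worker, kaplan_workers, clayton_workers)
```

With these inputs the 80 million figure needs 5,556 solvers and the
4 million figure needs 278, which is roughly the factor of 20 in dispute.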

For major disruption I'd like to see schemes where spammers had to
achieve savings of 100 or 1000 times what legitimate businesses had to.

Sadly, proof-of-work (however dressed up) does not have that property :(

Hence the only way to get it into the ballpark is to tack onto it some
sort of whitelisting scheme [or an equivalent blacklisting one]

The sub-addresses in the Kaplan scheme are whitelisting. However, I
don't think (hence my sums) that this proposal has a sufficient
multiplier effect to quite make it :(  [there are other issues as well,
but that's sufficient to kill it in my mind]

      Why does the filter suddenly improve when the email is sent for
      the second time (viz: it starts to discard 95% of the email that
      it approved earlier?).  Or -- same idea but different: why does
      the spammer send something that is filterable at the first stage?
    
      >>       Further, I'd dispute that applying two 95%-effective
      >>       spam filters has a net 99.75% success rate.
      >    
      >    Very well

      hmm... I think it needs more than that as a reply :( 
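
The 99.75% figure follows only if the second pass catches spam
independently of the first. A minimal sketch of that arithmetic follows,
under the assumption (one reading of the objection quoted above) that the
second pass may be the same filter seeing the same kind of message. The
function name and its parameters are hypothetical.

```python
def combined_catch_rate(first: float, second_on_missed: float) -> float:
    """Fraction of spam caught by two filters applied in sequence.

    first            -- catch rate of the first filter
    second_on_missed -- catch rate of the second filter, measured only on
                        the spam that the first filter let through
    """
    missed_by_both = (1 - first) * (1 - second_on_missed)
    return 1 - missed_by_both


# Fully independent filters: the second still catches 95% of what got through.
independent = combined_catch_rate(0.95, 0.95)

# Same filter run twice on the same message: it catches nothing new.
same_filter = combined_catch_rate(0.95, 0.0)

print(round(independent, 4), round(same_filter, 4))
```

The independent case gives 99.75%; the same-filter case stays at 95%. Any
real pair of filters presumably sits somewhere between the two.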
    
   During the harvesting phase the spammer must do what spammers never 
   do:  use a real and functional return address.  

they "never do" it because it isn't necessary in 2006

Once upon a time spammers did have return addresses ... which is why
"public.com" is nailed into codebases all over the planet :(

   We can speculate about how crippling this would be for the spammer.

I'd prefer some figures based on analysis. I'd note that receiving 4
million emails a day is less than a rack of kit...  think of it as being
equivalent to handling the incoming email for an ISP with about 50K
customers -- so not trivial, but not rocket science either
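
For scale, the 4 million a day figure can be turned into a per-second
rate and a per-customer load. This is a rough sketch that assumes the
traffic is spread evenly over the day (real traffic is burstier); the
variable names are illustrative.

```python
EMAILS_PER_DAY = 4_000_000     # figure used in the text
CUSTOMERS = 50_000             # size of the comparable ISP in the text
SECONDS_PER_DAY = 24 * 60 * 60

# Average arrival rate if the mail were spread evenly across the day.
emails_per_second = EMAILS_PER_DAY / SECONDS_PER_DAY

# Equivalent incoming mail per customer per day at the comparable ISP.
emails_per_customer = EMAILS_PER_DAY / CUSTOMERS

print(f"{emails_per_second:.1f} emails/second, "
      f"{emails_per_customer:.0f} emails/customer/day")
```

That is an average of under 50 messages a second, or 80 messages per
customer per day.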

   I'll assume that spammers will be forced to send poorly filterable
   material during the first round but the incredible burden of using
   a real return address may still allow for a degree of filtering.

I don't see that -- the "real return address" will continue to function
just fine until it gets onto a blacklist. That will not happen until the
spam is sent -- which can be a long time after the sub-address was
handed out.

   So we will say that it is on the second round that real spam is 
   sent and that 95% of this will be filtered.  

I'm accepting your figure there. Over time I expect that to get worse
rather than better (as spam morphs to be more like real email) but at
the moment that's realistic.

   Almost every commonly used domain is trusted, but this spam is
   using a sub-address that was sent to an untrusted domain; a
   stronger filter can be applied to sub-addresses sent to untrusted
   domains.

Unless that stronger filter is "drop all" then I don't accept that
somehow there are better filters :(  Leastwise not if they don't use
humans in the loop [which might be a better use of cheap labour than
solving CAPTCHAs -- the Good Guys can hire them to clean mailboxes]

   But also remember that it is very obvious which domains are sending 
   harvest spam.  

I don't see that at all -- you specifically make the point right at the
start of the explanation of the scheme that sub-addresses are entirely
transferable. You even put "used by anyone" into italics to emphasise
this point :(

   An ISACS utilizing email service provider may normally get only 50
   bounce generating emails a day from the little known untrusted
   domain Sleazy.com.  Now over the last 30 minutes 100,000 bounce
   generating emails come in from Sleazy.com.

Spammers aren't that dumb -- the emails will come from a wide range of
addresses...

For example sleazy-example(_at_)yahoo(_dot_)com, 
sleazy-example(_at_)msn(_dot_)com and so on.

If you're relying on Yahoo! and MSN to weed out Mr Sleazy (and I cannot
quite understand why you assume that) then the emails will come from
fred(_at_)sleazy1(_dot_)plausible(_dot_)com, 
bill(_at_)sleazy2(_dot_)plausible(_dot_)com etc

I don't accept the need for the spammer to set up all the sub-addresses
over "30 minutes" or to be consistent in their return addresses.

I also don't accept that it is easy to tell the difference between
plausible.com (apologies to the owner of that domain, but there is just
one) and uk.com (who sell example.uk.com domains to thousands of
distinct businesses) -- hence sub-domains will work well.
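
To illustrate the point about spreading return addresses, here is a
hypothetical per-domain volume check of the kind the quoted text seems to
assume. The threshold of 1,000 per window and the 2,000 sub-domains are
invented for the example; nothing here is part of ISACS as described.

```python
from collections import Counter


def flagged_domains(sender_domains, threshold):
    """Return the sender domains seen more than `threshold` times in a window."""
    counts = Counter(sender_domains)
    return {domain for domain, n in counts.items() if n > threshold}


THRESHOLD = 1_000  # hypothetical alarm level for one time window

# 100,000 bounce-generating emails all from one return domain.
single_domain = ["sleazy.com"] * 100_000

# The same volume spread over 2,000 sub-domains, 50 emails each.
spread_out = [f"sleazy{i % 2000}.plausible.com" for i in range(100_000)]

print(flagged_domains(single_domain, THRESHOLD))  # the one domain stands out
print(flagged_domains(spread_out, THRESHOLD))     # nothing crosses the threshold
```

Counting by parent domain instead would catch plausible.com here, but it
would equally flag a reseller such as uk.com, which is the difficulty
described above.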

   Now the second round of spam comes in using real sub-addresses but 
   spoofed "From" fields.  The email service provider can reject and 
   send ISACS bounces to all of these extremely suspicious 
   sub-addresses if they do not use the Sleazy.com domain.  

You seem to be redesigning your system :(  Your webpage specifically
says (in fact it puts it into italics) "These addresses can be used by
anyone."  but you now seem to be associating sub-addresses with
particular sources of email.

That puts your scheme right into the horrible mess that is forwarding
and is therefore unwise.  I suggest you redesign it back again :(

   Legitimate correspondents usually would resend the bounce from the
   same domain but ISACS usually allows them to use any domain.  Extra
   restrictions can be placed on these extraordinarily suspicious
   sub-addresses.  Or these extra-suspicious sub-addresses can just
   have a ridiculously strong filter applied to them.

I'm sorry, it's not possible to critique a system that is changing under
one's feet (or one that uses mythical filters with mutable qualities).

Set out more clearly what "extra restrictions" are.

Set out more clearly how you deal with forwarding.

Set out more clearly why you believe no damage is done to legitimate
emails by "ridiculously strong filters"

   There are endless ways to play with the numbers, but I'll stick 
   with the estimate of 1.6 billion spam emails with real return 
   addresses sent in order to deliver one million spam (And I repeat 
   the question - Is this even possible?) 

The "Dutch botnet" discovered last Autumn is reported to have 1.6
million machines in it (off-the-record reports say that there were a lot
more). If each machine sends 1000 emails a day (which is a factor of 50
or so less than can be easily achieved) then you have the volume desired.

So the answer to "is this even possible" is "regrettably, yes"
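
The botnet arithmetic behind that answer can be checked directly. This is
a sketch using only the figures quoted above; the factor of 50 is the
rough headroom mentioned in the text, not a measured value, and the names
are illustrative.

```python
MACHINES = 1_600_000              # reported size of the botnet
EMAILS_PER_MACHINE_PER_DAY = 1_000
HEADROOM_FACTOR = 50              # "a factor of 50 or so" from the text

# Daily volume at the conservative per-machine rate.
daily_volume = MACHINES * EMAILS_PER_MACHINE_PER_DAY

# Per-machine rate the text suggests is easily achievable.
achievable_per_machine = EMAILS_PER_MACHINE_PER_DAY * HEADROOM_FACTOR

print(f"{daily_volume:,} emails/day; "
      f"up to about {achievable_per_machine:,} per machine per day")
```

At 1,000 emails per machine the botnet reaches the 1.6 billion in question
in one day.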

- -- 
richard                                                   Richard Clayton

Those who would give up essential Liberty, to purchase a little temporary 
Safety, deserve neither Liberty nor Safety. Benjamin Franklin 11 Nov 1755

-----BEGIN PGP SIGNATURE-----
Version: PGPsdk version 1.7.1

iQA/AwUBQ9X9NZoAxkTY1oPiEQKmbwCfVATRkpTGfGiwGoHfzEewidxyefwAoJMN
dUbjqgn1KF/Iwc6DnruLAQAf
=8ZaK
-----END PGP SIGNATURE-----

_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg
