Re: Proposal to define a simple architecture to differentiate legitimate bulk email from Spam (UBE)


IMO, this (whether Hotmail will implement a specific feature) is a fairly 
irrelevant fork of the debate (the 80 of the 80/20 rule) relative to the main 
point of the proposal, so let's try to wrap this fork up in one or two rounds 
at most, okay?

Interestingly, note that Hotmail makes you pay to POP *FROM* Hotmail, but does
not charge to POP from other accounts *TO* Hotmail.  Does that give you any hint
about their business model?


Yes.  It's *NOT* a business model where they want to be polling a dozen servers
on a regular basis for each of their customers for mail that may or may not be
there, and for the average mailing list, probably is not there at any given
poll.



Not any more than they want to be POPing any email at all.  Nobody wants to do 
any work they do not have to do.  But if there is advantage for them, or a 
profit to be made, they do it.  If they do not, someone else will, then they 
lose marketshare.

They used to not POP email at all (do you remember or did you know that?).  
Then they discovered they were missing a big market of eyeballs.

 They want eyeballs, and the last thing they want to do is expend more
effort than needed to get eyeballs.



No disrespect intended, but that sentence is illogical.  You want something and 
you don't want to do something that gives you what you want.

You mean I guess that they would not agree to add effort just to retain 
existing eyeballs.  Again I disagree.  I think they will do whatever they have 
to in order to retain market share, as long as the cost doesn't kill more 
profit than it retains.

Sure - they can even optimize the 'POP the
list' check by only doing it once for all the subscribers - but they're still
hitting each server for each list on a several-times a day basis.  And under
the current scheme, they can just *catch* one SMTP transaction with all the
RCPT TO's piggybacked *when there's actual mail*.  So they'd have to work a lot
harder under your scheme.



POPing once (one list mailing) versus processing one email with a zillion RCPT 
TOs (one list mailing) is not a very big cost difference.  One might be 
slightly less than the other and we really can't say which one, but it is 
irrelevant because the difference is insignificant.

Actually it is more likely that when they POP they will get several messages at 
once, so it costs less than catching several SMTP emails.  

Also they know a priori the correlation of receivers to POP, which can be 
optimized with time, versus having to build a new mapping table in real-time 
every time they process an SMTP with RCPT TO.
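
As a rough illustration of the comparison above, here is a minimal sketch that counts connections per day under each scheme. Every name and number in it (lists per provider, polls per day, messages per list per day) is a hypothetical assumption for illustration, not a measurement of Hotmail or anyone else.

```python
def pop_connections_per_day(num_lists: int, polls_per_day: int) -> int:
    """Polling: one POP session per list per poll, shared by all subscribers."""
    return num_lists * polls_per_day


def smtp_connections_per_day(num_lists: int, msgs_per_list_per_day: int) -> int:
    """Push: one SMTP transaction per list message, all RCPT TOs piggybacked."""
    return num_lists * msgs_per_list_per_day


# Hypothetical figures, for illustration only.
lists = 1000
pop = pop_connections_per_day(lists, polls_per_day=4)
smtp_quiet = smtp_connections_per_day(lists, msgs_per_list_per_day=1)
smtp_busy = smtp_connections_per_day(lists, msgs_per_list_per_day=20)
print(pop, smtp_quiet, smtp_busy)
```

Under these assumed numbers, polling costs more connections on quiet lists and fewer on busy ones, because a single POP session can fetch several messages at once.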

And let's *THINK* for a moment here - what is your proposal *REALLY* going to
change?  We already have many estimates that 50% or so of all e-mail is spam.
Let's take that as a given, and let's make the rash assumption that the rest is
25% mailing list traffic and 25% person-to-person.



It would be more interesting to know what the real stats are on the other 50%, 
because I doubt that 25% is legitimate bulk email.  It seems that you live in a 
different (mailing list centric) world than I and most "normal" people live in. 
 I join mailing lists for a short time to get something done, then I leave 
asap.  Most of the people I know and the many thousands of customers I come 
into contact with, seem to not even know how to use a mailing list.

With 500 million people on the internet, I would venture that 80% don't even 
know what a mailing list is.  They may use Yahoo Personals, and not even 
realize it is a mailing list.  Since the email is being directly deposited into 
the Yahoo account, they have no clue.

Anyway, let's follow your line of debate...

So what you want to do is take the 25% of the list traffic that works just fine
on the current infrastructure,


No, it doesn't work fine.  My gf complained that she couldn't find her Yahoo 
Personals email amongst the 500 spams she gets per day.  Of course that makes me 
happy, but that is beside the point :)

and is usually quite easily whitelistable via a
number of different methods -



Whitelisting can be subverted by spammers:

http://www.cnn.com/2003/TECH/internet/09/01/spam.chainletter/index.html

"...Herrick, however, admits that the practice could be a good way to bypass 
e-mail filters which block messages from senders who are not known to the 
recipient. Spammers could use chain letters to discover the addresses of people 
with whom you frequently communicate. Spam purporting to be from someone in 
your address book would sneak by filters. 
"If I were a spammer, I'd be working very hard to perfect this technique," he 
said..."

and move it to something totally different.
And what you're left with is a 2-1 mix of spam and personal mail that you
yourself admit things like the DCC and spam filters are unable to perfectly
distinguish.
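
The "2-1 mix" follows directly from the shares assumed in the quoted text. A minimal sketch of that arithmetic, using only the 50/25/25 split given above (itself described as a rash assumption, not a measured statistic):

```python
from fractions import Fraction

# Shares assumed in the quoted text: 50% spam, 25% list traffic, 25% personal.
shares = {
    "spam": Fraction(50, 100),
    "list": Fraction(25, 100),
    "personal": Fraction(25, 100),
}

# Take list traffic out of the stream and renormalize what is left.
remaining = {kind: share for kind, share in shares.items() if kind != "list"}
total = sum(remaining.values())
mix = {kind: share / total for kind, share in remaining.items()}

ratio = mix["spam"] / mix["personal"]
print(ratio, mix["spam"], mix["personal"])
```

With list traffic removed, spam is two thirds of what remains and personal mail one third, which is the 2-to-1 ratio.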


The whole point of the change is to enable elimination of the spam, which 
cannot currently be done.

See my response to John C Klensin, regarding "chicken and egg" and the example 
benefits to attacking spam:

http://www1.ietf.org/mail-archive/ietf/Current/msg22050.html

Having exiled the mailing list traffic,  we would then be able to work on
separating the spam from non-spam - but as you already noted yourself, we don't
know how to do that yet.



Yes we probably do.  Just because the DCC cannot measure bulk email reliably 
doesn't mean Hosts, ISPs, and other software cannot.  BrightMail already does 
(just sign up for an Earthlink account and try really hard to get some spam), 
and I will also probably be demonstrating something soon.

 And getting rid of the mailing list traffic doesn't
in fact gain us anything at all, since everybody who filters list traffic into
separate folders for each list knows that isn't the problem - it's the
unfiltered stuff that's left in the inbox.



You are missing the point, which is that until you can say that all bulk email 
is spam, you can't target spam.  How could ISPs, Hosts, legislators, and 
judges know the difference between legitimate email and spam?  Again see the 
targeting benefits:

http://www1.ietf.org/mail-archive/ietf/Current/msg22050.html

Worse as it stands now, mailing list traffic can often get misidentified as 
spam, unless it is a well established list.

I'll note in passing that the two highest SpamAssassin scores I've ever seen
were both on legitimate postings to mailing lists -  both were humor pieces
about spam....


I already wrote publicly in 2002 that Bayesian and other content filtering 
methods cause more problems than they solve.

Quite frankly, given that at least half the spam I get is already in obvious
violation of at least one law (pick one - securities fraud, advance-fee scams,
wire fraud, bogus pharmaceuticals, or hijacking a proxy to send the mail), I
severely doubt that anything the IETF does in regard to standards will make a
difference. The spammers often don't even bother following RFC822 - why should
they follow your scheme?


Again you are missing the whole point.

It has nothing to do with what spammers will or will not do.  It has to do 
with what Hosts, ISPs, etc are currently prevented from doing.  Since they can 
not determine what is spam, they can not enforce any law.  The practicality 
of blocking email based on a wide range of hard-to-prove laws is nil.  There 
would be too much liability for the enforcer if they do not successfully win 
the criminal case.  Whereas if you have a simple, clear cut metric as in my 
proposal, then ISPs, Hosts, etc can take action and will take action because 
spam is one of their major costs.

The *only* two ways to get rid of spam both involve making it non-profitable.

The first is lowering the generated income.



Agreed.  I have written a thesis on this entitled, "Fragile (yes I think so!) 
Economics of Spam"

 Given that recently, somebody
hacked the site of a "fertilizer for your body part" scam, and found a list of
6,000 people who had paid $50 a bottle, I have to sadly conclude that Kornbluth
and Barnum were both correct, there's one born every minute and the rate is
increasing.  So there's no joy to be found there.



I've read the theories, and you have to realize that spammers' margins are very 
thin.  It won't take too much to topple the boat.  The problem is that the 
architecture is not adequate to increase their costs significantly yet.  That 
is why I made this proposal.
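
To make the thin-margin argument concrete, here is a minimal break-even sketch. The revenue figure comes from the example quoted above (6,000 buyers at $50 a bottle); the message volume is a purely hypothetical assumption, and product and other costs are ignored, so the result is only an upper bound on what the spammer can afford per message.

```python
def break_even_cost_per_message(revenue: float, messages_sent: int) -> float:
    """Per-message sending cost at which the campaign stops being profitable."""
    return revenue / messages_sent


# From the quoted example: 6,000 buyers at $50 a bottle.
revenue = 6_000 * 50

# Hypothetical volume, an assumption for illustration only.
messages_sent = 100_000_000

threshold = break_even_cost_per_message(revenue, messages_sent)
print(f"revenue=${revenue}, break-even cost per message=${threshold:.4f}")
```

If the architecture pushes the sender's cost per message above that threshold, the campaign loses money, however many buyers respond at the assumed rate.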

The second is raising the cost to the spammer.


Agreed 100%.   See above.

 Personally, I like the idea of
taking up a collection among the ISPs and other providers, and hiring some good
ethnic muscle (there's competition in the field, a number of experienced and
ruthless groups are available).  I'm sure the spam problem would change
drastically if the spammer was seriously having to balance the mentioned $300K
for bogus enhancement pills against having their kneecaps broken by one group
or worse by one of the other groups...



That is not an effective deterrent, as evidenced by the drug war and crime in 
general.  You actually have to make it more expensive to send than to receive.  
That is what my proposal is all about.


Pity that will never work though.  At least not officially (although one 
infamous New Zealander apparently retired recently...)

:-)

Re: Proposal to define a simple architecture to differentiate legitimate bulk email from Spam (UBE)