ietf

Re: going off-list, Re: proposal for built-in spam burden & email privacy protection

2004-02-13 09:02:54
On Thu, 12 Feb 2004, Dean Anderson wrote:

On Thu, 12 Feb 2004, Ed Gerck wrote:
Someday, however, users will want to stop using postcards for all 
their electronic conversations. At that time, at zero added cost, 
we can easily introduce a mandatory per-message burden to spammers 
and make it backward compatible (so that we don't disrupt anything). 
The proposal points out that both goals (privacy and anti-spam) can 
be served not with signing but with encryption (even though, as an 
add-on, signing may also help).

This is where the thinking is not clear: You can't add a cost to only
spammers.  Any increase in the cost of email would affect other large
email users (mailing lists, and such) at best as much as spammers, and
possibly more.  Most spam is sent from infected computers, so the
spammer wouldn't pay anyway (whether it is money or computational power).

Further, any cost increase in email that is less than the cost of bulk
postal mail will not deter genuine spammers. But even the regular user
would feel the crunch if each email cost $0.37.  If the IETF had to pay
$0.37 per email, or even $0.15 per email, its $2 million/yr or so budget
would not cover its email costs, and your draft would not be published.
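
The budget arithmetic in the paragraph above can be checked with a short sketch. It only computes how many messages the quoted budget would cover at each per-message price; the actual yearly message volume is not stated in this thread, so no comparison against it is attempted. The function name is made up for illustration.

```python
# How many messages a fixed budget covers at a given per-message charge.
# BUDGET_DOLLARS is the "2 million/yr or so" figure quoted above; the two
# prices are the ones mentioned in the same paragraph.
BUDGET_DOLLARS = 2_000_000

def messages_covered(cost_per_message, budget=BUDGET_DOLLARS):
    """Whole number of messages the budget pays for (hypothetical helper)."""
    return int(budget / cost_per_message)

for cost in (0.37, 0.15):
    print(f"${cost:.2f} per message: about {messages_covered(cost):,} messages/yr")
```

Since a single post to a large list fans out into one delivery per subscriber, the number of deliveries, not the number of posts, is what would be charged against these totals.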

I totally agree with Dean.  I see no significant increase in costs to
the spammers, who use almost entirely automated tools in any event, even
when one DOES consider their local resource consumption.  This isn't
just a matter of opinion -- one can easily measure what the worst-case
increase in "cost" to spammers will be.

Currently, email addresses are relatively "simple" objects and as such
are easy enough to remember (for humans) and communicate (for humans).
You propose to make an address a "complex" object: the simple address
plus kilobyte-sized blocks of text or binary data such as:

-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v1.0.7 (GNU/Linux)

mQGiBDprOxURBADW+ybmMpoBfbRADv3OzXaaIQVWGDoaNsR4uxstcJgo+FYMcO9B
Ag7eXRLfOQFPPS3211W/f1/GaQa70ZaPZaQV/A9POIGfRIbcfePHaIUUXYulO3Nh
Bqewb5JZjzkj0yZWdjzK/OSBEPITXMXyTDVnG0+Y4YHHLDOODnnkSOlZKwCg453O
...(much much more)

BOTH components will need to be included in a mail header, presuming that
when one sends mail one WILL want one's correspondents to be able to
read and reply to one's message.  BOTH components will have to be
published in various ways and given out if one wants to continue to be
able to receive mail, be it from friends or strangers.

Automated grazers that collect the simple objects now can just as easily
collect the complex objects since they will of necessity both be
associated with any publication of somebody's address.

On the other hand, this additional degree of complexity will be a
horrible burden in time, energy, and money to ordinary users of mail at
every level and especially to systems administrators responsible for
training those users and de facto responsible for managing the data
problem the composite "address" object would represent.  Communicating,
managing, and remembering email addresses like "joe(_at_)whereever(_dot_)edu" is
already "challenging" to many, if not most, users of email and is
obviously equally challenging at the corporate/professional level.  Let
us recall that most current AV software is still stupid enough to
consider the From address to actually be valid for viruses and SPAM and
cannot even manage to parse the real message header in sensible ways.
Now visualize these same bozos trying to deal with messages whose header
contains a 1-2K key :-/

Obviously, people will no longer be able to just say "here's my email
address".  Business cards will need to be roughly 4x6" and covered
mostly in small type.  Most users will simply rebel.  Most systems
people will simply rebel.  People CAN send encrypted, signed email now,
but how many do?  It's cumbersome and unnecessary for all but a certain
(small) class of sensitive traffic.

You are also suggesting that the encryption step, by being "expensive"
computationally, will at least slow down the spammers.  This assertion,
which has been made several times now, is simply not true.  Encryption
as actually deployed is NOT terribly demanding in terms of numerical
operations required per encrypted byte: the RSA-type public/private key
step is applied only to a short session key, and a fast symmetric cipher
encrypts the message body itself, which is one reason the combination
is so popular.  Both SSL and SSH routinely encrypt realtime traffic this
way with little or no perceptible delay relative to other sources of
latency.  I use tools like rsync on
top of ssh to send entire compressed AND encrypted filesystem trees at
rates that (greatly) exceed a MB/second.  CPUs are powerful and cheap
and then there is Moore's Law.

Even with gpg used as a command line tool, it requires less than a
second to encrypt roughly a megabyte (possibly less -- the process is
likely not optimized for speed in the tools I'm using to crudely
measure).  That would be 500-2000 "typical" SPAM messages per second.  I
expect that this rate would easily increase n-fold if the relevant code were
extracted and written into an optimized binary, as gpg as a command line
tool has lots of per-invocation overhead that could be eliminated if
one's goal was simply to take a single message and a DB of keys and feed
an MTA queue.
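
The arithmetic behind the 500-2000 figure can be sketched as follows. The message sizes (500 and 2000 bytes) are assumptions chosen to match the quoted range, and the one-megabyte-per-second rate is the crude gpg timing mentioned above, not a careful benchmark.

```python
# Rough throughput arithmetic; all inputs are illustrative assumptions.
ENCRYPT_BYTES_PER_SEC = 1_000_000  # ~1 MB/s, the crude gpg figure quoted above

def messages_per_second(message_bytes, bytes_per_sec=ENCRYPT_BYTES_PER_SEC):
    """Messages encrypted per second if encryption throughput is the only limit."""
    return bytes_per_sec / message_bytes

for size in (500, 2000):  # assumed "typical" spam sizes in bytes
    print(f"{size}-byte message: {messages_per_second(size):.0f} msgs/sec")
```

Any per-invocation overhead that is removed raises the effective bytes-per-second figure and scales the result proportionally.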

Compare that as a bottleneck to the time required to send a thousand
email messages in the first place.  For each message there is a
nameservice hit to resolve the address, a protocol negotiation phase of
indeterminate length as the MTA connects to a remote agent and sends the
message on its way, and the local bandwidth bottleneck.  In nearly all
cases of stealth/virus-driven SPAM, the per-message latency will likely
exceed the encryption time by a factor of two or more (quite possibly a
LOT more).  A T1 link, for example, peaks at 1.544 Mbit/sec (a bit under
0.2 MB/sec) even if you are pouring data through in an optimal stream.
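
A back-of-the-envelope version of this comparison is sketched below. The T1 line rate of 1.544 Mbit/sec is the nominal figure; the message size, the encryption rate, and especially the half-second allowance for name lookup and protocol negotiation are assumptions picked only to show the shape of the calculation.

```python
# Per-message cost comparison; every number here is an illustrative assumption
# except the nominal T1 line rate.
MESSAGE_BYTES = 2_000              # assumed size of one spam message
ENCRYPT_BYTES_PER_SEC = 1_000_000  # crude gpg figure (~1 MB/s)
T1_BITS_PER_SEC = 1_544_000        # nominal T1 line rate
DELIVERY_LATENCY_SEC = 0.5         # assumed DNS lookup + SMTP negotiation time

encrypt_sec = MESSAGE_BYTES / ENCRYPT_BYTES_PER_SEC
wire_sec = MESSAGE_BYTES * 8 / T1_BITS_PER_SEC
deliver_sec = DELIVERY_LATENCY_SEC + wire_sec

print(f"encrypt: {encrypt_sec * 1000:.1f} ms per message")
print(f"on the wire: {wire_sec * 1000:.1f} ms per message")
print(f"delivery / encryption: {deliver_sec / encrypt_sec:.0f}x")
```

With these inputs the time on the wire alone is already larger than the encryption time, before any lookup or negotiation latency is counted.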

Note that the encryption step is trivially parallelizable.  That means
that for a given MTA, EVEN if one is having a hard time keeping it busy
because of time spent encrypting (which I doubt), one can add one or
more cheap systems to the network to function as encryption slaves that
serve no purpose but to keep the MTA's outbound queue full.
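
A minimal sketch of that arrangement follows, with worker threads standing in for the extra machines. The fake_encrypt function is a placeholder (a one-byte XOR, NOT real cryptography) so that the example runs with the standard library alone; the function names and the job layout are hypothetical.

```python
# Sketch of farming per-recipient encryption out to parallel workers.
# fake_encrypt is a stand-in, not real cryptography; a real setup would
# call an actual encryption tool or library at that point.
from concurrent.futures import ThreadPoolExecutor
from queue import Queue

def fake_encrypt(message: bytes, key: int) -> bytes:
    """Placeholder transform: XOR each byte with a one-byte key."""
    return bytes(b ^ key for b in message)

def fill_outbound_queue(jobs, workers=4):
    """Process (message, key) pairs in parallel and queue the results."""
    outbound = Queue()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each job is independent, so the work splits cleanly across workers.
        for ciphertext in pool.map(lambda job: fake_encrypt(*job), jobs):
            outbound.put(ciphertext)
    return outbound

jobs = [(b"same message body", key) for key in range(1, 9)]  # one key per recipient
outbound = fill_outbound_queue(jobs)
print(outbound.qsize(), "messages ready for the MTA")
```

Threads are used here only to keep the sketch short. Because each message is independent of the others, the same structure maps directly onto separate processes or separate cheap machines feeding one queue.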

I think that it is clear that your proposal won't have any significant
impact on SPAM generation, but it will have an enormous, and negative,
impact on ordinary mail users.

This is just a bad idea.  It is a VERY bad idea to push it as an actual
protocol requirement.  It is a bad idea to push it as best practice.  It
isn't even that great an idea to push it as a future design
recommendation.  It might be an "interesting" idea to discuss on an
appropriate list, and of course it is fair game for entrepreneurs and
developers with a vision and a perceived marketplace.

This is an issue that can be completely decided by those programmers
with a vision and the marketplace, not by a task force responsible for
developing protocols and open standards.  If one feels for any reason
that MTAs should be integrated with public key databases and
nameservices and key certificate agents extended to the level of
individuals, the protocols and open-standards-based tools all exist for
one to create software to support it (and indeed some individuals
already USE keys, signatures, and encryption for various purposes
including mail).  

So fine, invest the development effort to create the product and see who
buys it, or in the open source world, who adopts it and distributes it.
It is not appropriate for us to either mandate or suggest that ALL
developers "should" modify all their products to use a particular
layering of these tools as the default standard and force developers to
invest the resources required to do so at the very substantial risk that
no reward awaits them in the marketplace for having done it.

   rgb

-- 
Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     
email:rgb(_at_)phy(_dot_)duke(_dot_)edu