ietf-asrg

RE: [Asrg] 2a. Analysis - Spam filled with words

2003-09-12 12:05:55
|-----Original Message-----
|From: asrg-admin(_at_)ietf(_dot_)org [mailto:asrg-admin(_at_)ietf(_dot_)org]
|On Behalf Of Kee Hinckley

<snip>

|The amount of work people are throwing at content-analysis stuns me. 

<snip>

| will continue to change (especially as spammers sell more and more 
| mainstream products).  People keep treating this as though it were a 
| technical problem that can be "solved" in some sense.  It's not. 
| It's an ongoing battle.  As such, you should pick those areas where 
| the enemy has the least flexibility and attack there.  In content 
| they have *infinite* flexibility.

Actually, the "message" of the spammer has some rigidity to it so it's
not without merit to attack there under this argument. The *infinite*
flexibility you speak of is not truly there - only apparently.

|Or to put it another way.
|
|Peolpe keep taniretg tihs as thoguh it wree a tichcaenl plreobm taht 
|can be "seolvd" in smoe ssene.  It's not.  It's an ogninog bttale. 

By the way... this obfuscation technique could easily be generalized and
captured by many content analysis systems...
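
To make that concrete, here is a minimal sketch (not any particular filter's method) of one way a content analysis system could undo this kind of scramble. It assumes the obfuscation keeps the first and last letter of each word and only shuffles the interior, as in the quoted sentence. The names scramble_key, build_index and normalize, and the small vocabulary, are hypothetical and exist only for illustration.

```python
import re

def scramble_key(word: str) -> str:
    """Canonical form: first letter + sorted interior letters + last letter."""
    w = word.lower()
    if len(w) <= 3:
        return w  # nothing to shuffle in words this short
    return w[0] + "".join(sorted(w[1:-1])) + w[-1]

def build_index(vocabulary):
    """Map each canonical key to a known word (first one wins on collisions)."""
    index = {}
    for term in vocabulary:
        index.setdefault(scramble_key(term), term.lower())
    return index

def normalize(text: str, index) -> str:
    """Replace every word whose key is known; leave other words untouched."""
    def fix(match):
        word = match.group(0)
        return index.get(scramble_key(word), word)
    return re.sub(r"[A-Za-z]+", fix, text)

# Hypothetical vocabulary; a real system would use its own token list.
VOCAB = ["people", "keep", "treating", "this", "as", "though",
         "it", "were", "a", "technical", "problem"]
INDEX = build_index(VOCAB)

print(normalize(
    "Peolpe keep taniretg tihs as thoguh it wree a tichcaenl plreobm",
    INDEX))
# prints: people keep treating this as though it were a technical problem
```

Two different words can share a key, so a real system would need some way to break ties. The sketch only shows that the scramble leaves a stable signature behind.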

|But this group is not about comparing techniques for blocking spam, 
|so I'm not sure this is a productive thread.  If you want to know how 
|I *would* (and do) block spam without looking at content, contact me 
|off line.

Agreed. However, as this group is about developing mechanisms for
consent, and since there is no universally applicable definition for
*spam*, discussions of content analysis and other mechanisms for
identifying spam are important parts of the activity of this research
group.

In order to define consent in a way that is executable we must define
the "to what" part of the equation. In practical terms, I frequently
have two users on the same system disagree about the definition of a
single message and that definition is often only resolvable by content
analysis... 
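
As a rough illustration of what "executable" could mean here, the sketch below gives each receiver their own content preferences and evaluates the same message against both. Everything in it is an assumption made for the example: the ReceiverPolicy class, the keyword-scoring rule, and the sample users are hypothetical and are not a proposal from this list.

```python
import re
from dataclasses import dataclass, field

@dataclass
class ReceiverPolicy:
    """Hypothetical per-receiver statement of consent, keyed on content."""
    wanted: set = field(default_factory=set)
    unwanted: set = field(default_factory=set)

    def consents_to(self, message: str) -> bool:
        words = set(re.findall(r"[a-z]+", message.lower()))
        # Simple assumed rule: wanted terms count for, unwanted against;
        # a message with no matching terms is accepted by default.
        score = len(words & self.wanted) - len(words & self.unwanted)
        return score >= 0

# Two users on the same system, same message, different answers.
alice = ReceiverPolicy(wanted={"ink", "printer"})
bob = ReceiverPolicy(unwanted={"ink", "discount"})

msg = "Discount ink cartridges for your printer"
print(alice.consents_to(msg))  # True
print(bob.consents_to(msg))    # False
```

The sender and the message are identical in both calls; only the receiver's own definition differs, which is the disagreement described above.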

With the exception of explicit consent tokens and challenge/response
(C/R) mechanisms (which have their ups and downs) there is no clear way to
establish "what" a RECEIVER is giving CONSENT to receive. In the end,
the best practice will be for the RECEIVER to have any and all
mechanisms at their disposal for this purpose so that the greatest
diversity of needs can be met. Also, as with much of the Internet, once
the raw technology and mechanisms for this ability are in place the
costs of these tools will sink into obscurity as a tiny fraction of the
infrastructure. All of this is heightened right now because it's not
managed well. That will change one way or another; it is up to us to
decide how it will change so that we don't end up someplace we don't
want to be. Simply getting to "someplace else" without deciding
"where" is like hitting the hyperspace key - you might end up in the
middle of a rock. (For the Asteroids fans.)

Spammers are being used to market mainstream products more and more
frequently (this is sad but inevitable), and the same organizations
that send out ink, insurance, and travel spam are just as likely to send
you the IBM newsletter you signed up for or your latest RedHat notices.
Need I mention McAfee and Norton who seem to have an army of spammers
selling their wares... all without their permission of course (yeah
right). Or what about the amazing (cough) trend of anti-spam vendors
spamming to sell their software! (I hate those guys - we filter them and
we look like we're being unfair, we don't filter them and we get pounded
with customer complaints and submissions... and we're not allowed to use
this practice!! - we refuse!!!)

It's a zoo.

By all of that I'm actually agreeing with you that sorting out what is
"legitimate" from what is not is now, and will continue to be,
extremely difficult.

That said, if RECEIVERs are going to have control of what they receive
then content analysis will have to play a role in that mechanism -
simply because recipients, more often than not, define what they want
and don't want by the content of the message, not by the sender, not
even by agreements they may have with that sender (explicit or implied).

Filtering systems are now and will be as important as search engines for
precisely the same reasons... the amount and variability of information
that is "out there" or, in the case of email, on its way to your mailbox
is *infinite*. Selecting what you want and ignoring what you do not want
is the critical task of the information age. 

Making that capability a practical reality requires technology - and
content analysis is part of that arsenal.

People keep thinking of this problem in terms of abuse, attacks and
illegal activities that should be stopped... much of that is true since
there are few controls in place... however the deeper "meaning" in all
of this mess is that once you can access anything you ever could want -
you then need a mechanism to manage that ability.

The information age is upon us. You can get anything you want
(practically) on the 'net.... and, as it turns out, it can get you too
(whether you want it or not) - if you don't have tools in place to
control that aspect of things.

Be careful what you wish for - you might get it. We all did.

</soapbox>
_M


_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg