[Asrg] 2. - Spam Characterization - Possible Measurements (was : RE: Two ways to look at spam)

2003-07-02 14:54:55

Here is a list of characteristics that I'd put together. They are grouped by
sending, source, message, and spam attack characteristics. I've also added
the three suggested by Barry. What others should we consider?

Sending Characteristics: 
Forged email addresses
Forged HELO domains
Forged Received headers
Lacking reverse DNS entry

Source characteristics: 
Dial-up accounts
Geographic location
Relays 
Proxies 
Zombie machines 
Free email services
Time-domain analysis (suggested by Barry)
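
One possible way to put a number on the time-domain item is to ask how much the set of source networks overlaps from one time window to the next. The sketch below is only an illustration under stated assumptions: the input format (pairs of Unix timestamp and IPv4 address), the one-hour window, the /24 grouping, and the function name `source_persistence` are all made up here, not something the group has agreed on.

```python
from collections import defaultdict
import ipaddress

def source_persistence(observations, window_seconds=3600, prefix_len=24):
    """observations: iterable of (unix_timestamp, ipv4_string) pairs (assumed format).

    Returns one Jaccard overlap value per pair of consecutive non-empty
    time windows, comparing the sets of source networks seen in each.
    """
    windows = defaultdict(set)
    for ts, ip in observations:
        # Group addresses by network so neighbouring hosts count as one source.
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        windows[int(ts // window_seconds)].add(net)

    keys = sorted(windows)
    overlaps = []
    for a, b in zip(keys, keys[1:]):
        union = windows[a] | windows[b]
        overlaps.append(len(windows[a] & windows[b]) / len(union))
    return overlaps
```

Values near 0 would suggest sources that look random from window to window; values near 1 would suggest the same places sending predictably.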

Message characteristics: 
Call for action
        -URL
        -email addr
        -phone num
        -physical addr
        Stability of web addresses etc. advertised in spam (suggested by Barry)
Dates
        -past
        -future
Size of message
Attachments
Image usage (web bugs and remote images)
MIME type
X headers
HTML usage
        -specific tricks used
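
Several of the message characteristics above can be read straight off a raw message. The following is a minimal sketch using Python's standard `email` package. The function name `message_features`, the returned field names, and the one-day clock-skew tolerance are assumptions for illustration only.

```python
from datetime import timedelta, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime

def message_features(raw, received_at, skew=timedelta(days=1)):
    """Extract a few of the listed characteristics from one raw message.

    raw: the message as a string; received_at: timezone-aware datetime
    of when the message arrived (both are assumed inputs).
    """
    msg = message_from_string(raw)

    # Classify the Date header relative to the time of receipt.
    date_status = "missing"
    if msg["Date"]:
        try:
            sent = parsedate_to_datetime(msg["Date"])
            if sent.tzinfo is None:
                # No usable zone in the header; treat it as UTC (assumption).
                sent = sent.replace(tzinfo=timezone.utc)
            if sent > received_at + skew:
                date_status = "future"
            elif sent < received_at - skew:
                date_status = "past"
            else:
                date_status = "ok"
        except (TypeError, ValueError):
            date_status = "unparseable"

    parts = list(msg.walk())
    return {
        "size_bytes": len(raw.encode("utf-8")),
        "date_status": date_status,
        "mime_types": sorted({p.get_content_type() for p in parts}),
        "attachments": sum(1 for p in parts if p.get_filename()),
        "x_headers": sorted(k for k in msg.keys() if k.lower().startswith("x-")),
        "uses_html": any(p.get_content_type() == "text/html" for p in parts),
    }
```

Call-for-action details (URLs, phone numbers) and specific HTML tricks would need body parsing on top of this and are left out of the sketch.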

Spam attack characteristics: 
How long does a flood last
How many recipients
The above two broken down by sending approach, such as relays vs. proxies
(based on Barry's #3)
How distributed are the recipients
How many servers used for the same flood
What tools are used
How long does it take a new account to be spammed
Address harvesting characteristics
Dictionary attack characteristics
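
The first few flood measurements could be computed roughly as follows, assuming messages have already been grouped into floods by some other means. The record fields (`flood_id`, `timestamp`, `recipient`, `server`) and the function name `flood_summary` are hypothetical, and how to decide that two messages belong to the same flood is left open.

```python
from collections import defaultdict

def flood_summary(records):
    """records: iterable of dicts with the assumed keys
    flood_id, timestamp (Unix seconds), recipient, and server."""
    floods = defaultdict(list)
    for r in records:
        floods[r["flood_id"]].append(r)

    out = {}
    for fid, rows in floods.items():
        times = [r["timestamp"] for r in rows]
        recipients = {r["recipient"] for r in rows}
        out[fid] = {
            # How long the flood lasted, first message to last.
            "duration_seconds": max(times) - min(times),
            # How many distinct recipients were hit.
            "recipients": len(recipients),
            # Rough measure of how distributed the recipients are.
            "recipient_domains": len({a.rsplit("@", 1)[-1].lower() for a in recipients}),
            # How many sending servers took part in the same flood.
            "servers": len({r["server"] for r in rows}),
        }
    return out
```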

All of the above characteristics can also be viewed over time.

-----Original Message-----
From: Barry Shein [mailto:bzs(_at_)world(_dot_)std(_dot_)com] 
Sent: Wednesday, July 02, 2003 3:45 PM
To: Yakov Shafranovich
Cc: Barry Shein; 'asrg(_at_)ietf(_dot_)org'
Subject: RE: [Asrg] Two ways to look at spam



Well, some ideas:

1. Some sort of time-domain analysis of where spam actually comes from
   (ip addresses, nets.)

If it's seemingly random that would point towards the theory 
that it's just (presumably illegally) exploited machines.

If it's coming from specific places with some predictability 
then that would lean towards more consent-based conclusions.

2. Stability of web addresses etc advertised in spam.

I've heard it claimed (by one of the speakers at the MIT spam 
conf) that the typical lifespan of a spamvertised website is 
two hours.

Again, that sort of instability tends to promote the idea of 
spam being a product of criminal behavior.

3. Stability of relays

Similar, but how long does a spam relay spew spam, typically 
(what's the distribution)? One hour? 12 hours? Years? And 
related summary statistics such as the number of msgs spewed, 
the time domain (is it bursty or continuous), etc.
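
As a rough illustration of item 3, per-relay lifespan, message count, and burstiness could be derived from a log of (relay, timestamp) events. This is only a sketch: the event format and the name `relay_stats` are assumptions, and the coefficient of variation of inter-arrival gaps is just one of several possible burstiness measures.

```python
from collections import defaultdict
from statistics import mean, pstdev

def relay_stats(events):
    """events: iterable of (relay_id, unix_timestamp), one per message seen."""
    by_relay = defaultdict(list)
    for relay, ts in events:
        by_relay[relay].append(ts)

    stats = {}
    for relay, times in by_relay.items():
        times.sort()
        gaps = [b - a for a, b in zip(times, times[1:])]
        avg_gap = mean(gaps) if gaps else 0
        stats[relay] = {
            "messages": len(times),
            # Time between the first and last message seen from this relay.
            "lifespan_seconds": times[-1] - times[0],
            # Coefficient of variation of the gaps: 0 for evenly spaced
            # traffic, larger for bursty traffic; None if undefined.
            "gap_cv": pstdev(gaps) / avg_gap if avg_gap > 0 else None,
        }
    return stats
```

The distribution asked about would then be the histogram of `lifespan_seconds` across all relays.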

       -b

On July 1, 2003 at 17:30 research(_at_)solidmatrix(_dot_)com (Yakov Shafranovich) wrote:
 > At 03:30 PM 7/1/2003 -0400, Barry Shein wrote:
 >
 > >  At 03:35 PM 6/29/2003 -0400, Paul Judge wrote:
 > >  > >Just as in any other business, the profit in spamming is equal to revenues
 > >  > >minus costs. In spamming, revenue is equal to the number of spam messages
 > >  > >received times the response rate times the profit per item. Expenses include
 > >I will point out that the hard evidence for this is lacking.
 > >
 > >[..]
 > >More to the point I would assert that if we don't endeavor to nail
 > >down hard evidence and work forward from there we're in great danger
 > >of shadow-boxing with our own imaginings about how we would like to
 > >think spammers operate.
 > >
 > >I realize the urge to show progress is great and fact-gathering sounds
 > >like a frustrating impediment to some, but...how bad would it be if
 > >our efforts turned out to be foolish and disconnected from reality,
 > >research into a June bug*?
 >
 > Great, what kind of evidence or things should we be looking for? From
 > (http://www.irtf.org/asrg/asrg-work-items.txt):
 >
 > ---snip---
 > 2.a. Spam Measurements. This works needs to be focused on immediately. This
 > data will help us understand the current weaknesses in the system and where
 > efforts should be focused. Requirements need to be set and then we have to
 > gather the data. I see two separate paths here: One is based on user survey
 > input. Ted Gavin has volunteered to conduct this. The other data is based
 > on real spam measurements. Once the requirements are gathered, Brightmail,
 > CipherTrust, CloudMark and MessageLabs have each volunteered to contribute
 > information. Any other volunteers?
 > ---snip--
 >
 > As you can see Brightmail, CipherTrust and a bunch of others agreed to
 > provide data. All we need is to define what we are looking for.
 >
 > Yakov

_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg

