ietf-asrg

Re: [Asrg] 1c. Analysis of Spam - biological approach

2004-02-19 12:01:50
[Note: I am not a biologist by training, but a CS person who has an interest in it.]

Eugene Crosser Wrote:

> A. parasites have negligible or positive effect on the host's health
>  (in this case, they are called symbionts).
> B. parasites have negative impact on the host's health (the host
>  suffers from a disease).

These points somewhat depend on what one is viewing as the 'host'. That
is, are we thinking of the host as the entire mail system, or an end user?

(A) From the end-user point of view, the relationship can be thought of
as loosely symbiotic (but the effect is neither negligible nor positive
for the host). If one takes an economic view, a spammer _needs_ some
people to respond to their messages, whether that means visiting a web
site or making a purchase.

(B) This is the general state of the mail system today: unhappy end
users, mail providers requiring additional infrastructure, etc.

> B.1. The host gets active defense against parasites and eliminates them
>  (gets well himself, or thanks to medical treatment). Parasites die.

This is the end result of an ideal spam-proof system (or a working legal
solution). Among current solutions, the closest match is perhaps
dropping messages at the MTA through filtering or host blacklist
matching.

When the response is inaccurate, we get unwanted side effects. In a
biological sense, this might be when the immune system makes a
recognition error and attacks its own cells, or when a treatment affects
more than the targeted cells.

> B.2. The host gets passive defense against parasites and does not suffer
>  anymore, while the parasites stay alive (scenario turns to 'A').

Perhaps closest to a widespread application of effective filtering
solutions at the ISP level where spams are marked but not
destroyed by the ISP or always filtered out by the end users. The
burden caused by the parasite is still there, but masked.
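
As a rough illustration of the difference between B.1 and B.2, here is a
minimal Python sketch of an MTA-side check that either drops a message
from a blacklisted host (active defense) or marks it and passes it along
(passive defense). The blacklist range, the `handle` function, the policy
names, and the `X-Spam-Flag` header are all assumptions picked for the
example, not a description of any particular MTA.

```python
from email.message import EmailMessage
from ipaddress import ip_address, ip_network

# Hypothetical blacklist. 192.0.2.0/24 is a documentation-only range.
BLACKLIST = [ip_network("192.0.2.0/24")]


def is_blacklisted(sender_ip: str) -> bool:
    """Return True if the sending host falls inside a blacklisted network."""
    addr = ip_address(sender_ip)
    return any(addr in net for net in BLACKLIST)


def handle(msg: EmailMessage, sender_ip: str, policy: str):
    """Apply one of two policies to a message from sender_ip.

    policy == "drop": B.1, the message is discarded (None is returned).
    any other value:  B.2, the message is tagged and still delivered.
    """
    if not is_blacklisted(sender_ip):
        return msg  # not recognized as 'bad', deliver untouched
    if policy == "drop":
        return None  # parasite eliminated
    msg["X-Spam-Flag"] = "YES"  # parasite survives, but is labeled
    return msg
```

In the "mark" case the message still costs bandwidth and storage all the
way to the mailbox, which is the masked burden described above.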

> B.3. The host continues to suffer but stays alive, feeding parasites
>  (chronic disease).

Same as B.?

> B.4. The host eventually dies, bringing parasites to death as well
>  (terminal disease).

Our worst case scenario.

End user: Where they find the signal-to-noise ratio too low, rendering
e-mail unusable.

Broader sense: When the amount of spam far outweighs that of legitimate
e-mail, such that ISPs or end users en masse abandon e-mail altogether
for something else.

One may lead to the other. For example, consider Usenet. Relatively few
people use it, but the cost of running the service is very high in terms
of bandwidth and storage. As a result, some ISPs no longer offer
access, or they use an outside company to supply the feed. If few people
used e-mail anymore due to spam, would ISPs still offer it? (I realize
the two systems are vastly different in requirements, so this probably
wouldn't happen due to spam anyway.)

> Anyone with a degree in bioscience here? Or willing to bring in someone
> knowledgeable? Any previous works in this area?

I do not think this view has been taken before with spam (in terms of a
research paper at least).  Though it's a very interesting idea.

The general idea of bringing concepts from biological systems (in
particular, immune systems) into computing has been explored before.

In systems security, Stephanie Forrest has done a fair amount of work on
this (http://www.cs.unm.edu/~forrest/). Her paper "A sense of self for
Unix processes" (under the 'Computer security...' publication section)
is a good starting point.

An expansion on your general view of the relationship between biology,
us, and spammers might be:

Legitimate e-mail: Our 'good' cells
Spam: Our 'bad' cells
Host:
   End users, ISPs, MTAs, Internet itself (?).
Immune system:
   Anti-spam software (new protocols too), aggressive laws/lawyers.
Energy source: Money direct or indirect (web site hits)
Parasites:
   a) Spammers, "marketing" organizations, etc.
   b) Symbiotes:
      i) Bimodal: open relays ('good' as a mail server for the
          authorized users, bad for others).
     ii) Spammers: they need us for energy
Viruses:
   Compromised systems sending out spam by proxy, or compromising
   other systems.

[Below is some expanded biology stuff, feel free to correct me as it
 has been a few years since I've studied the immune system.]

In biology terms, the immune system can tell 'bad' from 'good' by
recognizing specific protein sequences (binding sites). Generally the
cells can recognize multiple sequences. Hence the use of vaccines to
train the immune system. Mutations occur during replication via various
'errors' in the reproductive process. Mutant cells are similar, but the
binding sites may differ enough that the immune system cannot recognize
the 'bad' cell.

A spam filtering parallel might be hashes or 'loose' hashes of spams
that can recognize a family of messages. Our mutations are the
randomized messages that try to break hashes or skew word frequency
statistics.
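
To make the 'loose' hash idea a bit more concrete, here is a small
Python sketch. It is only one possible reading of it: an exact hash
(SHA-256 here) changes completely when a spammer appends random junk,
while a comparison based on overlapping word triples ("shingles") still
sees the mutant as part of the same family. The function names, the
shingle size, and the 0.5 threshold are assumptions made for the
example, not taken from any existing filter.

```python
import hashlib
import re


def exact_hash(text: str) -> str:
    """Exact fingerprint: any change to the text yields a different digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def shingles(text: str, k: int = 3) -> set:
    """Set of k-word sequences, using lowercase alphabetic words only."""
    words = re.findall(r"[a-z]+", text.lower())
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two shingle sets (0.0 to 1.0)."""
    sa, sb = shingles(a), shingles(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def same_family(a: str, b: str, threshold: float = 0.5) -> bool:
    """'Loose' match: enough shared shingles counts as the same family."""
    return similarity(a, b) >= threshold


known = "Buy cheap pills now at our online store today"
mutant = known + " xqzv 83721"  # random suffix added to break exact hashes
legit = "Meeting moved to Thursday, agenda attached as usual"

print(exact_hash(known) == exact_hash(mutant))  # exact match is defeated
print(same_family(known, mutant))               # loose match still fires
print(same_family(known, legit))                # unrelated mail is left alone
```

A heavier mutation (for example, rewording every sentence) would push the
similarity below the threshold, which is the filter-side equivalent of a
binding site changing enough to escape recognition.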

A lower level view that might also be useful is to draw parallels
between spam and single stranded DNA (ssDNA).  Consider the sequence
of 'good' cells versus 'bad' cells. Broadly there are coding and
non-coding regions in ssDNA. What parts of e-mail could be described as
such? Other relevant areas: transcription (mRNA), translation (tRNA),
etc.

This is getting really long so I'll stop now =)

-Greg

_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg


