
Re: crud in subject line - spam trap

2002-11-03 05:56:50
Udi Mottelo <uuddii(_at_)eng(_dot_)tau(_dot_)ac(_dot_)il> writes:

(Sample subject lines)

  Subject: [1$0m]A&2H8 0#:4;g @Z0]Au 0x0m...>H3;9...
  Subject: (1$0m)<x0#@G <1EC@L @N;}@; AB?lGQ4Y!!!
  Subject: ':N:N0! GT22 :80m Aq1b4B 193; CVCJ <:@N;g@LF.'
  Subject: [1$0m]<v@T@/>F?kG0, @O:;<v@T18A& AA@:>F0!?J 180fGO<<?d!
  Subject: Hi Professor, Ultra-Thin Si Inventory 30um & 50um thin 2"-6" in stock...
  Subject: VP9zWn4s5DMxBgSNO7Mf<RLl5X
  Subject: Dates For All 1681HdVP8-2-10

Can someone point me to URLs or other resources that discuss using
procmail to find the percentage of unreasonable characters in a
subject line?

      Did you try looking at the Content-Type: field and/or charset= ?

      Do you mean that you get multiple Subject: fields?  Or are you
      just showing us a collection of examples?

      Maybe:

:0 H
* ^Subject:.*^Subject:
/say/it/is/a/spam


I posted a variety of the types of subject lines in which I see a high
number of unreasonable characters.  Unreasonable, at least, in the
sense that they are not likely to appear in a legitimate message to me.

Even numbers are unusual in personal mail subject lines, at least
above a certain percentage of the total chars in the subject line.

What I was looking for is a method to analyze the subject line to
determine if there are more than a certain threshold of unusual
characters.

The first example above shows something like 39 chars.  Of those,
some 13-14 seem somewhat unusual for a subject line:

      [$]&#;@]...>;

That represents something like 35-40 percent of the total.
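
To make that estimate concrete, here is a minimal sketch in Python that
counts the characters mechanically.  It assumes "unusual" simply means
ASCII punctuation (letters, digits and spaces count as usual); that
definition and the name unusual_ratio are mine, not anything procmail
provides, and an exact count will differ a little from the eyeballed
figures above depending on which characters are treated as unusual.

```python
import string

# Assumption: "unusual" = ASCII punctuation. Letters, digits and
# spaces are treated as ordinary. Adjust this set to taste.
UNUSUAL = set(string.punctuation)


def unusual_ratio(subject):
    """Return (unusual_count, total_count, ratio) for a subject string."""
    total = len(subject)
    if total == 0:
        return 0, 0, 0.0
    unusual = sum(1 for ch in subject if ch in UNUSUAL)
    return unusual, total, unusual / total


# First sample subject from the list above.
sample = "[1$0m]A&2H8 0#:4;g @Z0]Au 0x0m...>H3;9..."
count, total, ratio = unusual_ratio(sample)
print(f"{count} of {total} characters unusual ({ratio:.0%})")
```

Digits could be added to UNUSUAL as well, since numbers are rare in
personal subject lines, but they probably deserve a higher threshold
than punctuation does.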

I'm interested in a script, called by a recipe, that makes such an
analysis.
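
As a rough idea of what such a script might look like, here is a sketch
in Python.  It assumes the recipe hands the message header to the
script on standard input and acts on the exit code.  The 30 percent
threshold, the punctuation-only definition of "unusual", and the names
subject_score and exit_status are all illustrative choices of mine.

```python
import io
import string
from email.parser import HeaderParser

THRESHOLD = 0.30                    # assumed cutoff, tune to taste
UNUSUAL = set(string.punctuation)   # assumed definition of "unusual"

# Hypothetical recipe calling this script (the script path and the
# folder name are placeholders):
#
#   :0
#   * ? /path/to/subject_check.py
#   suspect-subjects


def subject_score(header_text):
    """Fraction of non-space Subject characters that count as unusual."""
    msg = HeaderParser().parsestr(header_text)
    subject = msg.get("Subject", "")   # first Subject: field, if any
    chars = [ch for ch in subject if not ch.isspace()]
    if not chars:
        return 0.0
    return sum(ch in UNUSUAL for ch in chars) / len(chars)


def exit_status(stream):
    """Return 0 (shell 'true') for a suspicious subject, 1 otherwise."""
    return 0 if subject_score(stream.read()) >= THRESHOLD else 1


# Demo on an in-memory header. For real use, replace these lines with:
#   import sys; sys.exit(exit_status(sys.stdin))
demo = io.StringIO("Subject: [1$0m]A&2H8 0#:4;g @Z0]Au 0x0m...>H3;9...\n\n")
print(exit_status(demo))
```

Note that a punctuation-only score will not catch a subject such as
"VP9zWn4s5DMxBgSNO7Mf<RLl5X", which is almost entirely letters and
digits; that kind needs some other test, for example the share of
digits or of mixed-case runs.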

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail