RE: rule to catch a certain number of characters

2007-05-26 15:55:07
wolfgang wrote on Saturday, May 26, 2007 10:48 PM:


Dallman,

thanks for taking that challenge ...

 I sent you a sample of those mails off list, for reference ...

Yup, I got it.

[Dallman wrote:]

I have an idea.  Can you imagine a legitimate message with
even a 1,000-char title string without whitespace?  I
can't.  So why not trash at that level instead of looking for
100,000?

Agreed (actually, I can't think of legitimate <title>1,000
chars</title> messages at all, so maybe we don't even have to
stick to non-space characters there?).

Okay, and that will save some env space.

  SPACE = ' '
  TAB   = '	'   # one literal tab character between the quotes
  WS    = $SPACE$TAB

Why $SPACE$TAB? Doesn't that mean a space followed by a tab?

Because we want it for "[^$WS]", which is the char-class of
"not white space".
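
To see that the order of the two characters inside the brackets makes no difference, here is a small sketch in Python rather than procmail. Python's `re` module is only a stand-in for procmail's own regex engine, and the names `not_ws` and `not_ws_swapped` are made up for this illustration.

```python
import re

SPACE = " "
TAB = "\t"
WS = SPACE + TAB          # mirrors WS = $SPACE$TAB: a space followed by a tab

# Inside [...] the two characters are members of a set, not a sequence.
not_ws = re.compile("[^" + WS + "]")                  # [^<space><tab>]
not_ws_swapped = re.compile("[^" + TAB + SPACE + "]")  # [^<tab><space>]

for ch in ["a", "<", " ", "\t"]:
    a = bool(not_ws.fullmatch(ch))
    b = bool(not_ws_swapped.fullmatch(ch))
    print(repr(ch), a, b)   # both columns agree for every character
```

So "space followed by tab" is only how the variable is spelled; once it sits inside a bracket expression it means "either of these".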

  :0:
  * ^Content-Type:.*/html
  *   B ?? > 100000
  * $ B ?? $xWS1152.*$*.*<\title>
  spampile

Is <\title> a typo for </title>?

Yes, sorry.

I tested this on your sample, and it worked after I
raised LINEBUF to 8K:

=========================================================
 11:52pm [~/Mail] 700[0]> cat 100k.rc       
 

  LINEBUF =  8192
  SPACE = ' '
  TAB   = '	'   # one literal tab character between the quotes
  WS    = $SPACE$TAB

  xWS8    = [^$WS][^$WS][^$WS][^$WS][^$WS][^$WS][^$WS][^$WS]
  xWS64   = $xWS8$xWS8$xWS8$xWS8$xWS8$xWS8$xWS8$xWS8 xWS8 #unset   8
  xWS384  = $xWS64$xWS64$xWS64$xWS64$xWS64$xWS64    xWS64 #unset  64

  :0:
  * ^Content-Type:.*/html
  * > 100000
  * $ B ?? ()<title>^?$xWS384$xWS384$xWS384
  spampile
=========================================================
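
A rough count of how long that condition becomes after `$` expansion suggests why the LINEBUF bump was needed. The sketch below is Python used purely as a calculator. It assumes LINEBUF has to hold the fully expanded condition line, which is my reading of the behaviour seen here rather than something verified against the procmail source.

```python
SPACE, TAB = " ", "\t"
WS = SPACE + TAB

atom = "[^" + WS + "]"     # one "not whitespace" class: 5 characters
xWS8 = atom * 8            # same doubling scheme as in 100k.rc
xWS64 = xWS8 * 8
xWS384 = xWS64 * 6

# The condition as procmail would see it after expanding the variables.
condition = "()<title>^?" + xWS384 * 3

print(len(atom))        # 5
print(len(xWS384))      # 1920
print(len(condition))   # 5771
```

At 5771 characters the expanded condition is well past 2048 but comfortably under 8192. It requires 3 x 384 = 1152 non-whitespace characters, and since procmail's regexes have no `{n}` repetition operator, the count has to be spelled out by concatenating variables like this.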

If we just use any char instead of non-whitespace, it's more
compact.  In fact, the recipe below works fine and doesn't even
need a bump above the default 2K for LINEBUF.

=========================================================

  c64 = '................................\
         ................................'

  :0:
  * ^Content-Type:.*/html
  *   B ?? > 100000
  * $ B ?? ()<title>^?\
             $c64$c64$c64$c64$c64$c64$c64$c64\
             $c64$c64$c64$c64$c64$c64$c64$c64
  spampile

=========================================================
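
For anyone who wants to play with the idea outside procmail, here is an approximate Python translation of the condition above. It is a sketch under assumptions: `\n?` stands in for procmail's `^?`, `re.IGNORECASE` mimics procmail's default case-insensitive matching, and the function name `looks_like_title_spam` is invented for the example.

```python
import re

c64 = "." * 64                 # same 64 dots as the c64 variable
dots = c64 * 16                # 16 x 64 = 1024 "any character" atoms

# "." does not match a newline here, so the 1024 characters must all
# sit on one line right after <title> (or after one optional newline).
title_run = re.compile("<title>\n?" + dots, re.IGNORECASE)

def looks_like_title_spam(body):
    """Return True if <title> is followed by 1024 characters on one line."""
    return title_run.search(body) is not None

long_body = "<html><head><title>" + "x" * 1024 + "</title></head>"
short_body = "<html><head><title>Hello</title></head>\n<body>hi</body>"

print(len(dots))                          # 1024
print(looks_like_title_spam(long_body))   # True
print(looks_like_title_spam(short_body))  # False
```

Note that the pattern does not look for `</title>`, so it really tests for 1024 characters on the line following `<title>`, not the title length as such. A message whose HTML is all on one long line could trip it even with a short title.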

I suppose it should be renamed "1k.rc" instead of "100k.rc",
though. :-)

Dallman


____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail