procmail

RE: recipe question

2002-04-11 14:26:24


I'm new to procmail.  I saw a recipe for putting spam sites
into a single (or multiple) files to avoid spam.  It was
something like this:

:0
* ? (formail -x "From:" -x "Sender:" \
      -x "Reply-To:" -x "Return-Path:" -x "To:" \
      | egrep -is -f $HOME/addresses/spamlist.txt)
$MAILDIR/spam

Now I might have something wrong here because when I look at my
log file I get the following:
[...]
egrep: regular expression too long
[...]
I'm trying to figure out why egrep reports expression too long.
My file seems reasonable.  It's only 32 lines long.

When I tried things like this in the past, I found that some implementations
of 'egrep' choke when fed a long list of strings (typically from a text file,
as you have them). Internally, the implementation builds tables that drive the
pattern-matching algorithm. These tables can grow combinatorially with the
number and complexity of the input strings, so even a short list can exceed
an internal limit. You must also be careful to escape all pattern
metacharacters unless you explicitly want those pattern matches enabled
(your example appeared to escape the patterns in an appropriate manner).

Instead, you might want to try 'fgrep' (or 'grep -F' if you're using GNU grep).
From the GNU grep documentation:

`-F'
`--fixed-strings'
     Interpret pattern as a list of fixed strings, separated by
     newlines, any of which is to be matched.

If you use fgrep, quoting the "magic" characters is unnecessary. Ideally, the
algorithm used by fgrep is also faster and needs less storage than the one
used by grep or egrep.
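
As a rough illustration of the difference, here is a small shell sketch that
can be run outside procmail. The list entries and header values are made up
for the example, and the printf lines only stand in for what the formail
pipeline in your recipe would emit. It assumes a grep that accepts the
-F, -E, -i, -q and -f options.

```shell
#!/bin/sh
# Hypothetical block list: one literal string per line (made-up entries).
list=$(mktemp)
printf '%s\n' 'spammer.example' 'bulk@ads.example' > "$list"

# Stand-ins for the header values that formail -x would print.
headers_spam=' someone@spammer.example'
headers_ok=' friend@spammerXexample'

# Fixed-string match: succeeds if any list line occurs literally in stdin.
is_listed() {
    grep -F -i -q -f "$list"
}

# Regex match: the same list is read as extended regular expressions.
is_listed_regex() {
    grep -E -i -q -f "$list"
}

if printf '%s\n' "$headers_spam" | is_listed; then r1=match; else r1=nomatch; fi
if printf '%s\n' "$headers_ok" | is_listed; then r2=match; else r2=nomatch; fi
if printf '%s\n' "$headers_ok" | is_listed_regex; then r3=match; else r3=nomatch; fi

# r1: listed domain found; r2: "." is literal under -F, so no match;
# r3: under -E the unescaped "." matches any character, so it matches.
echo "$r1 $r2 $r3"

rm -f "$list"
```

In the recipe itself, the corresponding change would be to replace
'egrep -is -f' with 'fgrep -is -f' (or 'grep -F -is -f'), leaving the
formail part alone.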

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
