Re: Which is better?

Jason Marshall asked,

| My question is, if I'm trying to match a lot of items (about 1900 of them,
| and growing), is it "better" to have one recipe for each of the 1900
| items, or one recipe with all items |'ed together? 
| 
| A small example...  Would it be "better" this way: (sorry about the long
| lines)
| 
| :0 h
| * ^(To:|From:|Reply-To:|Comments: Authenticated).*@(validreturn.com|.*\.validreturn.com)
| procmail/spamfile
| 
| :0 h
| * ^(To:|From:|Reply-To:|Comments: Authenticated).*@(audioforum.com|.*\.audioforum.com)
| procmail/spamfile
| 
| ...insert 1900+ more individual recipes here...
| 
| 
| Or would it be better this way:
| 
| :0 h
| * ^(To:|From:|Reply-To:|Comments: Authenticated).*@\
|    (validreturn.com|.*\.validreturn)|\
|    (audioforum.com|.*\.audioforum)|\
|        ...insert 1900+ more lines here..
| procmail/spamfile

It's better to combine as many as you can, within the limits of $LINEBUF. 
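
For what it's worth, $LINEBUF is an ordinary rcfile variable, so a long
condition can be given more room by raising it ahead of the recipe.  A
sketch, assuming the default (2048, if I remember procmailrc(5) correctly)
is too small for the list; the value 16384 is only an example:

  # Assumed value; size it to fit the longest condition once its
  # continuation lines have been joined.
  LINEBUF=16384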

One thing that will cut the length almost in half is to change "@" at the
end of the first line to "@(.*\.)?" -- that way you can reduce all those

  (dom\.ain|.*\.dom\.ain)|

(not that you need the parentheses even now) to simply

  dom\.ain|
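
Put together, the combined recipe might look like the sketch below.  The
two domains from the question stand in for the full list.  The outer
parentheses around the list are my addition, so that the alternation
applies only to the domain part rather than to the whole expression, and
the dots are escaped so that they match literally:

  # Sketch only: two domains stand in for the 1900+ in the real list.
  :0 h
  * ^(To:|From:|Reply-To:|Comments: Authenticated).*@(.*\.)?\
     (validreturn\.com|\
      audioforum\.com)
  procmail/spamfile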

| Would this cause the indentation to be included in the ingredients for this
| recipe?  

No; indentation is ignored unless you stick a backslash or a pair of empty
parentheses in front of it (or parentheses around it).
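
To illustrate with a made-up Subject: test (not one of the original
recipes), the continuation line below should keep the space before "fast",
going by the rule above:

  # Hypothetical condition.  The backslash in front of the space keeps it;
  # writing the continuation line as "() fast" or "( )fast" should do the
  # same.
  :0
  * ^Subject: make money\
     \ fast
  procmail/spamfile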

| My kudos must go out to Stephen, who's written one kick-butt piece of
| software!

Absolutely.

If the recipe length does get unwieldy, you can break it up (putting the
most frequently spamming domains in the earlier recipes, for efficiency).
Keeping the list of domains in a separate file would mean running two outside
processes on each piece of suspected spam: formail to gather all those header
lines into one text to search, and fgrep to check it against the file of
blacklisted domains.
[Well, the formail call could be avoided with multiple extraction recipes.]
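
For comparison, the separate-file approach might be sketched like this.
The file name is made up (one domain per line), the formail and fgrep
options are from memory and worth checking against your local man pages,
and since fgrep does plain substring matching this is looser than the
anchored regexp above:

  # Hypothetical list file: one blacklisted domain per line.
  SPAMDOMAINS=$HOME/procmail/spamdomains

  # formail extracts the header fields of interest; fgrep exits 0 (so the
  # condition succeeds) if any listed domain occurs in them.
  :0 h
  * ? formail -x To: -x From: -x Reply-To: -x Comments: | fgrep -i -q -f $SPAMDOMAINS
  procmail/spamfile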