Re: A tool for refining regex

At 19:58 2002-01-30 -0800, Harry Putnam wrote:

> Some false hits are most easily avoided by greenlisting some lists or
> posters.

Greenlisting?

PC version of "whitelisting". Opposite function of an RBL: it's people you trust, or lists which have closed subscriptions, so aren't sources of spew.

Full logging is no problem here.  It's a single-user setup and I only
get some 300 messages daily.  Thanks for the tip about pulling it in
from INCLUDERC though.

Perhaps then you might archive off the logs? Nightly incremental gzipping or somesuch...

I actually have a working system something like that but more
primitive.  I have a test area setup and skeleton .procmailrc that
sets test area maildir, orgmail, default, test area logging and other
defaults.

I call it a sandbox. I redefine some things such as $SENDMAIL too, so as to take the bite out of "!" and message creation functions.
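
A minimal sketch of such a sandbox rcfile, for illustration only. Every path, the included rcfile name, and the stand-in script are hypothetical; the variables themselves (MAILDIR, DEFAULT, ORGMAIL, LOGFILE, VERBOSE, LOGABSTRACT, SENDMAIL, INCLUDERC) are standard procmail ones.

```procmail
# sandbox.rc: hypothetical test rcfile. All paths below are assumptions.
# The sandbox directory must already exist.
MAILDIR=$HOME/sandbox
DEFAULT=$MAILDIR/default
ORGMAIL=$MAILDIR/orgmail
LOGFILE=$MAILDIR/sandbox.log
VERBOSE=yes
LOGABSTRACT=all

# "!" forwarding actions hand the message to $SENDMAIL, so point it at a
# harmless stand-in (hypothetical script that just swallows its input).
SENDMAIL=$HOME/sandbox/fake-sendmail.sh

# Pull in the recipes being tested (hypothetical file name).
INCLUDERC=$HOME/sandbox/recipes-under-test.rc
```

A saved message can then be fed through by hand, for example with `procmail -m $HOME/sandbox/sandbox.rc < test.msg`, and the results read from the sandbox log.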

That last part is where the rub is.  I want to let procmail do most of
it by showing what was hit...exactly.  I will then be able to set the
regex accordingly or insert a new recipe as needed.


Have you actually TRIED what I suggested yet?

I've never seen a problem from not using a LOCKING file here or really
in 90 percent of my recipes.. out of some 30, or so I have one
locking. Most have been in use for at least months.

Just wait 'till you have two messages arrive at about the same time. Then you'll have some grief. LOCK - to you, it's just an extra colon on the flags line...

I'm not saying it's smart or right, only that I haven't seen a problem I
recognized to be caused by not locking.  What would such a problem
look like?


Corrupted mailbox.  No fun.
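
For reference, the only visible difference between an unlocked and a locked delivery recipe is that second colon. A sketch, with a hypothetical condition and folder name:

```procmail
# No lockfile: two procmail processes delivering at the same moment can
# interleave their writes into the same mbox file.
:0
* ^Subject:.*test
testbox

# Trailing colon: procmail uses a local lockfile (by default the folder
# name plus $LOCKEXT, i.e. testbox.lock) while it writes.
:0:
* ^Subject:.*test
testbox
```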

Concerning the host escaping:  I haven't seen a false hit I tracked to
being caused by that... probably sloppy alright but it seemed much more
important in the host numbers part.

Get in the habit of escaping dots where you expect them to be dots. Eventually, it'll bite you in the ass if you don't escape a dot on a broad regexp.
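
A small illustration of the difference, using a made-up hostname and made-up folder names:

```procmail
# Unescaped: each "." matches ANY character, so a header containing
# "mail-example-com" or "mailXexample.com" would also hit.
:0:
* ^Received:.*mail.example.com
loose-hits

# Escaped: only literal dots match.
:0:
* ^Received:.*mail\.example\.com
exact-hits
```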

> With the scoring, I'm busting your combined expression out into
> separate ones - that allows you to see the individual lines which

I would have thought that would cause a whole different action since
then both must match.

Not with scoring. Unscored tests MUST evaluate true (which is why I say to move them to the top - if they fail, the scoring won't occur), but scored ones need only _total_ a positive value. If you don't use scoring for its more advanced purposes, it at least allows you to perform OR expressions easily.
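
A sketch of that OR idiom. The addresses, the Subject test, and the folder name are invented for illustration and are not the expressions from earlier in this thread:

```procmail
# The unscored condition must match on its own, so it goes first.
# Each scored condition contributes 1 if it matches (^0 means: stop
# looking after the first hit). The recipe fires when the total is
# positive, i.e. when at least one of the scored lines matched.
:0:
* ^Subject:.*regex
* 1^0 ^From:.*alice@example\.com
* 1^0 ^From:.*bob@example\.com
regex-threads
```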

But apparently the odd looking notation `1^1' means something I have yet to learn about.


'man procmailsc'

1 is the base value for a match, ^1 is an "exponent" that says for each ADDITIONAL match, multiply the base match by this. If you just wanted ANY match, you'd use ^0, but if we use a nonzero for the exponent, procmail will continue looking for matches - and as a result of the \/ match in the regexp, we'll get the matching line emitted to the verbose log for each header that matches...
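
To make the shape concrete, here is a sketch of a recipe along those lines. The header, hostname, and folder name are placeholders, not the expression suggested earlier in the thread:

```procmail
VERBOSE=yes

# Hypothetical pattern; substitute the regexp being refined.
# 1^1: every matching header line adds 1 to the score, so procmail keeps
# scanning rather than stopping at the first hit.
# \/ marks where the text captured into $MATCH begins.
:0:
* 1^1 ^\/Received:.*\.example\.com.*
sandbox-hits
```

With a weight of 1 and an exponent of 1, the n-th match contributes 1 x 1^(n-1) = 1, so the final score is simply the number of matching lines.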

If I had the line that matched, I think it would be fairly easy to
tell what did it even with all those or things.

I'd suggest you TRY what I recommend, within a sandbox, and see first-hand how it operates.

This is all starting to sound very complicated... Not to be a
ne'r-do-well slacker but I had the idea this could be done in a much
more lazy way by letting procmail show the way.

Hello, what about the simple expression I showed you seems complicated? JUST TRY IT ALREADY.

[snip - try the suggestion that was offered, since it WORKS and doesn't require re-engineering of the universe]


---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail