procmail

Re: Regexp fails in scoring recipe

2003-05-16 12:55:39


Dallman Ross wrote:

Kevin Wu [mailto:tessar(_at_)bigfoot(_dot_)com] wrote:

Dallman Ross wrote:
NOT_AB = "(.?|(.*)${NWB}..|((.*)${WB})?([^a].|a[^b]))"

In any event, my thought is that $NOT_AB should stay a clean definition, and the regex can be built around it to accommodate any number (zero or more) of ${NWB} chars.


That would be great if it can be done.


Don't look now, but I think I may have solved it in a way
that leaves me satisfied.

It occurred to me while trying to sleep (which is often when I get
good ideas, though the computer has been turned off by then) :-p
that, since we are focusing on a rightward anchor, the line
end, we should develop our regex NOT_AB from the right, not
the left.  Woo-hoo, but that seemed like the key!  And I think
it is.

In the recipe below, $WS is a space and a tab.  $NL is a newline.
I used a header test instead of body, and created a header called
X-AB-Check: for the testing.

--------------------------------------------------
NOT_AB = "(.|[^$WS]*([^b]|[^a]b|[^$WS]ab))"

:0
* $ ^X-AB-Check:((.*\<)?$NOT_AB)?$
{ LOG = "$NL NOT_AB $NL" }

:0 E
{ LOG = "$NL AB $NL" }
--------------------------------------------------
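For anyone who wants to poke at the pattern outside procmail, here is a rough Python sketch of the same test. It is only an approximation: procmail's `\<` is stood in for by a single non-word character, the line is passed without its trailing newline (so `\Z` replaces the recipe's final `$`), and case-insensitive matching is switched on to mimic procmail's default. The names `classify`, `CHECK` and `WORD_START` and the sample header lines are made up for illustration.

```python
import re

WS = " \t"  # the thread's $WS: a space and a tab

# NOT_AB as in the recipe above, with $WS expanded.
NOT_AB = rf"(?:.|[^{WS}]*(?:[^b]|[^a]b|[^{WS}]ab))"

# Assumption: procmail's \< consumes one non-word character;
# [^A-Za-z0-9_] is used here as a stand-in for it.
WORD_START = r"[^A-Za-z0-9_]"

# Assumption: the line arrives without its newline, so \Z replaces
# the recipe's trailing $ (which matches the newline in procmail).
CHECK = re.compile(
    rf"^X-AB-Check:(?:(?:.*{WORD_START})?{NOT_AB})?\Z",
    re.IGNORECASE,
)


def classify(header_line: str) -> str:
    """Return the label the recipe would log for this header line."""
    return "NOT_AB" if CHECK.search(header_line) else "AB"


if __name__ == "__main__":
    samples = [
        "X-AB-Check: foo ab",
        "X-AB-Check: ab",
        "X-AB-Check: foo cab",
        "X-AB-Check: ab foo",
        "X-AB-Check:",
        "X-AB-Check: b",
    ]
    for line in samples:
        print(f"{line!r:32} -> {classify(line)}")
```

Under these assumptions, only the lines whose last word is exactly "ab" are labelled AB; an empty header value, a single character, or a longer word ending in "ab" all fall through to NOT_AB.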

Eliminating the dots for character positioning is good. One quibble: the NOT_AB regexp still has an implicit assumption about a newline at the right end, but I'm doubtful that the assumption can be completely eliminated.

I'm satisfied the non-scoring recipe works now after stripping the carriage returns. I've gone back to the scoring recipe since it's more elegant and easier to maintain (e.g. what if I want to delete reports with only traffic advisories or road work events?).
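A small Python sketch of why the carriage returns could matter, offered as one plausible reading rather than a diagnosis: a stray "\r" before the line end is just another character that `[^b]` is happy to match, so a line that really ends in "ab" slips through as NOT_AB. The pattern is an approximation of the recipe (a non-word character stands in for procmail's `\<`, and `\Z` for the final `$`), and `is_not_ab` and `strip_cr` are hypothetical helper names.

```python
import re

WS = " \t"  # the thread's $WS: a space and a tab
NOT_AB = rf"(?:.|[^{WS}]*(?:[^b]|[^a]b|[^{WS}]ab))"

# Approximation of the recipe's condition; see the assumptions above.
CHECK = re.compile(
    rf"^X-AB-Check:(?:(?:.*[^A-Za-z0-9_])?{NOT_AB})?\Z",
    re.IGNORECASE,
)


def is_not_ab(header_line: str) -> bool:
    """True if the approximated recipe condition matches the line."""
    return CHECK.search(header_line) is not None


def strip_cr(header_line: str) -> str:
    """Drop carriage returns, as was done before the recipe worked."""
    return header_line.replace("\r", "")


if __name__ == "__main__":
    raw = "X-AB-Check: foo ab\r"  # invented sample with a stray CR
    print("raw line matches NOT_AB:     ", is_not_ab(raw))
    print("stripped line matches NOT_AB:", is_not_ab(strip_cr(raw)))
```

The same thing happens with any trailing character that is not "b", which is the right-end assumption mentioned in the quibble above.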

Kevin



_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail