Dallman Ross wrote:
I am inclined to believe that some aspect of your
server environment acts differently when you are logged in from
how it acts when you are not. Maybe in one case procmail runs
under your uid and shell, but otherwise runs under suid root and
root's (postulated to be different) shell? Do you have a shell
definition line in your .procmailrc? I recommend "SHELL = /bin/sh".
Yes, I use the following, which has the same effect (at least on Solaris):
SHELL=/usr/bin/sh
In any case, now that I have seen some sample traffic reports and
received two directly myself upon having subscribed, I found that a
recursive SWITCHRC can work for this and give you abbreviated reports
that show you just what you want and not the cruft you don't want.
Sometimes, I like to look at the other cruft to remind myself why I pay
more to live close to work :-)
The HTML highlighting helps me do a rapid scan.
First, though, some precursor stuff. You had in your recipe,
LOCATIONS="(dumbarton|(east )?palo alto|stanford|menlo park|\\
redwood city|mountain view)"
As I mentioned in a previous reply, you could have a problem with
the word breaks. I see in the actual reports that once in a while
the spacing between words is inconsistent, which only corroborates
my earlier concern. I believe it would be easy to have, e.g.,
"PALO  ALTO" (with two spaces between) show up instead of what
you were expecting, and you'd miss it. Also, a line end could
happen in the middle of the phrase. I recommended using only
one word (and had said that you don't, in any case, need the EAST
for EAST PALO ALTO, since you are accepting the second two words
anyway). If you don't want potential false hits with "REDWOOD"
or "MOUNTAIN", however, then here's another way. A tab and a
space are found inside of each of the two pairs of square brackets:
WRDBRK = ($[ ]*|[ ]+)
X = $WRDBRK
LOCATIONS = "(Dumbarton|(East${X})?Palo${X}Alto|Stanford|Menlo${X}Park|\\
Redwood${X}City|Mountain${X}View)"
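
For anyone who wants to try the word-break idea outside procmail, here is a
minimal Python sketch of the same pattern. It is only an approximation:
Python's \b stands in for procmail's \< and \>, the helper name
mentions_location is made up for the illustration, and the sample strings
are invented, not taken from real reports.

```python
import re

# Word separator: a line break followed by optional blanks, or one or
# more blanks. The blanks are a space and a tab, as in WRDBRK.
WRDBRK = r"(?:\n[ \t]*|[ \t]+)"

LOCATIONS = re.compile(
    r"\b(?:Dumbarton"
    r"|(?:East" + WRDBRK + r")?Palo" + WRDBRK + r"Alto"
    r"|Stanford"
    r"|Menlo" + WRDBRK + r"Park"
    r"|Redwood" + WRDBRK + r"City"
    r"|Mountain" + WRDBRK + r"View)\b",
    re.IGNORECASE,  # procmail matches case-insensitively unless D is set
)


def mentions_location(text: str) -> bool:
    """Return True if text names one of the locations of interest."""
    return LOCATIONS.search(text) is not None


if __name__ == "__main__":
    for sample in ("ACCIDENT IN PALO  ALTO",       # two spaces
                   "STALL NEAR EAST PALO\n  ALTO",  # phrase split by a line end
                   "REDWOOD HIGHWAY, SAN RAFAEL"):  # should not hit
        print(repr(sample), mentions_location(sample))
```

Requiring both words keeps "REDWOOD HIGHWAY" from matching, while the
separator tolerates extra blanks and a line end inside the phrase.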
On Friday, I took some of your advice and arrived at two recipes that
perform the filtering correctly (more on that in a bit). Unfortunately,
I didn't get an answer to my original question.
For the locations variable, I now use this:
LOCATIONS="(palo( |^)alto|stanford|menlo( |^)park|\\
redwood( |^)city|mountain( |^)view|dumbarton)"
I don't remember seeing two consecutive spaces in these reports as in
"PALO  ALTO", but I'll use your idea above when I get around to it. This
change can't hurt, but it doesn't explain the inconsistency in behavior
(for example, one road work incident was DUMBARTON, which is unaffected
by this change, and yet, it was not matched in production mode).
I think two words are necessary to avoid false hits on, say, "Redwood
Highway, San Rafael".
This is the first recipe that works using scoring (at least it works in
procmail 3.15.2):
:0 B
* 1^0
* 1^1 $ (\<)road work(^.*($NSPC).*)?(^.*($NSPC).*)?(^.*($NSPC).*)?\
.*(\<)$LOCATIONS\>
* -1^1 $ (\<)$LOCATIONS\>
/dev/null
where NSPC = "[^ ]" because I don't want an empty line between the
road work line and a line with a location of interest.
The idea of the scoring recipe is that the score, 1 + (number of road
work events in locations) - (all events in locations), is positive if
and only if the number of non-road-work events in those locations is
zero. In that case the action is executed, because I don't want to see
the report.
For reports like the one in the original posting with two road work
events in Menlo Park and one road-work event on Dumbarton and no
non-road-work events in locations of interest, this filter works with
procmail 3.15.2. However, it doesn't work on procmail 3.22 because in
the last condition, two occurrences of Dumbarton are counted even though
the report has only one occurrence. This is yet more weird behavior,
albeit in a different version of procmail.
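
The arithmetic can be checked with a small Python sketch. It takes the two
counts as plain numbers and does not model how procmail counts regex
matches; the function names are made up, and the counts assume each event
names its location exactly once.

```python
def roadwork_only_score(roadwork_in_locations: int, all_in_locations: int) -> int:
    """score = 1 + (road work events in locations) - (all events in locations)."""
    return 1 + roadwork_in_locations - all_in_locations


def should_discard(roadwork_in_locations: int, all_in_locations: int) -> bool:
    # A weighted recipe only succeeds when the final score is positive.
    return roadwork_only_score(roadwork_in_locations, all_in_locations) > 0


if __name__ == "__main__":
    # Report from the original posting: three road work events, nothing else.
    print(should_discard(3, 3))  # score 1, report is discarded
    # Same report, but one location counted twice in the last condition.
    print(should_discard(3, 4))  # score 0, report is kept
    # One road work event plus one other event in the locations.
    print(should_discard(1, 2))  # score 0, report is kept
```

A single extra count in the last condition is enough to pull the score from
1 down to 0, which is consistent with the recipe failing on 3.22.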
The second recipe that works doesn't use scoring:
:0 B
* $ (\<)((problem|accident|slowdown|stall)(s)?|advisor(y|ies))\
(^.*($NSPC).*)?(^.*($NSPC).*)?(^.*($NSPC).*)?.*(\<)$LOCATIONS\>
{
KEEP=1
}
:0 E
/dev/null
This approach cheats in that it tries to list every event type other
than road work (i.e. the events I want to see, as opposed to the ones I
don't). What I don't like about this recipe is that if some new
classification appears in the traffic reports (e.g. "disaster" or
"flood"), the recipe would delete the report even though I would want
to see it.
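
The difference between the two approaches shows up in a short Python
sketch. It looks only at the event word and ignores the location window
that the real recipes check; the function names and sample strings are
invented for the illustration.

```python
import re

# Event words the second recipe lists explicitly.
KEEP_WORDS = re.compile(
    r"\b(?:problems?|accidents?|slowdowns?|stalls?|advisor(?:y|ies))\b",
    re.IGNORECASE,
)
ROAD_WORK = re.compile(r"\broad\s+work\b", re.IGNORECASE)


def keep_by_listing_wanted(event: str) -> bool:
    """Keep only events whose type is on the explicit list."""
    return KEEP_WORDS.search(event) is not None


def keep_by_excluding_road_work(event: str) -> bool:
    """Keep everything except road work."""
    return ROAD_WORK.search(event) is None


if __name__ == "__main__":
    for event in ("ACCIDENT", "ROAD WORK", "FLOOD"):
        print(event, keep_by_listing_wanted(event),
              keep_by_excluding_road_work(event))
```

Both functions agree on "ACCIDENT" and "ROAD WORK"; only the unlisted
"FLOOD" separates them, and the explicit list is the one that drops it.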
All right, I used the above in my test harness, and it worked fine.
Here is the main recipe I put below that (goes in .procmailrc):
#-------------------------------------------------------------
:0
* ^From: KPIX\.Traffic\.Router@kpix\.com
* ^Precedence: bulk
* $ B ?? ^\/\[ ()[0-9]:.*$(.+$)*(.*\<)?$LOCATIONS\>.*$(.+$)*.*
{ SWITCHRC = traffic }
#-------------------------------------------------------------
(I added the "Precedence:" check because you are /dev/nulling the
reports that don't have a city of interest in them, and I imagine that
the list administrator might write you some time with an announcement
that you'd otherwise miss. In my confirmation mail from the list
for signing up, for example, there was no Precedence: header.)
Okay, I'll add the Precedence test.
Now I made a separate rc-file called "traffic". That gets
run recursively. It's important to have a breaking occurrence
in a recursive rc; otherwise, it will iterate until your server
goes kablooey, or something. :-) I tested this one on two of
the actual 8-a.m. traffic reports from KPIX:
#-------------------------------------------------------------
:0 Dich:
* ! MATCH ?? ^^(.* )?ROAD +WORK$
| echo "$MATCH" >> somefile
:0
* $ B ?? ^$\MATCH(.*$)*\/\[ ()[0-9]:.*$(.+$)*(.*\<)?$LOCATIONS\>.*$(.+$)*.*
{ SWITCHRC = $_ }
#-------------------------------------------------------------
(Heh. Note that there's no scoring. Not that I have anything against
scoring, but . . . I didn't need it.) That long condition might wrap
before it gets to the list, so I'll put a version here with a line
break:
* $ B ?? ^$\MATCH(.*$)*\/\[ ()[0-9]:.*$(.+$)*(.*\<)?\
$LOCATIONS\>.*$(.+$)*.*
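
The control flow of the recursive rc can be sketched in Python as a loop:
each pass looks for the next entry after the previous one, and the loop
stops as soon as nothing more is found. This is an analogy, not a
translation of the recipes. The report layout (entries separated by blank
lines, each starting with a bracketed time) and the sample text are
assumptions, and entries_of_interest is a made-up name.

```python
import re

LOCATIONS = re.compile(
    r"\b(?:dumbarton|palo\s+alto|stanford|menlo\s+park"
    r"|redwood\s+city|mountain\s+view)\b",
    re.IGNORECASE,
)
# One entry = a run of consecutive non-empty lines (assumed layout).
ENTRY = re.compile(r"[^\n]+(?:\n[^\n]+)*")
# Case-sensitive, like the D flag on the first recipe.
ROAD_WORK_LINE = re.compile(r"ROAD +WORK$")


def entries_of_interest(body: str) -> list[str]:
    kept = []
    pos = 0
    while True:
        match = ENTRY.search(body, pos)
        if match is None:
            break  # the breaking occurrence: no further entry, stop
        pos = match.end()  # next pass starts after this entry
        entry = match.group(0)
        first_line = entry.split("\n", 1)[0]
        if LOCATIONS.search(entry) and not ROAD_WORK_LINE.search(first_line):
            kept.append(entry)
    return kept


if __name__ == "__main__":
    sample = (
        "[ 8:01 AM ] ROAD WORK\nMENLO PARK lane closed\n\n"
        "[ 8:05 AM ] ACCIDENT\nPALO ALTO right lane blocked\n\n"
        "[ 8:10 AM ] STALL\nSAN RAFAEL shoulder\n"
    )
    for entry in entries_of_interest(sample):
        print(entry)
```

Because pos always moves past the entry just handled, the search eventually
returns nothing and the loop ends, which is the job the failed second
condition does in the rc file.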
I'll keep this one in mind for when I give up on scoring. Thanks for
your help.
Since I don't care to dive into the internals of procmail to find the
answer to my original question, I'll put it on the back burner for now.
Kevin
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail