procmail

Re: Match operator and massive regex

2001-12-04 21:16:35
Martin McCarthy <marty(_at_)ancient-scotland(_dot_)co(_dot_)uk> writes:

I'm not entirely convinced that I understood the problem, so please do
clarify if I missed the point at all.

Looks like you got it right. I had a little trouble explaining it clearly.

Query summary: The regexp below matches in some unforeseen ways. I want
to understand how this occurs, and how the matching actually works
internally, so I can adjust it to get the results I need.

Have you tried turning on VERBOSE logging so that procmail gives you
more details about what it is doing?

Yes, that was the first thing.  Or rather, I had VERBOSE set right from
the start.  But it only shows that no match occurred, even though the
Newsgroups header shows a clear match.
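
For reference, a minimal sketch of the sort of logging setup being
discussed.  VERBOSE, LOGFILE, LOGABSTRACT and LOG are standard procmail
variables; the log path is a hypothetical choice, and the LOG line is an
extra debugging aid that was not part of the original rcfile.

========================================
# Assumed log location; any writable path will do
LOGFILE=$HOME/procmail.log
VERBOSE=yes
LOGABSTRACT=all

# Place after the matching recipe to record what DELIVERY expanded to
LOG="DELIVERY=[$DELIVERY]
"
========================================

An empty pair of brackets in the log would show that DELIVERY was unset
at the point where the file name was built.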

[...]

========================================
## Match Newsgroup names to one of the newsrc list to form DELIVERY file name
  :0fh
* Newsgroups:[     
]*\/(comp\.editors|comp\.emacs|comp\.emacs|comp\.lang\.awk)
{
   DELIVERY=$MATCH
}

## If we have a bona fide `PATH: ' header then write it to DELIVERY
 :0
* ^Newsgroups:
* ^Path: 
 1x.${DELIVERY}.in
========================================

The first recipe sets DELIVERY to the newsgroup name *IF* one of the
given newsgroups is on the "Newsgroups:" line.  Unless you set DELIVERY
somewhere else, ${DELIVERY} will be empty in other cases.

There is the rub.  I didn't have a fall-through group set to catch any that
didn't have a matching Newsgroups header.  However, that has never
occurred, since I only download from the groups that do match.  And in
fact the messages that show up in `1x..in' do have matches, as I
showed in my previous post.
 
(As an aside - did you mean to put "* ^Newsgroups:..." rather than "* Newsgroups:..."?)

Yes.

The second recipe can deliver to 1x.${DELIVERY}.in, even if the previous
recipe did not match.  In which case it will be delivering to 1x..in
because ${DELIVERY} will have no value.
Also, you probably want a lock file when you're delivering to the file.

I see where this would happen sooner or later, and it is something
that needs fixing.  But since the messages that land in `1x..in' do
contain matches, I think something more is at work here.  However,
your suggestion cures the problem.
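
The empty-variable case described above can also be ruled out by giving
DELIVERY a default before the match is attempted.  A sketch only,
assuming a fallback name of misc_news (hypothetical) and showing just
three groups from the list.  The brackets hold a space and a tab.

========================================
# Hypothetical default, so the file name can never become 1x..in
DELIVERY=misc_news

:0
* ^Newsgroups:[ 	]*\/(comp\.editors|comp\.emacs|comp\.lang\.awk)
{ DELIVERY=$MATCH }

# Trailing colon requests a local lockfile for the mailbox
:0 :
* ^Newsgroups:
* ^Path:
1x.${DELIVERY}.in
========================================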

Perhaps you want something like:

:0
* ^Newsgroups:[     ]*\/(...etc...)
* ^Path:
{
   DELIVERY=$MATCH

  :0 :
  1x.${DELIVERY}.in
}

OK, cool.  Now I see how to do it all in one recipe and eliminate
the problem you raised above.  This technique does work, so it
was the missing ingredient.

However, I still don't really understand what was happening above. I
understand your argument, but like I said, those messages have
matching Newsgroups lines.

Furthermore, if it was simply that some came through that didn't match,
then your technique would skip those.  So I added a catchall recipe.

  :0
  1x.misc_news.in

That should catch them, yet nothing shows up there (using the same data).
So something else is happening, but the technique you provided has
cured it anyway.  Could it have had something to do with locking?
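
On the locking point, the catchall as written has no local lockfile.  A
sketch of the whole layout, assuming the one-recipe version above comes
first and again showing only three groups from the list.  Since a
delivering recipe ends processing for that message, the catchall would
only see messages the first recipe did not deliver.

========================================
:0
* ^Newsgroups:[ 	]*\/(comp\.editors|comp\.emacs|comp\.lang\.awk)
* ^Path:
{
  DELIVERY=$MATCH

  # Trailing colon requests a local lockfile for the mailbox
  :0 :
  1x.${DELIVERY}.in
}

# Reached only when the block above did not deliver the message
:0 :
1x.misc_news.in
========================================

If every message in the test data matches one of the listed groups, an
empty 1x.misc_news.in would be the expected outcome of this layout.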

Hope that's a help,

Yeah, a big help.  Now it works.
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
