procmail

Re: Match operator and massive regex

2001-12-05 10:07:21
Martin McCarthy <marty(_at_)ancient-scotland(_dot_)co(_dot_)uk> writes:

I didn't have a fall-through group set to catch any that didn't have a
matching Newsgroups header.  However, that has never occurred since I
only download from the groups that do match.  And in fact the messages
that show up in `1x..in' do have matches, as I showed in a previous
post.

*Now* I get it.  I hadn't properly understood earlier - I thought that
you only wanted to match the first listed newsgroup, but you want to
match the first matching newsgroup, which might not be the first listed
one.  I misread your explanation.

The 'Newsgroups:' condition as it stands will match only on the first
newsgroup in a list - that is, 'Newsgroups:' followed by whitespace,
followed by a newsgroup name.
But the newsgroup you want might be later in the list, since the list
might consist of:

Aha, so that was the screw-up.  It hadn't occurred to me that the
whitespace was anchoring matches to the first group listed.  No regex
expert here...

Thanks for the mini tutorial...


  Newsgroups: rec.fog.plaiting,gnu.emacs.help

and you want the 'gnu.emacs.help'.

A suitable recipe might be:

 :0
 * ^Newsgroups:(.*,)?[     ]*\/(...etc...)
 * ^Path:
 {
    DELIVERY=$MATCH

   :0 :
   1x.${DELIVERY}.in
 }

The extra '(.*,)?' will permit the condition to skip over any
non-matching newsgroups on the Newsgroups line, but will stop at the
first matching one it finds.  Does that make sense to you?
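
For anyone who wants to poke at this outside procmail, here is a rough
sketch in Python.  It is not procmail's regex engine, so treat it only as
an illustration of the idea.  The group list is hypothetical (it stands in
for the elided alternatives), the bracket expression is assumed to hold a
space and a tab, and the lazy `.*?' is my way of getting Python to stop at
the first matching group.

```python
import re

# Hypothetical group list standing in for the elided alternatives.
WANTED = r"(gnu\.emacs\.help|comp\.mail\.misc)"
WS = r"[ \t]*"  # assumption: the bracket expression holds a space and a tab

# Original shape: whitespace, then a wanted group, right after the colon.
first_only = re.compile(r"^Newsgroups:" + WS + WANTED)

# With the optional "anything up to a comma" prefix added.
skip_ahead = re.compile(r"^Newsgroups:(?:.*?,)?" + WS + WANTED)

header = "Newsgroups: rec.fog.plaiting,gnu.emacs.help"

first = first_only.search(header)   # no match: the first listed group is unwanted
later = skip_ahead.search(header)   # matches the second listed group
print(first, later.group(1))
```

The first pattern fails because the only thing allowed between the colon
and the wanted name is whitespace; the second succeeds because the optional
prefix can swallow `rec.fog.plaiting,' first.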

Yes.  But it would be nice if this kind of discussion were in the
procmail man pages.  Maybe it is and I'm missing it.  I don't expect
the kind of detail found in this thread, but for example: the fact
that perl-like non-greedy operators work, or that [     ] means
something special.

I don't think that is a normal regex operator.  And searching 3 of the
procmail man pages for \[.*\] finds many hits with something between
the brackets, but none like what is actually used in the recipe.

What is the explanation of that symbol ([    ]) anyway?  I knew to
use it from seeing it used in examples on this group, but one wouldn't
learn about it in `man proc*'.  At least not easily.

Just looking at the example above, and not having tested or
experimented yet, it appears that:
   * ^Newsgroups:.*[     ]*\/(...etc...)
Or maybe:
   * ^Newsgroups:(.*)[     ]*\/(...etc...)

should work as well.  Or would that only match the last group listed?
You can tell that I don't really understand the role played by
`[     ]'.
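
One way to probe that question without touching real mail is a small
Python experiment.  This is only a sketch: Python's `re' is not procmail's
engine, the group names are hypothetical, and the bracket expression is
assumed to be a space plus a tab.  It compares a greedy `.*', a lazy `.*?',
and the comma-anchored `(.*,)?' form on two made-up headers.

```python
import re

# Hypothetical wanted groups standing in for the elided alternatives.
WANTED = r"(gnu\.emacs\.help|comp\.lang\.c)"
WS = r"[ \t]*"  # assumption: the bracket expression holds a space and a tab

greedy = re.compile(r"^Newsgroups:.*" + WS + WANTED)
lazy = re.compile(r"^Newsgroups:.*?" + WS + WANTED)
anchored = re.compile(r"^Newsgroups:(?:.*?,)?" + WS + WANTED)

two_hits = "Newsgroups: gnu.emacs.help,rec.fog.plaiting,comp.lang.c"
inside = "Newsgroups: alt.comp.lang.c,rec.fog.plaiting"

# A greedy .* backtracks from the end, so it lands on the last wanted group.
last_group = greedy.search(two_hits).group(1)

# A lazy .*? grows from the left, so it lands on the first wanted group.
first_group = lazy.search(two_hits).group(1)

# Without the comma, .*? can stop in the middle of a longer group name.
mid_name = lazy.search(inside)

# With the comma, a wanted name must start the list or follow a comma.
anchored_inside = anchored.search(inside)

print(last_group, first_group, mid_name.group(1), anchored_inside)
```

In this sketch the bare `.*' variants still find a group, but which one
depends on greediness, and they can also fire inside `alt.comp.lang.c'.
The comma in `(.*,)?' is what ties the match to the start of a group name.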
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
