procmail

Re: whole word recipe

2002-05-27 19:59:22
Steve Semple asked,

| The ? I find a bit confusing. You know even when I read regular
| expression descriptions over and over it seems even more cryptic.

It means "zero or one of the preceding" or, if you like, "[whatever] or
nothing."  In procmail, (.*\<)? means "either nothing or a string that ends
in a non-word character."
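
To see the "zero or one" reading in action, here is a minimal sketch using
Python's re module rather than procmail itself. The patterns and the helper
name whole_match are made up for illustration; only the behavior of ? is the
point.

```python
import re

# The ? makes the single preceding item optional: here, the letter "u".
optional_letter = re.compile(r"colou?r")

# After a parenthesized group, the ? makes the whole group optional.
optional_group = re.compile(r"(very )?good")

def whole_match(pattern, text):
    """Return True if the pattern matches the entire text."""
    return pattern.fullmatch(text) is not None

for text in ("color", "colour", "colouur"):
    print(text, whole_match(optional_letter, text))
for text in ("good", "very good", "very very good"):
    print(text, whole_match(optional_group, text))
```

Zero or one copy of the optional piece matches; two copies do not.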

| I'm confused about what the difference between a space and
| an inter-word space is; they sound the same to me.

Well, the man page doesn't differentiate between a space and an inter-word
space; it says, as you quoted,

Since they match actual characters, they are only suitable
to delimit words, not to delimit inter-word space.

The distinction is between delimiting words and delimiting the space between
words.

In egrep (and in perl, which writes both as \b), \< matches the *transition*
from a character that wouldn't be in a word (such as a space or a punctuation
mark) to a character that would be in a word (such as a letter or a digit),
and \> matches the *transition* from word to non-word.  In procmail, they
match a non-word character; there has to be a character there to match it (a
punctuation mark, space, tab, or newline).  So if you have

 hi there

as the text, and

 hi\>.*\<there

as the pattern, you'd get a match in egrep or perl but not in procmail, while

 hi\>(.*\<)?there

would match under either interpretation.  On the other hand, in procmail you
could test for two adjacent non-word characters (two punctuation marks, say)
with this:

 ()\>\<

while that would make no sense in egrep or perl.
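
The two interpretations can be tried side by side with a rough emulation.
This is only a sketch in Python, not procmail's or egrep's actual matcher:
the names EGREP_STYLE, PROCMAIL_STYLE and found are made up here, \< and \>
are rewritten by plain string substitution (so a pattern containing an
escaped backslash before < or > would be mishandled), and the procmail
reading is simplified to "exactly one non-word character," ignoring any
special handling at the start or end of the text.

```python
import re

# egrep-style reading: \< and \> are zero-width word/non-word transitions.
EGREP_STYLE = {r"\<": r"(?<!\w)(?=\w)", r"\>": r"(?<=\w)(?!\w)"}

# procmail-style reading, as described above (an approximation):
# each one consumes a single non-word character.
PROCMAIL_STYLE = {r"\<": r"\W", r"\>": r"\W"}

def found(pattern, text, style):
    """Rewrite the word delimiters per the given style, then search."""
    for token, replacement in style.items():
        pattern = pattern.replace(token, replacement)
    return re.search(pattern, text) is not None

for pattern in (r"hi\>.*\<there", r"hi\>(.*\<)?there"):
    print(pattern,
          "egrep-style:", found(pattern, "hi there", EGREP_STYLE),
          "procmail-style:", found(pattern, "hi there", PROCMAIL_STYLE))

# Two adjacent non-word characters: only meaningful in the procmail reading.
print(found(r"()\>\<", "hi, there", PROCMAIL_STYLE))
print(found(r"()\>\<", "hi, there", EGREP_STYLE))
```

Under the procmail-style reading the single space in "hi there" is used up by
\>, leaving nothing for \< to consume, which is why the first pattern fails
there and the version with the optional group succeeds.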

So what's that stuff about delimiting words but not delimiting inter-word
space?  It's that

 ()\<word\>

works in procmail as well as in perl or egrep.  In egrep or perl you can also
use

 \>{some pattern of spaces or punctuation marks}\<

to look for a place where two words are separated by a match to that pattern.
That wouldn't work in procmail, because you'd need an additional non-word
character on each end to match \> and \<.
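
As a concrete case, take "a comma followed by optional spaces" as the
separator pattern. The sketch below hand-translates both readings into
Python's re syntax; it is an approximation for illustration, with the
procmail reading modeled as a plain non-word character and all variable
names made up.

```python
import re

# Delimiting a word: \<word\>
word_egrep = re.compile(r"(?<!\w)(?=\w)word(?<=\w)(?!\w)")  # zero-width edges
word_procmail = re.compile(r"\Wword\W")  # one non-word character on each side

# Delimiting inter-word space: \>, *\<
gap_egrep = re.compile(r"(?<=\w)(?!\w), *(?<!\w)(?=\w)")
gap_procmail = re.compile(r"\W, *\W")  # needs extra non-word chars at both ends

# Both readings find a word that has non-word characters around it.
print(bool(word_egrep.search("a word here")),
      bool(word_procmail.search("a word here")))

# Only the zero-width reading finds the separator between two words.
print(bool(gap_egrep.search("apples, pears")),
      bool(gap_procmail.search("apples, pears")))

# The consuming reading matches only with a spare non-word character on each end.
print(bool(gap_procmail.search("apples ,  pears")))
```

In "apples, pears" the comma and space are the whole gap, so the consuming
reading has no extra character left over for \> before the comma or for \<
after the space.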




_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
