Matching Word Boundaries with \< and \>, and Extended Character Sets

Hello Folks,

        It has been quite a long while since I last posted to this
mailing list.

        I think that I have found a deficiency in the word-boundary
match operators, \< and \>, of procmail v3.11pre7.  How should they
treat, for example, ISO-Latin-1 characters at word boundaries?  At
present such characters are not considered to be in the same class as
the normal alphabetic letters.

For example,

:0 B
* \<lance\>

will match against the French word "élance".
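
The effect can be reproduced outside procmail. The following is a
minimal sketch in Python, not procmail's own matcher: it assumes that
an ASCII-only word class behaves like the classification described
above, and compares it with a Unicode-aware one. The names TEXT,
ASCII_WORD, UNICODE_WORD and found are made up for this illustration.

```python
import re

# "\u00e9lance" is "élance" written with a precomposed e-acute (U+00E9).
TEXT = "\u00e9lance"

# ASCII-only word class: only [A-Za-z0-9_] count as word characters,
# so the position between "é" and "l" is treated as a word boundary.
ASCII_WORD = re.compile(r"\blance\b", re.ASCII)

# Unicode-aware word class: "é" is a letter, so there is no boundary
# before "l" and the pattern does not match inside "élance".
UNICODE_WORD = re.compile(r"\blance\b")


def found(pattern, text):
    """Return True if the compiled pattern matches anywhere in text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    print(found(ASCII_WORD, TEXT))    # ASCII-only classification
    print(found(UNICODE_WORD, TEXT))  # Unicode-aware classification
```

Run as a script, this prints True for the ASCII-only pattern and False
for the Unicode-aware one, which is the difference in behaviour I am
asking about.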

Can the match algorithm be corrected to allow for extended character
sets?  Of course, "\<" could be replaced by a big "[...]" enumerating
the special non-alphabetic characters, but this may be more costly in
terms of execution speed.
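
As a rough illustration of the "[...]" workaround, here is a sketch in
Python rather than in procmail syntax. It assumes that the letters of
interest are the ISO-Latin-1 range U+00C0 to U+00FF, minus the
multiplication and division signs. WORD, word_pattern and LANCE are
hypothetical names. The pattern consumes the character on either side
of the word, as a spelled-out character class would.

```python
import re

# Assumed word class: ASCII word characters plus the ISO-Latin-1 letters
# (U+00C0-U+00FF without U+00D7 and U+00F7, which are not letters).
WORD = r"A-Za-z0-9_\u00c0-\u00d6\u00d8-\u00f6\u00f8-\u00ff"


def word_pattern(word):
    """Match `word` only when no WORD character sits directly next to it."""
    return re.compile(
        "(?:^|[^" + WORD + "])" + re.escape(word) + "(?:$|[^" + WORD + "])",
        re.MULTILINE,  # let ^ and $ match at every line of a message body
    )


LANCE = word_pattern("lance")

if __name__ == "__main__":
    for sample in ("\u00e9lance", "je lance la balle", "balance"):
        print(repr(sample), LANCE.search(sample) is not None)
```

With this class, "élance" and "balance" are rejected while "je lance la
balle" is accepted. Whether such an enumeration is noticeably slower
than a built-in operator is not something this sketch measures.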

Happy Easter,

        --Ralph

Dr. Ralph P. Sobek                Disclaimer: The above ruminations are my own.
Ralph(_dot_)Sobek(_at_)irit(_dot_)fr  Addresses are ordered by importance.
sobek(_at_)irit(_dot_)fr              If all else fails, try:
newsmaster(_at_)irit(_dot_)fr, postmaster(_at_)irit(_dot_)fr  sobek(_at_)diva(_dot_)eecs(_dot_)berkeley(_dot_)edu
Ph:(+33)[0]561558618  FAX:(+33)[0]561556258  http://www.irit.fr/~Ralph.Sobek/
===============================================================================
Urgent!! Greenhouse Effect: http://www.irit.fr/~Ralph.Sobek/greenhouse.html
