xsl-list

RE: [xsl] lookaheads in XSLT2 regexes

2010-03-01 11:52:41

> I didn't realise we were missing \b -- we should add it, if
> that's the case.


I think it was omitted deliberately, on the grounds that it's
locale-sensitive. It's defined in Perl as matching "a spot between two
characters that has a \w on one side of it and a \W on the other side of it
(in either order)", where \w matches a "word" character (defined as
"alphanumeric" plus "_"), in which "the list of alphabetic characters
generated by \w is taken from the current locale". That's not an acceptable
definition for our purposes, so it's arguably better to have no definition
at all.
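
To make the quoted definition concrete, here is a minimal sketch in Python
(not XPath, whose regex dialect has no lookarounds). It spells out "a \w on
one side and a \W on the other" with lookaround assertions and compares the
result with Python's built-in \b. The re.ASCII flag is used only as a rough
stand-in for a change of locale, and the helper name boundary_offsets is
made up for illustration.

```python
import re

# "\w on one side, \W on the other (in either order)", written out with
# lookarounds. The start and end of the string count as the \W side.
EXPLICIT = r"(?<=\w)(?!\w)|(?<!\w)(?=\w)"

def boundary_offsets(pattern, text, flags=0):
    """Offsets of every zero-width match of `pattern` in `text`."""
    return [m.start() for m in re.finditer(pattern, text, flags)]

text = "ab cd"
explicit = boundary_offsets(EXPLICIT, text)
builtin = boundary_offsets(r"\b", text)

# What counts as a "word" depends on the matching mode: under Unicode
# rules the i-diaeresis is a word character, under ASCII rules it is not.
word = "na\u00efve"
unicode_words = re.findall(r"\w+", word)
ascii_words = re.findall(r"\w+", word, re.ASCII)
```

For "ab cd" the two offset lists agree, while the word list for "naïve"
changes with the mode, which is the kind of instability that makes the
Perl wording hard to adopt as it stands.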

We could perhaps define \w to match "alphanumeric" as the term is used in
xsl:number (categories Nd, Nl, No, Lu, Ll, Lt, Lm or Lo) and then it's a
well-defined concept, though not necessarily one that matches user
expectations.
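
As a rough illustration of what such a definition would and would not
cover, the following Python sketch tests characters against the eight
categories listed above. The function name is_alphanumeric is hypothetical;
it only mirrors the category list and is not taken from any XSLT processor.

```python
import unicodedata

# The general categories named above for xsl:number's "alphanumeric".
ALPHANUMERIC = {"Nd", "Nl", "No", "Lu", "Ll", "Lt", "Lm", "Lo"}

def is_alphanumeric(ch):
    """True if the single character `ch` falls in one of the categories."""
    return unicodedata.category(ch) in ALPHANUMERIC

# a (Ll), 7 (Nd), _ (Pc), ROMAN NUMERAL EIGHT (Nl),
# VULGAR FRACTION ONE HALF (No), COMBINING ACUTE ACCENT (Mn)
results = {ch: is_alphanumeric(ch) for ch in "a7_\u2167\u00bd\u0301"}
```

Under this definition the underscore and a combining accent fall outside
the class while a vulgar fraction falls inside it, which shows where it
may part company with what users expect of a "word" character.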

The fact that Perl overloads \b to mean backspace when within a character
class doesn't help.
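
Python's re module carries the same overloading, so it can serve as a
convenient stand-in for Perl in a small demonstration:

```python
import re

text = "a\x08b"  # "a", a backspace character, "b"

# Outside a character class, \b is a zero-width word boundary.
outside = re.findall(r"\b", text)

# Inside a character class, the same escape matches a literal backspace.
inside = re.findall(r"[\b]", text)
```

Here the first call yields only empty (zero-width) matches, and the second
yields the backspace character itself.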

And one feels that if it's useful to have a metacharacter that matches the
spot between a character in one character class and a character in its
complement, then one ought to generalize the concept so it works with any
character class, not just the rather arbitrary class containing Nd, Nl, No,
Lu, Ll, Lt, Lm and Lo.
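
One way to picture that generalization, sketched in Python because it
relies on lookarounds that XPath 2.0 regexes do not have: build the
boundary assertion from whatever character class the caller supplies. The
name boundary_for is hypothetical, and the class text is inserted verbatim,
so it is assumed to be valid inside square brackets.

```python
import re

def boundary_for(char_class):
    """Zero-width pattern matching between a character in `char_class`
    and a character outside it, in either order. String start and end
    count as "outside". `char_class` is the body of a bracket expression,
    for example "0-9", and is not escaped here."""
    inside = f"[{char_class}]"
    return re.compile(
        f"(?<={inside})(?!{inside})|(?<!{inside})(?={inside})"
    )

digit_edge = boundary_for("0-9")
digit_edges = [m.start() for m in digit_edge.finditer("ab12cd")]

# Splitting on a zero-width pattern needs Python 3.7 or later.
pieces = digit_edge.split("ab12cd")
```

With the digit class, the boundaries fall on either side of "12", so the
split separates the digits from the surrounding letters.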

Regards,

Michael Kay
http://www.saxonica.com/
http://twitter.com/michaelhkay  

