On 01.03.2013 11:40, Michael Kay wrote:
> (b) they wanted to exclude anything that didn't make sense in an
> international Unicode context (so things like word boundaries were
> immediately suspect)
If they had been concerned about what counts as a word constituent in a
particular language, they wouldn’t have included \w and \W in
http://www.w3.org/TR/xmlschema-2/#cces
\w is defined there, independently of any locale, as:
[#x0000-#x10FFFF]-[\p{P}\p{Z}\p{C}] (all characters except the set of
"punctuation", "separator" and "other" characters)
So I think \b, defined as a \w-\W boundary, a \W-\w boundary, the start
anchor, or the end anchor, is also perfectly well-defined and works as
expected in most circumstances.
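
A minimal sketch of both definitions in Python, in case it helps to make
the point concrete. It assumes that the general categories reported by
the standard unicodedata module correspond to the \p{P}, \p{Z} and \p{C}
classes of the schema spec (this depends on the Unicode version of the
Python build). The function names are made up for illustration and are
not part of any spec or library.

```python
import unicodedata


def is_xsd_word_char(ch):
    # XSD-style \w: any character whose Unicode general category is not
    # punctuation (P*), separator (Z*) or other (C*).
    return unicodedata.category(ch)[0] not in ("P", "Z", "C")


def word_boundaries(text):
    # Offsets 0..len(text) at which \b, read literally as defined above,
    # would match: the start, the end, and every \w-\W or \W-\w transition.
    last = len(text)
    positions = []
    for i in range(last + 1):
        if i == 0 or i == last:
            # Start and end anchors count unconditionally in this reading.
            positions.append(i)
        elif is_xsd_word_char(text[i - 1]) != is_xsd_word_char(text[i]):
            positions.append(i)
    return positions


print(word_boundaries("ab, cd"))  # [0, 2, 4, 6]
```

Two things to be aware of: under this \w, "_" is punctuation (Pc) and
therefore not a word character, while symbols such as "+" are. And the
literal reading treats the start and the end as boundaries even when the
adjacent character is \W, which, as far as I know, differs from
Perl-style \b.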
Gerrit