xsl-list

Re: [xsl] xslt 2.0 regex

2012-03-17 12:56:28
Using '|' for alternatives of single character patterns is
unnecessary. Omitting a few ranges and the anchor '$',
you can write, for instance:
<xsl:variable name="NameStartChar.re" as="xs:string">
[A-Z:_a-z&#xC0;-&#xD6;&#xD8;-&#xF6;&#xF8;-&#x2FF;&#x370;-&#x37D;.....]
</xsl:variable>

A single character class reduces the danger of failing due to operator
precedence. Then the final RE becomes

    ^[...][...]*$
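
A minimal sketch of that shape, written in Python's re module as a stand-in for the XSLT 2.0 regex engine (the two flavours differ in places, so treat it as an illustration only). The ranges are abbreviated, and the names name_start, name_char and is_name are made up for this example:

```python
import re

# Abbreviated ranges only; the full classes list many more code points.
shared = "A-Z:_a-z\u00C0-\u00D6\u00D8-\u00F6"
name_start = "[" + shared + "]"
# The trailing "-" is a literal hyphen because it is last in the class.
name_char = "[" + shared + "0-9.\u00B7-]"

# Same shape as ^[...][...]*$ : one start character, then zero or more
# name characters, anchored at both ends.
name_re = re.compile("^" + name_start + name_char + "*$")

def is_name(text):
    """True when the whole of text has the shape start-char, name-char*."""
    return name_re.match(text) is not None
```

Because each class is a single bracket expression, there is no '|' left whose precedence could go wrong.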

-W

On 17 March 2012 18:38, Brandon Ibach <brandon(_dot_)ibach(_at_)single-sourcing(_dot_)com> wrote:

Both XSLT 1.0 and 2.0 define variable names as QNames, so they *can*
have a ":" in them.

Ignoring that for the moment, though, Tony pointed out one consequence
of this, but the bigger issue is that the "|" operator in regex is
fairly low precedence, so you often need some parentheses around the
list of alternatives to get things right.  Your NameStartChar.re has a
hex-char-ref-encoded "$" at the beginning, so that regex is actually
"$[A-Z] | _ | [a-z] | ...", which means it will match "(a dollar sign
followed by an upper-case English letter) or (an underscore) or (a
lower-case English letter) or ...".
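
The precedence point can be checked with a small sketch. It uses Python's re module as a stand-in for the XSLT 2.0 engine, and the dollar sign is escaped so that only the grouping effect is on show. The names ungrouped, grouped and whole are invented for the example:

```python
import re

# Parsed as (\$[A-Z]) | (_) | ([a-z]): the "$" belongs to the first branch only.
ungrouped = r"\$[A-Z]|_|[a-z]"
# Parentheses make the "$" apply to every alternative.
grouped = r"\$([A-Z]|_|[a-z])"

def whole(pattern, text):
    """True when pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None
```

With the ungrouped pattern, "a" and "_" match on their own and "$a" does not; with the grouped pattern it is the other way round.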

Actually, it might not even match that, since the "$" is a special
character in regex, so you should escape it with a backslash to match
it literally (though I think there are some rules which allow it to
match literally even without the backslash, depending on what follows
it, but best to be explicit).
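
A quick sketch of the escaping point, again in Python's re module rather than XSLT, so it only shows the general behaviour of "$" as an anchor. The helper name finds is made up here, and the rules that might let an unescaped "$" match literally in other flavours are not tested:

```python
import re

def finds(pattern, text):
    """True when pattern matches somewhere in text."""
    return re.search(pattern, text) is not None

unescaped = r"$[A-Z]"      # "$" is an end anchor, so nothing can follow it
backslash = r"\$[A-Z]"     # literal dollar sign, then an upper-case letter
bracketed = r"[$][A-Z]"    # a character class is another way to be explicit
```

Here finds(unescaped, "$A") is false, while the backslash and bracketed forms both find the "$A".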

All that said, I got this to work by dropping the "$" from the start
of NameStartChar.re and changing Name.re to:

concat("\$(", $NameStartChar.re, ")(", $NameChar.re,")*")
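
As a rough check of that concat, here is the same string built in Python, with Python's re module standing in for the XSLT 2.0 engine. The two pieces are abbreviated, hypothetical stand-ins for $NameStartChar.re (without the leading "$") and $NameChar.re, and variable_refs is a helper invented for the example:

```python
import re

# Abbreviated stand-ins; the real variables list many more ranges.
name_start_char = r"[A-Z]|_|[a-z]|" + "[\u00C0-\u00D6]"
name_char = name_start_char + r"|-|\.|[0-9]"

# Python spelling of concat("\$(", $NameStartChar.re, ")(", $NameChar.re, ")*")
name_re = r"\$(" + name_start_char + ")(" + name_char + ")*"

def variable_refs(text):
    """Return every $name reference found in text, in order."""
    return [m.group(0) for m in re.finditer(name_re, text)]
```

With these stand-ins, variable_refs("$fred") gives the whole reference rather than single letters.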

-Brandon :)


On Sat, Mar 17, 2012 at 12:43 PM, davep <davep(_at_)dpawson(_dot_)co(_dot_)uk> wrote:
On 17/03/12 16:29, Tony Graham wrote:

On Sat, March 17, 2012 4:14 pm, davep wrote:
...

It's still not working

<xsl:variable name="NameStartChar.re" as="xs:string">
  &#x024;[A-Z]|_|[a-z] |
  [&#xC0;-&#xD6;] | [&#xD8;-&#xF6;] |
  [&#xF8;-&#x2FF;] | [&#x370;-&#x37D;] |
  [&#x37F;-&#x1FFF;] | [&#x200C;-&#x200D;] |
  [&#x2070;-&#x218F;] | [&#x2C00;-&#x2FEF;] |
  [&#x3001;-&#xD7FF;] | [&#xF900;-&#xFDCF;] |
  [&#xFDF0;-&#xFFFD;] | [&#x10000;-&#xEFFFF;]
</xsl:variable>

<xsl:variable name="NameChar.re"  as="xs:string"
             select="concat($NameStartChar.re,' |
- | \. | [0-9] |&#xB7; | [&#x0300;-&#x036F;] |
[&#x203F;-&#x2040;]')"/>


<xsl:variable name='Name.re'
             select='concat($NameStartChar.re,
"(", $NameChar.re,")*")'/>


Why not use '\i' and '\c' from
http://www.w3.org/TR/xmlschema-2/#charcter-classes?


For which range please Tony? Err....

\i includes : which is wrong?
\c looks good though! Ah no. Again it's NameChar from
http://www.w3.org/TR/2000/WD-xml-2e-20000814#NT-NameChar
which is more than allowed for xsl:variable @name?



Otherwise, you may want '(' and ')' around $NameStartChar.re in
$Name.re. Without them (to mix variable expansions) it looks like
'...|[&#x203F;-&#x2040;]($NameChar.re)*' and you'll only match
multi-character names when they begin with a character in the range
[&#x203F;-&#x2040;].
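
The effect of the missing parentheses can be reproduced with a small sketch in Python's re module (a stand-in for the XSLT 2.0 engine, so details may differ). The pieces are abbreviated, hypothetical versions of the two variables, with the "$" escaped, and matches is a helper invented here:

```python
import re

# Abbreviated stand-ins for $NameStartChar.re and $NameChar.re.
name_start_char = r"\$[A-Z]|_|[a-z]|" + "[\u00C0-\u00D6]"
name_char = name_start_char + r"|-|\.|[0-9]"

# No parentheses: "(...)*" attaches only to the last alternative.
ungrouped = name_start_char + "(" + name_char + ")*"
# Parentheses around the start-character alternatives.
grouped = "(" + name_start_char + ")(" + name_char + ")*"

def matches(pattern, text):
    """Return each successive match of pattern in text."""
    return [m.group(0) for m in re.finditer(pattern, text)]
```

With the ungrouped pattern, "$fred" yields four one-letter matches, and only a name starting in the last range comes back whole. The grouped pattern returns "fred" in one piece; the leading "$" is a separate issue.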


As I read it (or more accurately fail to read it correctly)
It's NameChar less :
followed by (Name less :)+

Simpler version [A-Za-z0-9]+ and the i18N additions,
but I can't get the simpler one working.





It's only matching on the first letter of a variable currently....

<xsl:variable select="$fred"/>

Produces
"xsl:variable"
           [f]
"xsl:variable"
           [r]
"xsl:variable"
           [e]
"xsl:variable"
           [d]

so something is seriously wrong.




Regards,


Tony Graham                                   tgraham(_at_)mentea(_dot_)net
Consultant                                 http://www.mentea.net
Mentea       13 Kelly's Bay Beach, Skerries, Co. Dublin, Ireland
 --  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --
    XML, XSL-FO and XSLT consulting, training and programming


--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--





regards

--
Dave Pawson
XSLT XSL-FO FAQ.
http://www.dpawson.co.uk
