On 2011-05-19 22:24, Imsieke, Gerrit, le-tex wrote:
> On 2011-05-19 21:16, Julian Reschke wrote:
>> On 2011-05-19 20:51, Brandon Ibach wrote:
>>> For 2), if you're using the regex to both validate the input (making
>>> sure it conforms to the required syntax) and parse/extract the
>>> name/value pairs, you might be able to make the job easier by breaking
>>> these two tasks apart. Use the regex as you have it now to validate
>>> the input and then, if it matches, use a shorter regex that matches
>>> just a single name/value pair with analyze-string to do the actual
>>> processing.
>>> -Brandon :)
>>
>> That's more or less what I do now. But as long as the regex contains a
>> repeating pattern, <xsl:matching-substring> will only be invoked once,
>> and the regex-group function will only return the contents for the last
>> match, right?
>
> I think it depends on the implementation. I couldn't see anything in the
> spec about what regex-group(3) of
>   ([a-z]+)=([a-z]+)(;([a-z]+)=([a-z]+))*
> should be. In Saxon, it's ';e=f' for your example, but in principle it
> could also be ';c=d'.
>
> As Brandon pointed out, using analyze-string with a repeating pattern
> that matches the entire string is not the best approach. There are more
> natural approaches that work without recursion. I sketched two of them
> below.
> ..

Wow, thanks for the feedback.
What I did not mention in my mail is that I simplified things. First of
all, tokenize() won't work, because the separator needs to take context
into account (the right-hand side can be a quoted string, which can
itself contain the ";").
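
To illustrate the tokenize() problem, here is a minimal sketch. The header
value is made up for illustration (it is not taken from the test cases), and
the stylesheet is meant to be run from a named initial template (for
example with Saxon's -it:main option):

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template name="main">
    <!-- Hypothetical header value: the quoted string contains a ";" -->
    <xsl:variable name="value"
        select="'attachment; filename=&quot;a;b.txt&quot;'"/>
    <!-- Naive split on ";" also cuts the quoted string in two -->
    <xsl:for-each select="tokenize($value, ';\s*')">
      <xsl:value-of select="concat('[', ., ']&#10;')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

This yields three pieces (attachment, filename="a, and b.txt") rather than
the two components the syntax calls for.
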
Also, the syntax is slightly more complex; the first component differs
from the other components.
What I'm trying to parse is an HTTP header field syntax, shared by
header fields like Content-Type or Content-Disposition:

  value = name ( ";" param )*
  name  = token
  param = token "=" (token | quoted-string)
  ...

(in IETF ABNF speak).
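
A rough sketch of how a regex for a single ( ";" param ) could be used with
analyze-string, so that <xsl:matching-substring> fires once per parameter.
Several things here are assumptions rather than part of the grammar above:
token is approximated as "anything except whitespace, ';', '=' and '"'",
quoted-string is assumed to be a double-quoted string with backslash
escapes (its definition is elided above), optional whitespace after ";" is
tolerated, and the header value and the <params>/<param> element names are
made up. XPath 2.0 regexes have no non-capturing groups, so the quoted
branch adds a third group that is simply ignored.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template name="main">
    <!-- Hypothetical header value, for illustration only -->
    <xsl:variable name="value"
        select="'attachment; filename=&quot;a;b.txt&quot;; size=42'"/>
    <params>
      <!-- group 1: parameter name
           group 2: raw value (token, or quoted string incl. quotes) -->
      <xsl:analyze-string select="$value"
          regex=";\s*([^\s;=&quot;]+)=(&quot;([^&quot;\\]|\\.)*&quot;|[^\s;=&quot;]+)">
        <xsl:matching-substring>
          <param name="{regex-group(1)}" value="{regex-group(2)}"/>
        </xsl:matching-substring>
        <!-- the leading name falls into the non-matching part
             and is ignored in this sketch -->
      </xsl:analyze-string>
    </params>
  </xsl:template>
</xsl:stylesheet>
```

Because the quoted branch consumes everything up to the closing quote, the
";" inside the quoted string is not treated as a separator.
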
The actual code I currently have and which works is in
http://greenbytes.de/tech/tc2231/tc2231.xslt
to be applied to
http://greenbytes.de/tech/tc2231/tc2231.xml
I currently have one template for matching the whole expression, which
delegates to another one for

  ( ";" param )*

which itself matches the first param, and then recurses. This can
probably be simplified as in your "as" example.
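
For comparison, a non-recursive variant along the lines Brandon suggested
might look roughly like this: validate the whole value once with an
anchored regex, then extract the params with analyze-string. This is only
a sketch under assumptions: the token and quoted-string patterns are
simplified guesses (a double-quoted string with backslash escapes,
optional whitespace after ";"), and the my: namespace, the function name
my:parse, the variable names and the <name>/<param> output elements are
all hypothetical.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.org/my"
    exclude-result-prefixes="xs my">
  <xsl:output method="xml" indent="yes"/>

  <!-- Simplified building blocks (assumptions, not the full HTTP grammar) -->
  <xsl:variable name="token-re" select="'[^\s;=&quot;]+'"/>
  <xsl:variable name="quoted-re" select="'&quot;([^&quot;\\]|\\.)*&quot;'"/>
  <!-- one ";" param: group 1 = name, group 2 = raw value -->
  <xsl:variable name="param-re"
      select="concat(';\s*(', $token-re, ')=(', $quoted-re, '|', $token-re, ')')"/>
  <!-- whole value: group 1 = name, group 2 = all params as one string -->
  <xsl:variable name="whole-re"
      select="concat('^(', $token-re, ')((', $param-re, ')*)$')"/>

  <xsl:function name="my:parse" as="element()*">
    <xsl:param name="value" as="xs:string"/>
    <!-- Step 1: validate; invalid input yields the empty sequence -->
    <xsl:if test="matches($value, $whole-re)">
      <name><xsl:value-of select="replace($value, $whole-re, '$1')"/></name>
      <!-- Step 2: one matching-substring call per param, no recursion -->
      <xsl:analyze-string select="replace($value, $whole-re, '$2')"
          regex="{$param-re}">
        <xsl:matching-substring>
          <param name="{regex-group(1)}" value="{regex-group(2)}"/>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </xsl:if>
  </xsl:function>

  <xsl:template name="main">
    <result>
      <!-- Hypothetical input, for illustration only -->
      <xsl:sequence
          select="my:parse('attachment; filename=&quot;a;b.txt&quot;')"/>
    </result>
  </xsl:template>
</xsl:stylesheet>
```

The repeating group is only used for validation here, so the question of
what regex-group() returns for a repeated group does not arise.
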
Thanks for the feedback, Julian
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list