On 2/21/07, Houghton,Andrew <houghtoa(_at_)oclc(_dot_)org> wrote:
> From: Andrew Welch [mailto:andrew(_dot_)j(_dot_)welch(_at_)gmail(_dot_)com]
> Sent: 21 February, 2007 14:07
> To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
> Subject: [xsl] tokenizing comma separated string with quotes
>
> Given the input
>
> <elem>"foo, bar", baz, bom</elem>
>
> Is there a nice one liner / technique to return the three
> tokens "foo, bar" "baz" "bom"
>
> eg:
>
> <root>
> <token>foo, bar</token>
> <token>baz</token>
> <token>bom</token>
> </root>
>
> I can't see the answer for all the apparent quote escaping required...
If you are using XSLT 2.0, this should work:
Input file:
<elem>"foo, bar", baz, bom</elem>
Transform file:
<xsl:transform version="2.0"
    exclude-result-prefixes="xsd xsi xsl"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
  <xsl:output method="xml" version="1.0"
      media-type="text/xml" encoding="utf-8"
      omit-xml-declaration="yes" indent="yes"
  />
  <!-- A token is either a double-quoted run (which may contain commas)
       or a run of non-comma characters. Leading whitespace is consumed
       outside the capture group. -->
  <xsl:variable name="regex">
    <xsl:text>\s*("[^"]*"|[^,]+)\s*</xsl:text>
  </xsl:variable>
  <xsl:template match="/">
    <root>
      <!-- Text between matches (the separating commas) is dropped because
           there is no xsl:non-matching-substring. -->
      <xsl:analyze-string regex="{$regex}" select="/elem">
        <xsl:matching-substring>
          <token><xsl:value-of select="regex-group(1)"/></token>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </root>
  </xsl:template>
</xsl:transform>
Output file:
<root>
  <token>"foo, bar"</token>
  <token>baz</token>
  <token>bom</token>
</root>
Thanks Andrew - is there a modification to the regex to not include
the surrounding quotes?
eg
<token>"foo, bar"</token>
should be
<token>foo, bar</token>
I can translate away the quotes, but it would be nice if the regex
could be modified to do the same thing.
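One possible modification, offered as an untested sketch rather than a definitive answer: give the quoted and the unquoted alternative each their own capture group, keep the quote characters outside the quoted group, and concatenate the two groups. XPath 2.0 regular expressions have no non-capturing groups, and regex-group() returns a zero-length string for a group that took no part in the match, so concat() yields whichever alternative matched. The fragment is meant to replace the variable and template in the transform above; it assumes a quoted field never contains an embedded double quote.

```xml
<!-- Group 1: contents of a quoted field, without the quotes.
     Group 2: an unquoted field. -->
<xsl:variable name="regex">
  <xsl:text>\s*"([^"]*)"\s*|\s*([^,]+)\s*</xsl:text>
</xsl:variable>
<xsl:template match="/">
  <root>
    <xsl:analyze-string regex="{$regex}" select="/elem">
      <xsl:matching-substring>
        <!-- At most one of the two groups is non-empty for a given match. -->
        <token><xsl:value-of select="concat(regex-group(1), regex-group(2))"/></token>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </root>
</xsl:template>
```

For the sample input this should give the tokens foo, bar / baz / bom without the surrounding quotes. An unquoted field can still carry trailing whitespace, since [^,]+ is greedy. Wrapping the value in normalize-space() would trim it, at the cost of also collapsing runs of whitespace inside a field.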