On 7/10/07, Andrew Welch <andrew(_dot_)j(_dot_)welch(_at_)gmail(_dot_)com> wrote:
> On 7/10/07, Michael Kay <mike(_at_)saxonica(_dot_)com> wrote:
> > Haven't worked out the detail, but it seems to me that if you add a trailing
> > comma at the end of the string, you can then do
> >
> > <xsl:analyze-string select="concat($in, ',')" regex='("[^"]*"|[^,]*),'>
> >   <xsl:matching-substring>
> >     <token><xsl:value-of select="regex-group(1)"/></token>
> >   </xsl:matching-substring>
> > </xsl:analyze-string>
>
> Hmm, seems to work.
>
> > Doesn't strip the quotes off, but that part's easy.
>
> It is, especially as Abel wrote it for me :)
> I'll try it out and then write it up, thanks!

I had to modify it to cope with nested (doubled) quotes inside a quoted
value, such as "foo, ""bar""" - this is what I came up with:

<xsl:function name="fn:getTokens" as="xs:string+">
  <xsl:param name="str" as="xs:string"/>
  <!-- Append a comma so that every field, including the last, ends with one -->
  <xsl:analyze-string select="concat($str, ',')" regex='(("[^"]*")+|[^,]*),'>
    <xsl:matching-substring>
      <!-- Drop the enclosing quotes and turn each doubled quote into a single one -->
      <xsl:sequence select='replace(regex-group(1), "^""|""$|("")""", "$1")'/>
    </xsl:matching-substring>
  </xsl:analyze-string>
</xsl:function>

I think it's a neat use of regex-group to capture both sides of the
pipe (quoted and unquoted values) but not the trailing comma. Any
comments welcome.
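
For anyone who wants to try the function on its own, here is a minimal
sketch of a stylesheet that calls it. This is only an illustration, not
the posted transform: the namespace URI bound to fn, the template name
"main", the element names and the sample line are all assumptions made
up for the example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:fn="urn:example:csv-functions"
    exclude-result-prefixes="xs fn">
  <!-- The fn namespace URI above is a placeholder; any non-reserved URI will do -->

  <xsl:output indent="yes"/>

  <!-- The tokenizer from this message, unchanged apart from layout -->
  <xsl:function name="fn:getTokens" as="xs:string+">
    <xsl:param name="str" as="xs:string"/>
    <xsl:analyze-string select="concat($str, ',')" regex='(("[^"]*")+|[^,]*),'>
      <xsl:matching-substring>
        <xsl:sequence select='replace(regex-group(1), "^""|""$|("")""", "$1")'/>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:function>

  <!-- Hypothetical entry point: tokenizes one hard-coded sample line -->
  <xsl:template name="main">
    <xsl:variable name="line" as="xs:string">a,"foo, ""bar""",,c</xsl:variable>
    <row>
      <xsl:for-each select="fn:getTokens($line)">
        <elem><xsl:value-of select="."/></elem>
      </xsl:for-each>
    </row>
  </xsl:template>

</xsl:stylesheet>
```

Run with "main" as the initial template (for example Saxon's -it:main
option), this should give a row with four elem children: a, then
foo, "bar" with the comma kept and the doubled quotes collapsed, then
an empty value, then c.
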
I've posted the complete transform here:
http://andrewjwelch.com/code/xslt/csv/csv-to-xml_v2.html
cheers
andrew
--
http://andrewjwelch.com