xsl-list

Re: [xsl] unparsed-text and normalize-space when parsing CSV files

2014-12-05 13:51:44
What about using:

            tokenize($csv, '\r\n|\r|\n')[not(position()=last() and .='')]
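
For context, here is a minimal sketch of how that expression might sit in a complete XSLT 2.0 stylesheet. The parameter name csv-uri, its default value data.csv, the initial template name main, and the rows/row/cell output elements are all assumptions made for illustration; none of them come from the original question. The comma split is deliberately naive and does not handle quoted fields.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical parameter: URI of the CSV file to read -->
  <xsl:param name="csv-uri" as="xs:string" select="'data.csv'"/>

  <!-- Assumed named initial template; no source document is required -->
  <xsl:template name="main">
    <!-- Read the whole file as one string; line endings arrive as stored -->
    <xsl:variable name="csv" as="xs:string" select="unparsed-text($csv-uri)"/>

    <!-- Split on CRLF, lone CR, or lone LF; CRLF is listed first so it is
         consumed as one separator. The predicate drops only the empty
         final token produced by a trailing line ending. -->
    <xsl:variable name="lines" as="xs:string*"
        select="tokenize($csv, '\r\n|\r|\n')[not(position()=last() and .='')]"/>

    <rows>
      <xsl:for-each select="$lines">
        <row>
          <!-- Naive field split: assumes no quoted or embedded commas -->
          <xsl:for-each select="tokenize(., ',')">
            <cell><xsl:value-of select="."/></cell>
          </xsl:for-each>
        </row>
      </xsl:for-each>
    </rows>
  </xsl:template>

</xsl:stylesheet>
```

Because the predicate tests position()=last() as well as .='', blank lines in the middle of the file are kept as empty strings; only the empty token after a final line ending is discarded.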


Cheers,
Dimitre

On Fri, Dec 5, 2014 at 11:36 AM, Hank Ratzesberger xml(_at_)xmlwerks(_dot_)com
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
Hi,

I ran into a strange issue where I was running transforms on a Windows
platform, but under Cygwin. I was trying to parse a csv file.

The problem was that I was defining a variable for the newline, which
I expected would match the native system:

<xsl:variable name="nl">
    <xsl:text>
</xsl:text>
</xsl:variable>

and then parse the file like this:

<xsl:variable name="lines" select="tokenize($csv, $nl)" as="xs:string+" />

but it turns out that this does not really solve the problem of
mixed-source line endings, since either file could have been
edited on a different system. I suspect this is a common problem
when parsing these kinds of files.

I was able to rely on normalize-space() to remove an extra CR, but
that function could make unwanted changes to other content.
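
To illustrate the concern, here is a small sketch under stated assumptions: the sample string and the template name main are made up for the demonstration. It contrasts normalize-space() with a narrower replace() that strips only a trailing CR.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Assumed named initial template; no source document is required -->
  <xsl:template name="main">
    <!-- Hypothetical CSV line: double spaces inside a field, CR at the end -->
    <xsl:variable name="line" as="xs:string"
        select="concat('a,  two  spaces ,c', codepoints-to-string(13))"/>

    <!-- normalize-space() removes the CR, but also trims the string and
         collapses every internal run of whitespace to a single space -->
    <xsl:value-of select="concat('[', normalize-space($line), ']&#10;')"/>

    <!-- replace() with '\r$' removes only a CR at the very end of the
         string and leaves the spacing inside the fields untouched -->
    <xsl:value-of select="concat('[', replace($line, '\r$', ''), ']&#10;')"/>
  </xsl:template>

</xsl:stylesheet>
```

The second form changes nothing except the stray CR, which may be preferable when the spacing inside fields is significant.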

Can anyone recommend a safe way to do this?

Thank you,
Hank


--
Hank Ratzesberger
XMLWerks.com
