On 9 May 2016 at 18:17, Costello, Roger L. costello(_at_)mitre(_dot_)org
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
Hi Folks,
I am writing an XSLT program to read unparsed-text. I want my program to work
regardless of the end-of-line marker used by the unparsed-text.
Here are the end-of-line markers that are typically used in files, I think:
CR + LF
LF
CR
Below is code that I wrote to determine the end-of-line marker used in the
unparsed-text. Is there a better way to determine the end-of-line marker?
Note: $file is a variable that contains the unparsed-text.
<xsl:variable name="end-of-line">
    <xsl:choose>
        <!-- CR + LF (decimal 13 followed by decimal 10) -->
        <xsl:when test="contains($file, codepoints-to-string((13, 10)))">
            <xsl:value-of select="codepoints-to-string((13, 10))"/>
        </xsl:when>
        <!-- LF (decimal 10) -->
        <xsl:when test="contains($file, codepoints-to-string(10))">
            <xsl:value-of select="codepoints-to-string(10)"/>
        </xsl:when>
        <!-- CR (decimal 13) -->
        <xsl:when test="contains($file, codepoints-to-string(13))">
            <xsl:value-of select="codepoints-to-string(13)"/>
        </xsl:when>
        <!-- Perhaps the input file consists of just one line and
             there is no end-of-line marker!
             What would be an appropriate value in this
             situation? An error? -->
        <xsl:otherwise>
            <xsl:value-of select="error(QName('http://example.com/', 'EOL-err'), 'No end-of-line symbol')"/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:variable>
--~----------------------------------------------------------------
I think it's usually better to just write the code so that it works
with any EOL marker. Any test such as the one you suggest above will
get confused by, for example, the (not uncommon) case of files with
inconsistent line endings.
For example, if you have a text file file.txt, then
<xsl:param name="filelines"
    select="tokenize(unparsed-text('file.txt'),'[\r\n]+')"/>
will give you a sequence of strings where any run of U+000A and
U+000D characters counts as a single separator.
(If you need to treat \r\n\r\n as containing a blank line, you need a
slightly different regexp such as \r\n|\r|\n, with \r\n listed first so
that a CR+LF pair is matched as one separator rather than two.
The version above intentionally drops blank lines.)
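As a rough illustration of the difference between the two regexps, here is a minimal XSLT 2.0 sketch. The sample string and the named template "main" are made up for the example; with Saxon you would start it with something like -it:main, which is an assumption about your processor. The \r\n alternative is listed first because XPath's tokenize picks the first alternative that matches at a given position.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical sample with mixed endings:
       a CRLF CRLF b LF c CR d  (the second CRLF makes a blank line) -->
  <xsl:variable name="sample"
      select="concat('a', codepoints-to-string((13, 10, 13, 10)),
                     'b', codepoints-to-string(10),
                     'c', codepoints-to-string(13), 'd')"/>

  <xsl:template name="main">
    <!-- Any run of CR/LF is one separator, so the blank line is dropped. -->
    <xsl:text>runs collapsed: </xsl:text>
    <xsl:value-of select="count(tokenize($sample, '[\r\n]+'))"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Each CRLF, CR or LF is one separator, so the blank line is kept. -->
    <xsl:text>blank lines kept: </xsl:text>
    <xsl:value-of select="count(tokenize($sample, '\r\n|\r|\n'))"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

For this sample the first count should be 4 (a, b, c, d) and the second 5 (a, an empty string, b, c, d).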
XPath 3.0 provides a function, unparsed-text-lines, that does more or
less exactly this (probably more efficiently than running a regexp
over the whole file contents):
https://www.w3.org/TR/xpath-functions-30/#func-unparsed-text-lines
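A minimal sketch of how that function might be used, assuming an XSLT 3.0 processor and the same file.txt next to the stylesheet; the parameter name "filelines" and the named template "main" are just illustrative. Note that, unlike the [\r\n]+ version, unparsed-text-lines keeps blank lines (only an empty string after a final line ending is dropped).

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- One string per line; CRLF, CR and LF are all accepted as line endings.
       The relative URI is resolved against the stylesheet's base URI. -->
  <xsl:param name="filelines" select="unparsed-text-lines('file.txt')"/>

  <xsl:template name="main">
    <!-- Print each line prefixed with its line number. -->
    <xsl:for-each select="$filelines">
      <xsl:value-of select="position(), ': ', ., '&#10;'" separator=""/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```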
David