xsl-list

Re: [xsl] For-each adds whitespace per iteration: why?

2014-01-10 14:56:05
I think I was also still laboring under the XSLT 1 definition of
<xsl:value-of/>. Taking the time to re-read the definition of value-of in
XSLT 2, I see that it explicitly generates text nodes. I don't think I ever
realized that before.

This changes everything (at least in the way I approach constructing
result text).

Cheers,

Eliot
-- 
Eliot Kimber
Senior Solutions Architect
"Bringing Strategy, Content, and Technology Together"
Main: 512.554.9368
www.reallysi.com
www.rsuitecms.com




On 1/10/14, 12:59 PM, "Eliot Kimber" <ekimber@rsicms.com> wrote:

Ah, that explains it. I have gotten in the habit of preferring
<xsl:sequence> over <xsl:value-of> but this is apparently one place where
I should not have.

A subtle aspect of the spec.

I'm glad my understanding of "concatenation" in this context was
incorrect.

The key bit from 5.7.1 appears to be step 3 of the sequence processing rules:

"3. Any consecutive sequence of strings within the result sequence is
converted to a single text node, whose string value
<http://www.w3.org/TR/xslt20/#dt-string-value> contains the content of
each of the strings in turn, with a single space (#x20) used as a
separator between successive strings."

That makes things clear.

Cheers,

Eliot



On 1/10/14, 11:27 AM, "Michael Kay" <mike@saxonica.com> wrote:


On 10 Jan 2014, at 17:05, Eliot Kimber <ekimber@rsicms.com> wrote:

In the context of writing an XSLT to generate DTD syntax from RNGs (for
DITA 1.3) I discovered that for-each results in whitespace being emitted
for each iteration. This came as a surprise. Reading the spec it says,
under clause 7, Repetition:

"For each item in the input sequence, evaluating the sequence constructor
<http://www.w3.org/TR/xslt20/#dt-sequence-constructor> produces a sequence
of items (see 5.7 Sequence Constructors
<http://www.w3.org/TR/xslt20/#sequence-constructors>). These output
sequences are concatenated; ...

I understand "These output sequences are concatenated" to mean that string
concatenation rules are applied, which explains the white space.

No, this is a concatenation of two or more sequences to produce a single
sequence. No whitespace is added at this point.


My question: why is for-each defined in this way?

It isn't.


I tested this with this little XSLT transform:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xs="http://www.w3.org/2001/XMLSchema"
 xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
 exclude-result-prefixes="xs xd"
 version="2.0">

 <xsl:output method="text"/>

 <xsl:template name="test-for-each">
   <xsl:variable name="strings" select="('one', 'two', 'three',
'four')"/>
value-of $strings=<xsl:value-of select="$strings"/>
for $str in $strings return concat('/', $str, '/')=<xsl:sequence
select="for $str in $strings return concat('/', $str, '/')"/>
string-join($strings, '')=<xsl:sequence select="string-join($strings,
'')"/>
for-each over strings: "<xsl:for-each select="$strings">
 <xsl:sequence select="concat('/',.,'/')"/>
</xsl:for-each>"
 </xsl:template>

</xsl:stylesheet>



Which produces this output using Saxon 9.5.1.2:

value-of $strings=one two three four
for $str in $strings return concat('/', $str, '/')=/one/ /two/ /three/ /four/
string-join($strings, '')=onetwothreefour
for-each over strings: "/one/ /two/ /three/ /four/"

The whitespace is being added as part of the process of constructing your
final result tree from a sequence of strings. The result tree is
constructed as a document node, following the rules of 5.7.1 Constructing
Complex Content

http://www.w3.org/TR/2009/PER-xslt20-20090421/#constructing-complex-content

or equivalently the rules applied by the Serializer

http://www.w3.org/TR/xslt-xquery-serialization/#serdm

The simplest way to avoid the space separation is to construct text nodes
rather than strings, which happens if you replace xsl:sequence by
xsl:value-of in

<xsl:sequence select="concat('/',.,'/')"/>
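
A minimal sketch of that replacement, assuming the same $strings sequence as
the test case above. The template name "test-value-of" is hypothetical and
not from the original stylesheet. Each xsl:value-of yields a text node rather
than a string, and adjacent text nodes are merged without a separator, so
under the 5.7.1 rules the expected result is /one//two//three//four/.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 version="2.0">

 <xsl:output method="text"/>

 <!-- Hypothetical template name, used only for this illustration -->
 <xsl:template name="test-value-of">
   <xsl:variable name="strings" select="('one', 'two', 'three', 'four')"/>
   <xsl:for-each select="$strings">
     <!-- value-of constructs a text node per iteration; -->
     <!-- no space is inserted between adjacent text nodes -->
     <xsl:value-of select="concat('/', ., '/')"/>
   </xsl:for-each>
 </xsl:template>

</xsl:stylesheet>
```

The whitespace-only text between the instructions is stripped from the
stylesheet, so it does not contribute to the output.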

Michael Kay
Saxonica


I see that the for-each result is consistent with the FLWOR expression.

Is my analysis correct that the only way to construct a string with no
extra whitespace using a loop is to use string-join() as in my test case?

For my DTD-generation application that would mean using the for-each loop
to construct a sequence of strings and then using string-join on the
sequence to avoid additional whitespace. Of course I can simply account
for the space inserted by the concatenation and get the correct
indentation and keep my code a bit simpler.
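
For comparison, a hedged sketch of two ways to emit a sequence of strings
with no separator in XSLT 2.0: string-join() with an empty separator, as in
the test case, and the separator attribute of xsl:value-of. The template name
"join-demo" and the variable name "parts" are hypothetical. Both instructions
should write /one//two//three//four/.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 version="2.0">

 <xsl:output method="text"/>

 <!-- Hypothetical template name, used only for this illustration -->
 <xsl:template name="join-demo">
   <xsl:variable name="strings" select="('one', 'two', 'three', 'four')"/>
   <!-- Build the sequence of strings first -->
   <xsl:variable name="parts"
     select="for $s in $strings return concat('/', $s, '/')"/>
   <!-- Option 1: join explicitly with an empty separator -->
   <xsl:value-of select="string-join($parts, '')"/>
   <xsl:text>&#10;</xsl:text>
   <!-- Option 2: override value-of's default single-space separator -->
   <xsl:value-of select="$parts" separator=""/>
 </xsl:template>

</xsl:stylesheet>
```

Either form keeps the separator choice explicit at the point where the text
is produced, instead of relying on the default space.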

Cheers,

Eliot


--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe@lists.mulberrytech.com>
--~--