Re: [xsl] normalize-space() except ...

On 02/10/2015 12:24 AM, Liam R E Quin liam(_at_)w3(_dot_)org wrote:

On Mon, 9 Feb 2015 20:21:50 -0000
"dvint(_at_)dvint(_dot_)com" 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

<dd><p>
         This is my text
with <i>italics content</i> with other text.
</p></dd>

My output is coming out like this:
<ss:Data>This is my text with<ss:font italics="yes">italics
content</ss:font>.</ss:Data>


I'd probably do this in two steps -
(1) match text() and turn one or more whitespace characters into a space,
    probably using replace()
(2) strip leading space from the first text() in p, and trailing space from 
the last;
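
A minimal sketch of those two steps, assuming XSLT 2.0 (needed for
replace()) and an identity transform standing in for whatever real mapping
produces the ss:Data output. The variable names are made up for
illustration:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copies anything not handled below.
       Stand-in only; a real stylesheet would map to its own output. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Every text node at any depth inside a p. -->
  <xsl:template match="p//text()">
    <!-- Step 1: turn each run of whitespace into a single space. -->
    <xsl:variable name="collapsed" select="replace(., '\s+', ' ')"/>

    <!-- Step 2: find out whether this is the first or last text node
         of the nearest enclosing p. -->
    <xsl:variable name="para" select="ancestor::p[1]"/>
    <xsl:variable name="is-first" select=". is ($para//text())[1]"/>
    <xsl:variable name="is-last" select=". is ($para//text())[last()]"/>

    <!-- Drop the leading space only on the first text node... -->
    <xsl:variable name="left-trimmed"
        select="if ($is-first) then replace($collapsed, '^ ', '')
                else $collapsed"/>

    <!-- ...and the trailing space only on the last one. -->
    <xsl:value-of
        select="if ($is-last) then replace($left-trimmed, ' $', '')
                else $left-trimmed"/>
  </xsl:template>

</xsl:stylesheet>
```

On the sample above this should keep the space on either side of the i
element while removing the leading newline and indentation, which is
exactly what normalize-space() applied per text node gets wrong.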


I do almost exactly this in several applications. I think it's fairly
common.

    watch for
    <p>The man wore<i> black </i>socks</p>
    which is not unlikely in XML made from word processing software.


Slightly more common would be <p>The man wore <i>black </i>socks</p>
where a double-click highlight in the WP software included the trailing
space on the word (someone just told me Word has just stopped doing
this: can anyone confirm?).

More pernicious is the erroneous elision of white-space-only nodes in
mixed content:

<p>The man wore <b>black socks</b> <i>only</i> on Tuesdays.</p>

resulting in "The man wore black socksonly on Tuesdays." due to a faulty
xsl:strip-space (white-space-only nodes between subelements in mixed
content should probably never be removed, which is sometimes hard to
explain to people unaccustomed to document-class XML).
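
A sketch of how this typically goes wrong and one way to guard against it.
The element list is an assumption for this example; a real vocabulary would
name all of its mixed-content elements:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- On its own, this removes every whitespace-only text node,
       including the single space between </b> and <i> inside p. -->
  <xsl:strip-space elements="*"/>

  <!-- A named element outranks the * wildcard, so whitespace-only
       text nodes whose parent is one of these elements are kept.
       The list "p b i" is illustrative only. -->
  <xsl:preserve-space elements="p b i"/>

  <!-- Identity template, used here only so the effect is visible. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Whether a whitespace-only node is stripped depends on its parent element,
so it is the container of the mixed content (p here) that has to be listed,
not the inline elements on either side of the space.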

///Peter