xsl-list

Re: [xsl] Removing unwanted space

2021-06-04 07:40:23
Thanks Wendell, Joel, and Graydon! I will use your suggestions and see what I 
get and whether I can apply the lessons to other places I need to get rid of 
white space.

I am at least a little gratified that this is not an easy problem causing the 
bumps on my forehead.

Joel, to answer your question (incompletely), given

<p>
    <anchor> </anchor>
    The rain in <bold> <underline> Spain </underline> </bold> <italic> is </italic> wet.
</p>

I'd likely want

<p><anchor> </anchor> The rain in <bold> <underline> Spain </underline> </bold> <italic> is </italic> wet.</p>

That is, remove the leading and trailing spaces caused by indentation, and 
assume that every other space oddity between the first non-whitespace 
character and the last non-whitespace character in <p> is intentional. The 
tricky bit is the <anchor> element (space after, or no space after?), which 
luckily is not analogous to any structure I will face in the paragraph case, 
though I may face it when I get to tables (yay!). In tables I fear that some 
line breaks will be junk and others will have been used deliberately to get a 
particular rendering, and the two will be nearly impossible to tease apart.



From: Wendell Piez wapiez(_at_)wendellpiez(_dot_)com 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> 
Sent: Friday, June 4, 2021 7:36 AM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: Re: [xsl] Removing unwanted space


Hey Charles, 

A couple of techniques I use in this situation:

text()[. is ancestor::p/descendant::text()[1]] -  matches the first text node 
in a p, no matter how deep.
text()[. is ancestor::p/descendant::text()[last()]] - same for the end

text()[not(matches(.,'\S'))] - text that has no non-whitespace character

replace($str,'^\s+','') - strip *leading whitespace only* from a string.
replace($str,'\s+$','') - same for trailing whitespace

Et sim.

I am not sure I would use xsl:analyze-string here since as you observe it can 
be (um) pesky. I might do something as simple as

<xsl:template match="text()[. is ancestor::p/descendant::text()[1]]">
  <xsl:value-of select="replace(., '^\s+', '')"/>
</xsl:template>

But the match might have to be greedier if the inline markup is also deep, and 
this handles only the leading end of the paragraph.
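
The pieces above could be assembled into a complete stylesheet along the 
following lines. This is only a sketch, assuming an XSLT 2.0 (or later) 
processor, no xsl:strip-space in effect, and that everything other than the 
first and last text node of each p should be copied through unchanged. The 
identity template, the use of ancestor::p[1] (so the "is" comparison stays 
safe if p elements ever nest), and the priority="1" template for a p that 
contains a single text node are additions for illustration, not part of the 
techniques listed earlier.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template (assumed): copy whatever the templates below do not handle. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- First text node inside the nearest p, however deep: drop leading whitespace. -->
  <xsl:template match="text()[. is ancestor::p[1]/descendant::text()[1]]">
    <xsl:value-of select="replace(., '^\s+', '')"/>
  </xsl:template>

  <!-- Last text node inside the nearest p: drop trailing whitespace. -->
  <xsl:template match="text()[. is ancestor::p[1]/descendant::text()[last()]]">
    <xsl:value-of select="replace(., '\s+$', '')"/>
  </xsl:template>

  <!-- A text node that is both first and last would match both templates above;
       the explicit priority resolves that conflict and trims both ends. -->
  <xsl:template priority="1"
      match="text()[. is ancestor::p[1]/descendant::text()[1]]
                   [. is ancestor::p[1]/descendant::text()[last()]]">
    <xsl:value-of select="replace(., '^\s+|\s+$', '')"/>
  </xsl:template>

</xsl:stylesheet>
```

Run against the sample paragraph, this should empty the whitespace-only text 
node in front of <anchor> and trim the line break after "wet.", but the line 
break and indentation in front of "The rain" would survive, because they sit 
in the third text node of the p rather than the first. That is one case where 
the match would need to be greedier.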

This is not an easy problem since the (very smart) computer doesn't know the 
difference between "white space that matters" and "white space that doesn't 
matter". Indeed its whole notion of "white space" is somewhat problematic. So 
I'm not sure who's actually smarter. :-)

Cheers, Wendell
