xsl-list

Re: [xsl] Using node-set variables in predicates (another node comparison question)

2022-01-05 00:25:22
Hi Chris,

> I want to remove leading/trailing whitespace from certain DITA block elements. For example, I want to turn this:

I have been working with DITA and have found that there is *no need* to remove leading/trailing whitespace when publishing to PDF or HTML.

However, one output format does need the whitespace removed from the DITA input: Microsoft Word (.docx).

I have implemented this feature in the following stylesheets:

https://github.com/AntennaHouse/ah-wml/blob/master/com.antennahouse.wml/xsl/dita2wml_convmerged3.xsl
https://github.com/AntennaHouse/ah-wml/blob/master/com.antennahouse.wml/xsl/dita2wml_text_map.xsl

What is your use case for removing leading/trailing whitespace?

Regards,

On 2022/01/03 10:58, Chris Papademetrious christopher(_dot_)papademetrious(_at_)synopsys(_dot_)com wrote:
Hi Dimitre,



Just some feedback from a novice... For me, it would be difficult to remember 
this idiom for testing whether a node is in a sequence:



        exists(index-of($seq, $n, id-equal#2))



A one-word operator for this would be easier for me to remember:



        $n in $seq

        $n is $seq
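
Until such an operator exists, the test can be wrapped in a named function. 
The following is only a sketch: my:in and the urn:example:my namespace are 
hypothetical names, and the function assumes $n is a single node and $seq is a 
sequence of nodes. It relies on the fact that intersect compares nodes by 
identity, not by value.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:my">

  <!-- my:in (hypothetical helper): true if node $n is one of the nodes in $seq -->
  <xsl:function name="my:in" as="xs:boolean">
    <xsl:param name="n" as="node()"/>
    <xsl:param name="seq" as="node()*"/>
    <!-- intersect works on node identity, so the result is non-empty
         only when $n itself occurs in $seq -->
    <xsl:sequence select="exists($n intersect $seq)"/>
  </xsl:function>

</xsl:stylesheet>
```

A call would then read my:in($n, $seq). An equivalent expression that needs no 
helper is: some $m in $seq satisfies $m is $n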





Hi everyone (again),



I was able to use the [$n intersect $seq] trick again today! And I'm proud of 
how it turned out, so I wanted to share it with you.



I want to remove leading/trailing whitespace from certain DITA block elements. 
For example, I want to turn this:



        <p>   This is some text.</p>



into this:



        <p>This is some text.</p>



But there are two tricky aspects:



1. The leading/trailing whitespace could be buried in a lower-level inline 
element:



        <p>   Here is some text.</p>

        <p><b>   Here</b> is some text.</p>

        <p><b><i>   Here</i></b> is some text.</p>



so I need to match the first effectively rendered descendant text() node of 
these block elements.

        

2. Some DITA block elements allow other DITA block elements in them:



        <p>   This is a paragraph element.</p>

        <li>   This is a list element.</li>



        <li>

          <p>   This is a paragraph element in a list element.</p>

        </li>



so I need the sibling-adjacency check to stop at the lowest-level enclosing 
block element.



Here are the templates I came up with:





   <!-- look for leading/trailing text() nodes in these block elements -->

   <xsl:variable name="elements"
                 select="//(desc|dt|entry|glossterm|li|p|pre|shortdesc|title)"/>



   <!-- remove leading whitespace from leading text() nodes in block elements -->
   <xsl:template match="text()
                        [matches(., '^\s+')]
                        [ancestor::*[. intersect $elements][not(descendant::*[. intersect $elements])]]
                        [not(ancestor-or-self::node()
                          [ancestor::*[. intersect $elements][not(descendant::*[. intersect $elements])]]
                          [preceding-sibling::node()]
                         )]">
     <xsl:variable name="results">
       <xsl:next-match/>  <!-- apply other templates, if needed -->
     </xsl:variable>
     <xsl:value-of select="replace($results, '^\s+', '')"/>
   </xsl:template>



   <!-- remove trailing whitespace from trailing text() nodes in block elements -->
   <xsl:template match="text()
                        [matches(., '\s+$')]
                        [ancestor::*[. intersect $elements][not(descendant::*[. intersect $elements])]]
                        [not(ancestor-or-self::node()
                          [ancestor::*[. intersect $elements][not(descendant::*[. intersect $elements])]]
                          [following-sibling::node()]
                         )]">
     <xsl:variable name="results">
       <xsl:next-match/>  <!-- apply other templates, if needed -->
     </xsl:variable>
     <xsl:value-of select="replace($results, '\s+$', '')"/>
   </xsl:template>





Basically, it goes something like:



Find the text() node:



* That has leading/trailing whitespace

* That is within a block element that does not contain some other lower-level 
block element

* That has no preceding/following sibling, and whose ancestors up to (but not 
including) that block element have none either



The hardest part was figuring out how to get all ancestors up to the first 
block element, but not past that. The nesting of [descendant::*[...]] within 
[ancestor::*] is probably not the most performant way to do this, but it gets 
the job done.
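
For comparison, here is a rough sketch of another way to stop at the nearest 
enclosing block element, shown for the leading-whitespace case only. It is not 
equivalent to the templates above: it treats the first text() node inside the 
nearest block as "leading" even when an element or comment sibling precedes it, 
and it also acts on blocks that contain other blocks. The variable name $block 
and the identity template are assumptions added to make the example 
self-contained.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- same block-element list as in the templates above -->
  <xsl:variable name="elements"
                select="//(desc|dt|entry|glossterm|li|p|pre|shortdesc|title)"/>

  <!-- identity template (assumed), so everything else is copied unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="text()[matches(., '^\s+')]">
    <!-- $block (hypothetical name): the nearest enclosing block element;
         on the reverse ancestor axis, [1] selects the closest match -->
    <xsl:variable name="block"
                  select="ancestor::*[. intersect $elements][1]"/>
    <xsl:variable name="results">
      <xsl:next-match/>  <!-- apply other templates, if needed -->
    </xsl:variable>
    <xsl:choose>
      <!-- strip only when this is the first text() node inside $block;
           if $block is empty, the comparison yields an empty sequence (false) -->
      <xsl:when test=". is ($block//text())[1]">
        <xsl:value-of select="replace($results, '^\s+', '')"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$results"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Because the ancestor lookup happens once per matched text() node, this avoids 
the nested [descendant::*[...]] test, though I have not measured whether it is 
actually faster.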



And by using <xsl:next-match/>, the templates can work together to remove both 
leading and trailing whitespace from the same text() node, if needed.



  - Chris







--
/*--------------------------------------------------
 Toshihiko Makita
 Development Group. Antenna House, Inc. Ina Branch
 E-Mail tmakita(_at_)antenna(_dot_)co(_dot_)jp
 8077-1 Horikita Minamiminowa Vil. Kamiina Co.
 Nagano Pref. 399-4511 Japan
 Tel +81-265-76-9300 Fax +81-265-78-1668
 Web site:
 http://www.antenna.co.jp/
 http://www.antennahouse.com/
 --------------------------------------------------*/