xsl-list

Re: [xsl] Fwd: text nodes

2008-09-18 16:00:04
Hello Wendell,

first of all, thank you for your response.

Second:

THANK YOU FOR YOUR RESPONSE! =)

you read my mind. Everything is working great right now!

Regards,

Lucas.

On Thu, Sep 18, 2008 at 11:26 AM, Wendell Piez <wapiez(_at_)mulberrytech(_dot_)com> wrote:
Lucas,

The problem you are looking at is actually a variant of a grouping problem.
Processing all the nodes up to a particular node amounts to grouping the
nodes into several "before" and "after" groups.

Grouping in general, and this sort of grouping in particular (called
"positional grouping") are a well-known weak spot in XSLT 1.0. Accordingly,
if you can use XSLT 2.0, you will have much better and much easier solutions
available.
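
As a rough illustration of the XSLT 2.0 route (a sketch only, not code from this thread), positional grouping can be written with xsl:for-each-group and group-starting-with. It assumes the 'some_element' and 'a' names from the sample input quoted later in this message, plain text output, and that only text-node siblings are wanted; the variable name 'run' is made up for the example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="some_element">
    <!-- every 'a' child starts a new group; any nodes before the
         first 'a' form a leading group of their own -->
    <xsl:for-each-group select="node()" group-starting-with="a">
      <!-- keep only the text-node siblings in the group (this skips the
           'a' itself and elements such as 'br'), join them, and
           collapse runs of whitespace -->
      <xsl:variable name="run"
        select="normalize-space(string-join(current-group()[self::text()], ' '))"/>
      <!-- skip groups that contain nothing but whitespace -->
      <xsl:if test="$run">
        <xsl:value-of select="$run"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

This needs an XSLT 2.0 processor; the idea is one output line per run of text between stop elements.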

If you must use XSLT 1.0, however, there are known methods. The two best
methods are probably sibling recursion and key-based association. In sibling
recursion, basically what you do is shift your processor (using template
modes for this) out of its normal pattern of selecting and processing
(applying templates) all children, and instead process only the first child,
which processes the next, which processes the next, etc. This gives you a
way to introduce stop and restart conditions into the processing.
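
A minimal sketch of sibling recursion in XSLT 1.0 might look like the following. It is an illustration under assumptions rather than tested code from this thread: the mode name 'walk' and the variable name 'run' are invented, the element names come from the sample input quoted later in this message, and text before the first 'a' is ignored.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="some_element">
    <xsl:apply-templates select="a"/>
  </xsl:template>

  <!-- each 'a' starts a walk along its following siblings and
       collects whatever text the walk emits -->
  <xsl:template match="a">
    <xsl:variable name="run">
      <xsl:apply-templates select="following-sibling::node()[1]" mode="walk"/>
    </xsl:variable>
    <xsl:value-of select="normalize-space($run)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- text step: emit the text, then hand over to the next sibling -->
  <xsl:template match="text()" mode="walk">
    <xsl:value-of select="."/>
    <xsl:apply-templates select="following-sibling::node()[1]" mode="walk"/>
  </xsl:template>

  <!-- other nodes (for example 'br'): emit nothing, keep walking -->
  <xsl:template match="*|comment()|processing-instruction()" mode="walk">
    <xsl:apply-templates select="following-sibling::node()[1]" mode="walk"/>
  </xsl:template>

  <!-- stop condition: the next 'a' ends the walk; this template wins
       over the '*' template because a name test has higher priority -->
  <xsl:template match="a" mode="walk"/>

</xsl:stylesheet>
```

Because each node only ever selects its immediate following sibling, the stop condition is simply a template that does not pass control on.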

In key-based association, you basically associate the nodes, typically using
a key (this makes it easier), with the node you want to stop on, and then
use the key to retrieve them. This is essentially an optimization of the
method that Sam has suggested (his logic does the same thing without the
key).

I think this method may be slightly easier for you. It would look something
like:

(Sam's code, for comparison)

<xsl:template match="a">
 <xsl:variable name="next_a"
   select="generate-id(following-sibling::a[1])"/>

 <xsl:for-each select="following-sibling::text()[ generate-id(
   following-sibling::a[1] ) = $next_a ]">
   <xsl:value-of select="."/>
 </xsl:for-each>

</xsl:template>

<xsl:template match="some_element">
 <xsl:apply-templates select="a"/>
</xsl:template>

<xsl:key name="nodes-by-last-stop" match="some_element/node()[not(self::a)]"
 use="generate-id((preceding-sibling::a|parent::*)[last()])"/>
<!-- using this key, each child node of 'some_element' other than the
    'a' elements themselves can be retrieved using the system-generated
    ID of the last preceding sibling 'a' element, or of its parent (for
    those that have no preceding-sibling::a); the 'a' elements are left
    out of the key so that they are not processed twice -->

<xsl:template match="a">
 <!-- when 'a' is matched, nothing is done with it, but the elements
      associated by the key with its generated ID are processed -->
 <xsl:apply-templates select="key('nodes-by-last-stop',generate-id())"/>
</xsl:template>

<xsl:template match="some_element">
 <!-- when an element requiring splitting is matched, its own associated
      elements are processed (these are children that have no 'a' preceding
      them), then its 'a' children are processed -->
 <xsl:apply-templates select="key('nodes-by-last-stop',generate-id())"/>
 <xsl:apply-templates select="a"/>
</xsl:template>

If you research how keys work, you will find this does exactly the same
thing as Sam's logic, only more concisely and more comprehensively (since it
doesn't drop elements before the first 'a').

This can be extended to include 'lb' elements among the "stop" elements as
follows:

<xsl:key name="nodes-by-last-stop"
 match="some_element/node()[not(self::a or self::lb)]"
 use="generate-id((parent::*|preceding-sibling::a|preceding-sibling::lb)[last()])"/>

<xsl:template match="a | lb">
 <xsl:apply-templates select="key('nodes-by-last-stop',generate-id())"/>
</xsl:template>

<xsl:template match="some_element">
 <xsl:apply-templates select="key('nodes-by-last-stop',generate-id())"/>
 <xsl:apply-templates select="a | lb"/>
</xsl:template>

Note: this code is untested, although the algorithm isn't.

Good luck (and find a way to use XSLT 2.0!),
Wendell

At 07:35 AM 9/18/2008, you wrote:

Thank you Sam!

i meant by ' stop on the first occurrence on the "[ ]" ' this:

for this input:

<some_element>
<a>hello</a> some text 1<br/>
some text 2
some text 3
<a>hello 2</a> some text 4
some text 5
some text 6
</some_element>

i want this output:

"some text 1 some text 2 some text 3"  and
"some text 4 some text 5 some text 6"

I am not sure how to deal with <br/> elements. I'm using XSL to
extract data from HTML.
I can't get the XSL you sent me to work. If this information helps you to
help me, let me know.

best regards!

L.

On Wed, Sep 17, 2008 at 10:52 AM, Sam Byland <shbyland(_at_)verizon(_dot_)net> wrote:
the output is:

"some text 1 some text 2 some text 3"

Lucas,

I used:

<some_element>
<a>hello</a> some text 1<br/>
some text 2
some text 3
<a>hello 2</a>
</some_element>

for the input.  Assuming you only want to output the text up to the next
<a> element, then something like this (XSLT1.0) might get you in the right
direction:

<xsl:template match="a">

  <xsl:variable name="next_a"
    select="generate-id(following-sibling::a[1])"/>

  <xsl:for-each select="following-sibling::text()[ generate-id(
    following-sibling::a[1] ) = $next_a ]">
      <xsl:value-of select="."/>
  </xsl:for-each>

</xsl:template>

<xsl:template match="some_element">
  <xsl:apply-templates select="a"/>
</xsl:template>

I'm not totally sure what you meant by ' stop on the first occurrence on
the "[ ]" '

...sam



======================================================================
Wendell Piez                     mailto:wapiez(_at_)mulberrytech(_dot_)com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
 Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================


--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--





-- 
Ing. Lucas Lain
