RE: XSL pattern needed for begin/end elements

2004-07-07 14:42:40
Hi Tracy,

I haven't tried it with variations in your input XML, but you may want to use
an identity template approach, like this:

<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xlink="http://www.w3.org/1999/xlink"
exclude-result-prefixes="xlink"
version="1.0">

<xsl:output indent="yes"/>

<xsl:template match="/doc">
   <xsl:copy>
     <xsl:apply-templates select="@*|node()"/>
   </xsl:copy>
</xsl:template>

<xsl:template match="@*|node()">
   <xsl:copy>
     <xsl:apply-templates select="@*|node()"/>
   </xsl:copy>
</xsl:template>

<xsl:template match="text_run"/>
<xsl:template match="hyperlink_end"/>

<xsl:template match="hyperlink_begin">
  <hyperlink>
    <xsl:attribute name="xlink:href">
      <xsl:value-of
select="concat(locator_url/@protocol,'://',locator_url/@host_name)"/>
    </xsl:attribute>
    <xsl:value-of select="concat(following-sibling::text_run[1],' ')"/>
    <b><xsl:value-of select="following-sibling::text_run[2]"/></b>
  </hyperlink>
</xsl:template>

</xsl:stylesheet>

When applied to the input XML:

<doc>
  <hyperlink_begin id="111" end="222">
    <locator_url protocol="http" host_name="www.sf.net"/>
  </hyperlink_begin>
  <text_run>Click</text_run>
  <text_run emphasis="bold">here.</text_run>
  <hyperlink_end id="222" begin="111"/>
</doc> 

this produces

<?xml version="1.0" encoding="UTF-16"?>
<doc>
<hyperlink xlink:href="http://www.sf.net"
xmlns:xlink="http://www.w3.org/1999/xlink">Click <b>here.</b>
</hyperlink>
</doc>

The only problem I still see, other than the input variations, is that the
namespace declaration still appears on the output element <hyperlink>. I
tried to get rid of it using exclude-result-prefixes="xlink", but that didn't
help. Maybe someone else could comment on that one?

Anyway, I hope this helps you in some way. If not, I apologize, but it has
been a good exercise for me to try and solve anyway :-)

Cheers,
<prs/>

-----Original Message-----
From: Tracy Atteberry [mailto:Tracy(_dot_)Atteberry(_at_)stellent(_dot_)com] 
Sent: Wednesday, July 07, 2004 3:40 PM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: RE: [xsl] XSL pattern needed for begin/end elements

Mike,

Thanks for your suggestion.  Your template code is much cleaner than what I
had posted (so I used it as an example to clean up my own!) but
unfortunately the behavior remains the same.  That is, the text_run elements
between the hyperlink_(begin/end) elements are processed twice:
once inside the hyperlink, then again as templates continue to be applied.

So the output looks something like this:

<cod>
 <HyperLink xlink:href="http://www.sf.net">
   Click <b>here.</b>
 </HyperLink>
 Click <b>here.</b>
</cod>

How do we stop the intervening elements from being processed twice?

Thanks,
-Tracy

-----Original Message-----
From: Mike Trotman [mailto:mike(_dot_)trotman(_at_)datalucid(_dot_)com]
Sent: Wednesday, July 07, 2004 3:09 PM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: Re: [xsl] XSL pattern needed for begin/end elements


Tracy.

I haven't worked through this too carefully - but here is a pseudo-code 
method that might work in the sibling case.
It is based on the idea of selecting, for processing, all following
siblings whose next <hyperlink_end> element has an id matching the end
attribute of the current <hyperlink_begin>
(which looks like what you were intending).

<xsl:template match='hyperlink_begin'>
 <HyperLink
   xlink:href="{concat(locator_url/@protocol,'://',locator_url/@host_name)}">
  <xsl:apply-templates
    select='following-sibling::*[following-sibling::hyperlink_end[1][@id=current()/@end]]'
    mode='INLINK'/>
 </HyperLink>
</xsl:template>

<xsl:template match='text_run' mode='INLINK'>
<xsl:choose>
<xsl:when test='@emphasis="bold"'>
<b><xsl:value-of select='.'/></b>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select='.'/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>

There are other ways of doing the 'INLINK' mode processing, depending
on how complex it gets.
E.g. you could have separate templates matching 'text_run[@emphasis]'
etc.

You may need an additional template if you need to process non-sibling
intervening elements, e.g.

<xsl:template match='*' mode='INLINK'>
    <xsl:apply-templates/> <!-- or whatever else you want to do -->
</xsl:template>

I think something close to the above should work.

HTH.
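
A fuller sketch of the sibling case, for what it is worth. It combines the
moded template above with an empty default-mode template, so that elements
already emitted inside the link are not copied a second time. It is untested
and rests on several assumptions: XSLT 1.0, begin/end pairs that are siblings
and do not nest, and an identity template for everything else. The
suppression predicate in particular is an assumption of this sketch, not
something taken from Tracy's stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xlink="http://www.w3.org/1999/xlink">

  <xsl:output method="xml" indent="yes"/>

  <!-- Identity template: copy anything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Build the link from the following siblings whose next
       hyperlink_end carries the id named by this begin marker. -->
  <xsl:template match="hyperlink_begin">
    <HyperLink xlink:href="{concat(locator_url/@protocol, '://', locator_url/@host_name)}">
      <xsl:apply-templates mode="INLINK"
          select="following-sibling::*[following-sibling::hyperlink_end[1]/@id = current()/@end]"/>
    </HyperLink>
  </xsl:template>

  <!-- Default mode: drop elements that sit between a matching begin/end
       pair, because the hyperlink_begin template has already emitted them.
       Assumption: links do not nest. -->
  <xsl:template match="*[preceding-sibling::hyperlink_begin[1]/@end
                         = following-sibling::hyperlink_end[1]/@id]"/>

  <!-- The end marker itself produces no output. -->
  <xsl:template match="hyperlink_end"/>

  <!-- Inside a link: plain runs become text, bold runs are wrapped in b. -->
  <xsl:template match="text_run" mode="INLINK">
    <xsl:value-of select="."/>
  </xsl:template>

  <xsl:template match="text_run[@emphasis='bold']" mode="INLINK">
    <b><xsl:value-of select="."/></b>
  </xsl:template>

</xsl:stylesheet>
```

The `*[...]` pattern has a default priority of 0.5, which is higher than both
the identity template and the plain name tests, so no explicit priority
attribute should be needed. Spacing between adjacent runs is not handled here.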


Tracy Atteberry wrote:

Mike,

The current project is a demo for something that will eventually be
written in C/C++.  Then, as you say, we can walk the DOM tree and
maintain a separate context stack to help solve the problem.

For now, we can definitely assume that these elements are siblings.  In
fact, for most real source documents this will be the case.  Given that
assumption, I would love to know the not-too-difficult solution, as
this is my immediate problem.

As for the more general case, a hyperlink may in some cases overlap 
text runs.  For example:

<doc>
 <p>
   <text_run emphasis="bold">Click 
     <hyperlink_begin id="111" end="222">
       <locator_url protocol="http" host_name="www.sf.net"/>
     </hyperlink_begin>
     here
   </text_run>
   <text_run> to download.</text_run>
   <hyperlink_end id="222" begin="111"/>
 </p>
</doc>

In fact, hyperlinks can overlap paragraphs and other document elements 
though this is rarely seen in practice.

-Tracy


-----Original Message-----
From: Mike Trotman [mailto:mike(_dot_)trotman(_at_)datalucid(_dot_)com]
Sent: Wednesday, July 07, 2004 1:26 PM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: Re: [xsl] XSL pattern needed for begin/end elements


If the begin and end elements are siblings at the same level then the
problem is tractable and probably not too difficult to solve.

However if they can occur at different levels then this means that one
of them is enclosed inside an element that excludes the other (I
think).

Can you give any example of a case where the begin and end elements are
not siblings at the same level?
I ask because:
a) I can't picture how this would make sense given the information that
you require them to contain
b) If one of them does occur inside an element that excludes the other
   - what would you want to do with the excluded part of this element's
content / tree?
   - If you start closing all the parent elements etc. (and opening them
again to match the orphaned end tags)
   then you are destroying the structure and meaning of the XML data
which XSLT is designed to help preserve.

I.e. if they are not siblings at the same level then the XML data
'structure' is totally inappropriate for XSLT
and the 1st thing you should do is process it using something else.

I have documents like this - and I process them by walking the DOM tree
and maintaining a separate STACK of whatever I consider my current 
context to be.
(I am doing this to detect overlap between different document layers 
marked in exactly the way you describe.)

Tracy Atteberry wrote:

 

Hi all,

I'm looking for an XSL pattern to solve the problem of going from XML
that has separate begin and end elements to one that does not.

Please, please note that I do not control either the source or target
XML formats.  If I did, this would be much easier.

Source XML snip:

<doc>
<hyperlink_begin id="111" end="222">
  <locator_url protocol="http" host_name="www.sf.net"/>
</hyperlink_begin>
<text_run>Click</text_run>
<text_run emphasis="bold">here.</text_run>
<hyperlink_end id="222" begin="111"/>
</doc>

Target XML example:

<cod>
<HyperLink xlink:href="http://www.sf.net">
  Click <b>here.</b>
</HyperLink>
</cod>

In my case I can assume that associated begin and end hyperlink tags
will occur as siblings -- though generally this is not the case and in
fact, this is the reason the begin and end tags are unique elements.

I have a template that /almost/ works so feel free to let me know why
it fails OR suggest a completely different solution.

Current XSL template snip:

<xsl:template match="//hyperlink_begin">
  <xsl:variable name="linkUrl">
      <xsl:value-of select="locator_url/@protocol"/>
      <xsl:text>://</xsl:text>
      <xsl:value-of select="locator_url/@host_name"/>
  </xsl:variable>
  <xsl:variable name="endID" select="@end"/>
  <xsl:element name="HyperLink">
      <xsl:attribute name="xlink:href"><xsl:value-of
select="$linkUrl"/></xsl:attribute>
      <xsl:apply-templates select="(following-sibling::*) except
(following-sibling::hyperlink_end[@id=$endID]/following-sibling::*)"/>
  </xsl:element>
</xsl:template>

This produces the correct hyperlink but the template for text_run
elements gets called twice this way -- once inside the hyperlink, then
again as templates continue to be applied.

Any help would be greatly appreciated.  Thanks!

Tracy Atteberry

PS. I'm using Saxon 8

--+------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--+--


   


 


-- 
Datalucid Limited
8 Eileen Road
South Norwood
London SE25 5EJ
United Kingdom

tel :0208-239-6810
mob: 0794-725-9760
email: mike(_dot_)trotman(_at_)datalucid(_dot_)com

UK Co. Reg:   4383635
VAT Reg.:   798 7531 60
