Hello all,
I have been trying to solve this problem for a few days now and I have
had no luck. I am hoping someone here can help me out with this.
I need to parse XHTML and transform it into another XML format. The
XHTML is valid and well formed (I run it through HTML Tidy first). The
first problem I ran into was mixed content, that is, elements that
contain both text and child elements. Something like...
<div>
My name is <b>bob</b>. What is yours?
<ul>
<li>foo</li>
<li>bar</li>
</ul>
</div>
I found a utility script on the web that can turn mixed content into
element content. I am guessing some of you have seen this script
before.
<xsl:template match="text()[normalize-space(.)][../*]">
<xsl:element name="textnode">
<xsl:value-of select="."/>
</xsl:element>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
This makes the above fragment look like...
<div>
<textnode>My name is </textnode><b>bob</b><textnode>. What is
yours?</textnode>
<ul>
<li>foo</li>
<li>bar</li>
</ul>
</div>
However, what I would really like to do is have the bold tags included
inside of the textnode tag so that it looks like...
<div>
<textnode>My name is <b>bob</b>. What is yours?</textnode>
<ul>
<li>foo</li>
<li>bar</li>
</ul>
</div>
In other words, I would like to treat the <b> element as part of the
text rather than as a separate element. There is a finite set of tags I
want treated this way, the ones HTML considers in-line elements:
<b><i><em><strong><u>
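For reference, here is a rough, untested sketch of the kind of thing that
might do this in XSLT 2.0, using xsl:for-each-group with group-adjacent so
that runs of text and in-line elements end up in one textnode. It assumes
the elements are in no namespace, as in the fragment above; for real XHTML
the stylesheet would probably need
xpath-default-namespace="http://www.w3.org/1999/xhtml". It also only wraps
content of elements that have both non-whitespace text and child elements,
which is my reading of the desired output, not a confirmed requirement.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy anything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Elements with mixed content: at least one non-whitespace text
       child and at least one child element. -->
  <xsl:template match="*[text()[normalize-space(.)]][*]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- Group adjacent children by whether they are "in-line":
           text nodes or one of b, i, em, strong, u.
           Assumption: elements are in no namespace. -->
      <xsl:for-each-group select="node()"
          group-adjacent="self::text() or self::b or self::i
                          or self::em or self::strong or self::u">
        <xsl:choose>
          <!-- An in-line run that is not just whitespace: wrap it. -->
          <xsl:when test="current-grouping-key()
                          and normalize-space(string-join(current-group(), ''))">
            <textnode>
              <xsl:copy-of select="current-group()"/>
            </textnode>
          </xsl:when>
          <!-- Everything else (ul, whitespace-only text, ...) is
               processed as usual. -->
          <xsl:otherwise>
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

If the grouping behaves as intended, the "My name is ... What is yours?"
run, including the <b> element, should come out inside a single textnode
while the <ul> is left alone, give or take whitespace. Note that this
sketch replaces the text() template above rather than working alongside it.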
An alternative, and better, solution would be to wrap all text throughout
the document in a textnode element, together with the in-line elements
mentioned above. The XML I finally output from the XHTML requires all
text, in-line elements included, to be wrapped in a special displaytext
tag. With every piece of text in a textnode, I could easily pass the
document through another template that says...
<xsl:template match="textnode[normalize-space(.)]">
<xsl:element name="displaytext">
<xsl:apply-templates/>
</xsl:element>
</xsl:template>
This would make things much easier.
Below are the XSL processor and XSL version. I am not tied to Saxon if
another processor can do the job, provided it can be used from Java
and is portable across platforms (Windows, Unix, etc.).
Processor: Saxon8B
XSL Version: 2.0
Thanks in advance for your help.