xsl-list
Re: Can someone help me understand why this isn't working?

2005-01-19 17:33:23
Hey Luke,

At 06:08 PM 1/19/2005, you wrote:
I have a basic example I am trying to get working to convert &lt;P&gt; to
<P>.

Is this by design? Do you have the alternative of getting a different form of input?

Here is the XML:

<?xml version="1.0" encoding="iso-8859-1"?>
<?xml-stylesheet type="text/xsl" href="hello.xsl"?>
<greeting>&lt;P&gt;Hello, world!&lt;/P&gt;</greeting>

Here is the XSL:

<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
<xsl:template match="/">
<html>
<head>
<title>Today's greeting</title>
</head>
<body>
<xsl:apply-templates select="greeting"/>
</body>
</html>
</xsl:template>
<xsl:template match="P">
<em><xsl:apply-templates/></em>
</xsl:template>
</xsl:stylesheet>

I am guessing that the value of greeting selected in the apply-templates
call contains &lt;P&gt;Hello, world!&lt;/P&gt;

The string value of the greeting element in the source (and likewise of its single text node child) is, yes, the sequence of characters you are showing.

(Keep in mind that in that sequence &lt; is <, and so on: in the tree, the entity references have already been resolved. That is, the string value is actually "<P>Hello, world!</P>", because that is what your input contained once it was parsed into the tree.)

This is a far cry from a P element node. The only element is the greeting element, whose string value this is. No template you can write will pick up an element that doesn't exist. :->
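
If you want to see this for yourself, here is a small diagnostic sketch (not part of your stylesheet, just an illustration) that you could run against your input. It reports how many element children and text node children greeting has, and then writes out its string value using the text output method:

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- text output, so the string value is written without escaping -->
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- element children of greeting: there are none to match -->
    <xsl:text>element children of greeting: </xsl:text>
    <xsl:value-of select="count(greeting/*)"/>
    <!-- text node children of greeting: the one node that does exist -->
    <xsl:text>&#10;text node children of greeting: </xsl:text>
    <xsl:value-of select="count(greeting/text())"/>
    <!-- the string value, as plain characters -->
    <xsl:text>&#10;string value of greeting: </xsl:text>
    <xsl:value-of select="greeting"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Against the XML you posted, this should report zero element children and one text node child, which is why match="P" never fires.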

I was hoping because the output type was HTML a result tree containing
<p>Hello, world!</p> would be created that my match="P" would process (my
apologies if my terminology is not correct, still getting up to speed with
this).

The terminology is fine, but alas XSLT is not designed to work this way. The processor does not re-parse strings on the fly into more nodes in the tree. (Presumably you hide the markup with entities because you don't want it processed.) There is nothing here to be matched: the apply-templates will only find the text node. Handling it requires some kind of second parse.

There seems to be an emerging industry in handling such "pseudo-XML". Generally this practice is frowned upon in polite circles, because it violates the spirit of XSLT node-think, tempts one to the heresy of tag-writing (much as I might indulge in that vice in private, it's not something I'd announce to the world), and is just generally nasty work. (Expect things to break frequently whenever things fail to parse, which can be often in such uncontrolled environments.) Stay clean, we urge: process nodes and don't try to reparse strings.

If you have to do it, there are generally two approaches:

1. Use an XSLT processor that supports disable-output-escaping, and use it to write this content directly to files, where you can try to parse it (and succeed if you are lucky). If you are smart you might even do useful transformations at that stage. But you will probably have to clean up by hand.

2. Use an extension function such as Saxon's saxon:parse(), and try to parse those strings in place.
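
As a rough sketch of the second approach, assuming a Saxon release that provides the saxon:parse() extension function: the namespace URI below is the one used by the Saxon 8.x line, and both the URI and the availability of the function vary by version and edition, so treat them as assumptions and check your processor's documentation. The greeting template is my addition; the rest is your stylesheet.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">
  <xsl:output method="html"/>
  <xsl:template match="/">
    <html>
      <head>
        <title>Today's greeting</title>
      </head>
      <body>
        <xsl:apply-templates select="greeting"/>
      </body>
    </html>
  </xsl:template>
  <!-- Added for illustration: parse the string value of greeting into
       a temporary tree and apply templates to its top-level element.
       This fails if the string is not a well-formed XML document. -->
  <xsl:template match="greeting">
    <xsl:apply-templates select="saxon:parse(string(.))/*"/>
  </xsl:template>
  <!-- Now there is a real P element node for this to match -->
  <xsl:template match="P">
    <em><xsl:apply-templates/></em>
  </xsl:template>
</xsl:stylesheet>
```

This works here only because the string happens to be a single well-formed element. If the escaped content could have several top-level elements or stray text, you would have to wrap it in a container element before parsing, and anything that is not well-formed will still break the transformation.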

Am I thinking about this the wrong way? Any advice would help.

Optimally, you'd probably avoid pseudo-markup and find a way to enjoy the full benefits of XML.

If that is not an option, work the data over in two passes (or more) as described above.
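
A minimal sketch of the two-pass route using disable-output-escaping might look like the following. It is a first-pass stylesheet (call it unescape.xsl, a name I am making up) that rewrites greeting with its content written out unescaped. It assumes your processor honors disable-output-escaping, which is optional in XSLT 1.0 and only takes effect when the processor itself serializes the result.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="iso-8859-1"/>
  <!-- Pass 1: copy greeting, writing its string value as raw markup.
       Nothing checks that the result is well-formed; that is only
       discovered when the output file is parsed in pass 2. -->
  <xsl:template match="greeting">
    <greeting>
      <xsl:value-of select="." disable-output-escaping="yes"/>
    </greeting>
  </xsl:template>
</xsl:stylesheet>
```

If that goes well, the output file contains a greeting element with a real P element inside it, and your original hello.xsl can be run over that file as the second pass, where match="P" has something to match.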

Good luck,
Wendell


    "Thus I make my own use of the telegraph, without consulting
     the directors, like the sparrows, which I perceive use it
     extensively for a perch." -- Thoreau
