xsl-list

RE: Problems with mixed content and inline elements when transforming XHTML into another XML format

2006-02-27 02:52:37

The function is-inline is generating a sequence with several elements.
I am wondering whether the way this is done is the most efficient. The
list is getting a little longer than I expected, and I may add a few
more elements before it is all said and done. Is there a way I can
declare the set of tags I want to be included rather than building up
this big conditional?

Very often with a long list like this, you find that the elements are all
members of the same substitution group in the schema. So with a schema-aware
processor, you can use the construct 

. instance of schema-element(inline-element)

to select them all.

This may or may not be useful in your situation; but it's just one example
of the ways that making stylesheets schema-aware can improve the robustness
of your code.
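
A minimal sketch of how that might look, not a tested implementation.
It assumes a schema-aware XSLT 2.0 processor, input that has been
validated against the schema, and a hypothetical schema file
xhtml-inline.xsd in which the inline elements all belong to the
substitution group of an element named inline-element. The file name,
the head element name and the f namespace are illustrative assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0"
    xpath-default-namespace="http://www.w3.org/1999/xhtml"
    xmlns:f="http://localhost"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <!-- Assumption: xhtml-inline.xsd is a hypothetical schema for the
         XHTML namespace in which u, b, i, strong, span, em, br, img,
         font and a are all members of the substitution group headed by
         an element named inline-element. -->
    <xsl:import-schema namespace="http://www.w3.org/1999/xhtml"
        schema-location="xhtml-inline.xsd"/>

    <xsl:function name="f:is-inline" as="xs:boolean">
        <xsl:param name="node" as="node()"/>
        <!-- True for non-blank text, and for any validated element in
             the inline-element substitution group. -->
        <xsl:sequence select="
            ($node instance of text() and normalize-space($node))
            or $node instance of schema-element(inline-element)"/>
    </xsl:function>
</xsl:stylesheet>
```

With this arrangement, adding an inline element would mean changing
the schema rather than the stylesheet.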

Michael Kay
http://www.saxonica.com/



For example: 

INSTEAD OF:
<xsl:sequence select="($node instance of text() and
normalize-space($node)) 
            or
$node[self::u|self::b|self::i|self::strong|self::span|self::em
|self::br|self::img|self::font|self::a]"/>

COULD I:
Declare a list of all tags I want considered...
<xsl:variable name="inlineElements"
select="u,b,i,strong,span,em,br,img,font,a"/>

<xsl:sequence select="($node instance of text() and
normalize-space($node)) 
            or inList($node, $inlineElements)"/>

I realize I just made up this mythical function "inList". I am trying
to make it easier to add to or subtract from the list, and maybe
improve the readability some. Is there anything close to this that
will perform well?

Thanks again for help.
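
One possible way to get close to the imagined inList, sketched under
the assumption of an XSLT 2.0 processor: keep the names in a global
variable holding a sequence of strings and compare local-name()
against it with the general comparison "=", which is true if any item
in the sequence matches. The variable name follows the question above.
The explicit XHTML namespace check and the f namespace URI are
assumptions carried over from the stylesheets in this thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0"
    xmlns:f="http://localhost"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <!-- Edit this list to add or remove inline element names. -->
    <xsl:variable name="inlineElements" as="xs:string*"
        select="('u','b','i','strong','span','em','br','img','font','a')"/>

    <xsl:function name="f:is-inline" as="xs:boolean">
        <xsl:param name="node" as="node()"/>
        <!-- Non-blank text is inline; an element is inline when it is
             in the XHTML namespace and its local name is in the list. -->
        <xsl:sequence select="
            ($node instance of text() and normalize-space($node))
            or ($node instance of element()
                and namespace-uri($node) = 'http://www.w3.org/1999/xhtml'
                and local-name($node) = $inlineElements)"/>
    </xsl:function>
</xsl:stylesheet>
```

Whether this performs better than the chain of self:: tests would need
measuring. The main gain is that the list lives in one place.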

----
In case anyone else is looking for the final script I have included it
here. I am guessing all of this was rather obvious to all of you. 

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0"
    xpath-default-namespace="http://www.w3.org/1999/xhtml"
    xmlns:f="http://localhost"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://localhost/markup.xsd">
    
    <xsl:output indent="yes" method="xml" encoding="UTF-8"
standalone="no"/>
    
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:for-each-group select="@*|node()"
group-adjacent="f:is-inline(.)">
                <xsl:choose>
                    <xsl:when test="current-grouping-key()">
                        <xsl:element name="my:textnode">
                            <!-- Copy the whole inline run. xsl:copy here
                                 would copy only the first node of the
                                 group, since that is the context item. -->
                            <xsl:copy-of select="current-group()"/>
                        </xsl:element>
                    </xsl:when>
                    <xsl:otherwise>                        
                        <xsl:apply-templates 
select="current-group()"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:for-each-group>
        </xsl:copy>    
    </xsl:template>
    
    <xsl:function name="f:is-inline" as="xs:boolean">
        <xsl:param name="node" as="node()"/>
        <xsl:sequence select="($node instance of text() and
normalize-space($node)) 
            or
$node[self::u|self::b|self::i|self::strong|self::span|self::em
|self::br|self::img|self::font|self::a]"/>
    </xsl:function>
</xsl:stylesheet>

This will wrap all elements that are in the select clause of the
sequence above in a tag called <my:textnode>.

--- Tony Kinnis <kinnist(_at_)yahoo(_dot_)com> wrote:

Hello all, 

My apologies in advance for reposting. I sent this question a few days
ago and didn't receive a response. Maybe it was simply overlooked or
even ignored. :) In case it is the former I am sending it again.

Thanks in advance for any help you can give. See below for the posting.

-- repost--<<

Sorry to keep asking about this problem but I am still having issues.
The change you mention below does remove the error, but now it never
hits the block of code that wraps the elements in the textnode. It
simply outputs the input verbatim. Stepping through it in a debugger
shows that it steps into the for-each-group statement, then right into
the otherwise clause, outputs the entire document, and then exits with
processing complete. It seems as though the otherwise clause is missing
a statement that causes it to recurse on the elements. I tried a couple
of different things with that but they were all wrong and didn't
produce the results I wanted.

See the previous message below for the input, stylesheet and desired
output, because I think something is missing here or I am not asking my
question correctly. For convenience here is the entire stylesheet
including the change you suggested. Once again thanks for your help.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0"
    xpath-default-namespace="http://www.w3.org/1999/xhtml"
    xmlns:f="http://whatever"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
    
    <xsl:template match="/">
        <xsl:copy>
            <xsl:for-each-group select="node()"
group-adjacent="f:is-inline(.)">
                <xsl:choose>
                    <xsl:when test="current-grouping-key()">
                        <textnode><xsl:copy-of
select="current-group()"/></textnode>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:copy-of select="current-group()"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:for-each-group>
        </xsl:copy>   
    </xsl:template>
    
    <xsl:function name="f:is-inline" as="xs:boolean">
        <xsl:param name="node" as="node()"/>
        <xsl:sequence select="$node instance of text() or

$node[self::u|self::b|self::i|self::strong|self::span|self::em
|self::br]"/>
    </xsl:function>
</xsl:stylesheet>

--- Michael Kay <mike(_at_)saxonica(_dot_)com> wrote:


I keep getting this error...

Description: A sequence of more than one item is not allowed as the
first argument of f:is-inline()
URL: http://www.w3.org/TR/xpath20/#ERRXPTY0004

Sorry, the code should have said group-adjacent="f:is-inline(.)"

Michael Kay
http://www.saxonica.com/


In case this matters, I am debugging this using the Oxygen editor for
the Mac. The processor I have selected is Saxon8B. Once again, help is
much appreciated.

To make this easier here is the full xsl doc, the input I am testing
and the desired output....

XSL document...
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0"
    xpath-default-namespace="http://www.w3.org/1999/xhtml"
    xmlns:f="http://whatever"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xsl:template match="/">
        <xsl:copy>
            <xsl:for-each-group select="node()"
                group-adjacent="f:is-inline(node())">
                <xsl:choose>
                    <xsl:when test="current-grouping-key()">
                        <textnode><xsl:copy-of
select="current-group()"/></textnode>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:copy-of 
select="current-group()"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:for-each-group>
        </xsl:copy>    
    </xsl:template>
    
    <xsl:function name="f:is-inline" as="xs:boolean">
        <xsl:param name="node" as="node()"/>
        <xsl:sequence select="$node instance of text() or
$node[self::u|self::b|self::i|self::strong|self::span|self::em
|self::br]"/>
    </xsl:function>
</xsl:stylesheet>

XHTML Document...

<?xml version="1.0" encoding="utf-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <meta name="generator" content="HTML Tidy, see
www.w3.org"/>
        <title>The Title Is</title>
    </head>
    <body>
        <ul id="bar">
            <li/>
            <li>foo<br/> after break <div/> after empty
div</li>
            <li>bar<strong>baz</strong></li>
        </ul>
        <ol>
            <li>Item 1</li>
            <li>Item 2</li>
        </ol>
        <p><span>foo</span><br/> asdf <b>bold another</b>
            and <strong>a strong item</strong>
        </p>
        <div>
            Content of a <b>div tag</b> here.
            <ul>
                <li>
                    Nested List Item 1
                </li>
                <li>
                    Nested List Item 2
                </li>
            </ul>
            Now list is done
        </div>
    </body>
</html>

Desired output...
<?xml version="1.0" encoding="utf-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <meta name="generator" content="HTML Tidy, see
www.w3.org"/>
        <title><textnode>The Title Is</textnode></title>
    </head>
    <body>
        <ul id="bar">
            <li/>
            <li><textnode>foo<br/> after break </textnode><div/><textnode> after empty div</textnode></li>
            <li><textnode>bar<strong>baz</strong></textnode></li>
        </ul>
        <ol>
            <li><textnode>Item 1</textnode></li>
            <li><textnode>Item 2</textnode></li>
        </ol>
        <p><textnode><span>foo</span><br/> asdf <b>bold
another</b>
            and <strong>a strong item</strong></textnode>
        </p>
        <div>
            <textnode>Content of a <b>div tag</b>
here.</textnode>
            <ul>
                <li>
                    <textnode>Nested List Item 1</textnode>
                </li>
                <li>
                    <textnode>Nested List Item 2</textnode>
                </li>
            </ul>
            <textnode>Now list is done</textnode>
        </div>
    </body>
</html>


=== message truncated ===



--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--




