xsl-list

RE: [xsl] User-defined function for linenumber

2007-08-01 01:40:00
This feels horrendously inefficient. Why not instead implement a SAX filter
that adds the line number as an extra attribute to every element?
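
A minimal sketch of such a filter, assuming Java and the standard
org.xml.sax API. The class name LineNumberFilter, the attribute name
"line" and the helper lines() are hypothetical, chosen only for
illustration. Note that the SAX Locator reports the position just
after the event, so for a start tag spanning several lines the value
is the line on which the tag ends.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical filter that adds a "line" attribute to every element. */
class LineNumberFilter extends XMLFilterImpl {
    private Locator locator;

    LineNumberFilter(XMLReader parent) {
        super(parent);
    }

    @Override
    public void setDocumentLocator(Locator locator) {
        // Keep the locator so startElement can ask for the current line.
        this.locator = locator;
        super.setDocumentLocator(locator);
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        // Copy the attributes and append the line number reported by the parser.
        // Assumption: the source document has no attribute named "line" already.
        AttributesImpl copy = new AttributesImpl(atts);
        if (locator != null) {
            copy.addAttribute("", "line", "line", "CDATA",
                    String.valueOf(locator.getLineNumber()));
        }
        super.startElement(uri, localName, qName, copy);
    }

    /** Demo helper: returns "name:line" for each element in the given XML. */
    static List<String> lines(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        LineNumberFilter filter = new LineNumberFilter(reader);
        final List<String> seen = new ArrayList<>();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                seen.add(qName + ":" + atts.getValue("line"));
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return seen;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(lines("<doc>\n<p>one</p>\n<p>two</p>\n</doc>"));
    }
}
```

In a real pipeline the filter would be used as the XMLReader of a
SAXSource handed to the XSLT processor, so the stylesheet could read
the line number with an ordinary @line attribute test.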

Michael Kay
http://www.saxonica.com/ 

-----Original Message-----
From: jesper(_dot_)tverskov(_at_)gmail(_dot_)com 
[mailto:jesper(_dot_)tverskov(_at_)gmail(_dot_)com] On Behalf Of Jesper 
Tverskov
Sent: 01 August 2007 09:11
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] User-defined function for linenumber

Hi list

I am trying to write a user-defined function that returns
the line number of a node (yes, I know Saxon has an extension
function that does the same). So far my solution works for
element nodes, and that is good enough for now.

But I am using the xsl:analyze-string element. I would like to
find a solution that does not use analyze-string, so that it
would also work when the expressions are modified and
transferred to Schematron. I am not sure whether that is
possible. Some clever regex, perhaps?

If the name of the element node in question does not also
occur as text inside comments, CDATA sections or PIs, I can
do without analyze-string. I use analyze-string only to
neutralize false positives, simply by deleting every "<"
found inside comments, CDATA sections and PIs.

Is it possible to do without analyze-string under all circumstances?

My function works like this:

I load the document as unparsed text and delete all "<"
from comments, CDATA sections and PIs to avoid false
positives. I then use the node name (e.g. "p") to split the
string, and make a new string of the items up to the node
number (e.g. the third "p"). I then count the characters,
delete all linefeeds, count again, and subtract to get the
count of linefeeds before the element node in question.

My function looks like this:

<xsl:function name="please:linenumber">
    <xsl:param name="document-uri"/><!-- similar to document-uri() -->
    <xsl:param name="node-name"/><!-- e.g.: 'p' -->
    <xsl:param name="node-number"/><!-- e.g.: 3, that is the third p -->
    <xsl:variable name="unparsed" select="unparsed-text($document-uri)"/>
    <!-- Copy of the text with every "<" deleted inside comments,
         CDATA sections and PIs, to avoid false positives. -->
    <xsl:variable name="unparsed2">
        <xsl:analyze-string select="$unparsed"
            regex="&lt;!--.*?--&gt;|&lt;!\[CDATA\[.*?\]\]&gt;|&lt;\?.*?\?&gt;"
            flags="s">
            <xsl:matching-substring>
                <xsl:value-of select="replace(., '&lt;', '')"/>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:value-of select="."/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:variable>
    <!-- Text that precedes the start tag in question. -->
    <xsl:variable name="before"
        select="string-join(subsequence(tokenize($unparsed2,
                concat('&lt;', $node-name)), 1, $node-number), ' ')"/>
    <!-- Linefeeds before the node: length with minus length without. -->
    <xsl:sequence
        select="string-length($before) -
                string-length(replace($before, '&#xA;', ''))"/>
</xsl:function>

Cheers
Jesper Tverskov
http://www.xmlplease.com

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--