xsl-list

RE: Mistake in tokenizing under Saxon 8.2

2005-01-21 04:49:17

I am also wondering whether I am able to 'look forward' when tokenizing
and perform a different operation depending on what the _next_ token
will be?

You can do something like

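<xsl:variable name="tokens" select="tokenize(...)"/>
<xsl:for-each select="$tokens">
   <xsl:variable name="p" select="position()"/>
   <!-- $tokens[$p + 1] is the next token; it is empty on the last one -->
   <xsl:if test="$tokens[$p + 1] = 'xyz'">
      ...
   </xsl:if>
</xsl:for-each>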
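
A fuller version of this look-ahead idea might look like the sketch below. It is only an illustration: the template name "main", the sample input string and the literal 'xyz' are assumptions, not part of the original stylesheet. It is meant to be run with "main" as the named initial template.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical entry point: invoke with "main" as the initial template -->
  <xsl:template name="main">
    <!-- The sample string is an assumption made for illustration -->
    <xsl:variable name="tokens" select="tokenize('a b xyz c', '\s+')"/>
    <xsl:for-each select="$tokens">
      <!-- Capture position() here: inside the predicate below it would
           refer to the predicate's own context, not the for-each -->
      <xsl:variable name="p" select="position()"/>
      <!-- On the last token $tokens[$p + 1] is the empty sequence,
           so the comparison is simply false -->
      <xsl:if test="$tokens[$p + 1] = 'xyz'">
        <xsl:value-of select="concat(., ' precedes xyz')"/>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the sample string above, only the token "b" is followed by "xyz", so that is the only token reported.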

Michael Kay
http://www.saxonica.com/


Cheerio,
Nic.

mike(_at_)saxonica(_dot_)com 21/01/2005 10:40:49 >>>
The spaces that you get in your output are not being copied from the
input, they are being generated by virtue of the rule that a single space
is inserted between adjacent atomic values delivered in the result of a
content constructor. This space isn't inserted between a string and a
node, only between two strings.

(The reason for this rule is primarily for the case where you are
generating list-valued simple content, e.g. ("red", "blue", "green").
It's less satisfactory when generating complex content. It means that you
need to understand the rather subtle distinction between a string and a
text node: <xsl:value-of select="'a'"/> outputs a text node, while
<xsl:sequence select="'a'"/> outputs a string. <xsl:copy-of/> produces
whatever it is given, which in this case is a string. Text nodes are
output without generating separator spaces.)
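
The distinction can be seen side by side in a small sketch. The element names (demo, strings, text-nodes, mixed) and the template name "main" are made up for illustration; run it with "main" as the named initial template.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Hypothetical entry point: invoke with "main" as the initial template -->
  <xsl:template name="main">
    <demo>
      <!-- Two adjacent atomic strings: a separator space is inserted -->
      <strings><xsl:sequence select="'a'"/><xsl:sequence select="'b'"/></strings>
      <!-- Two adjacent text nodes: merged with no separator -->
      <text-nodes><xsl:value-of select="'a'"/><xsl:value-of select="'b'"/></text-nodes>
      <!-- String, element, string: no space next to the element node -->
      <mixed><xsl:sequence select="'a'"/><x/><xsl:sequence select="'b'"/></mixed>
    </demo>
  </xsl:template>
</xsl:stylesheet>
```

Under the rule described above, the content of strings should come out as "a b", text-nodes as "ab", and mixed as "a", an empty x element, then "b" with no spaces around the element. The last case corresponds to what happens around the <a> element in the original stylesheet.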

I think that if you output a text node containing a single space either
side of the <a> element, you will get the required effect. You can do
this with
<xsl:text> </xsl:text> 
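
One way to apply this is sketched below. It is a variation on the suggestion rather than the exact fix: instead of placing a space either side of the <a> element only, it writes every token as a text node and emits an explicit single-space text node before each token except the first, so no automatic separators are involved. The template name "main" and the hard-coded input string are assumptions; the group order follows the [link,alt,link_text] example from the question.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Hypothetical entry point: invoke with "main" as the initial template -->
  <xsl:template name="main">
    <p>
      <xsl:for-each
          select="tokenize('text text [link,alt,link_text] text', '\s+')">
        <!-- Explicit separator as a text node before all but the first token -->
        <xsl:if test="position() gt 1">
          <xsl:text> </xsl:text>
        </xsl:if>
        <xsl:analyze-string select="." regex="\[(.*),(.*),(.*)\]">
          <xsl:matching-substring>
            <a href="{regex-group(1)}" alt="{regex-group(2)}">
              <xsl:value-of select="replace(regex-group(3), '_', ' ')"/>
            </a>
          </xsl:matching-substring>
          <xsl:non-matching-substring>
            <!-- value-of creates a text node, so no separator is added -->
            <xsl:value-of select="."/>
          </xsl:non-matching-substring>
        </xsl:analyze-string>
      </xsl:for-each>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

Because every token becomes either a text node or an element, the only spaces in the result are the ones written explicitly, which also avoids doubled spaces when two links are adjacent.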

Michael Kay
http://www.saxonica.com/ 

-----Original Message-----
From: Nicholas Hemley 
[mailto:Nicholas(_dot_)Hemley(_at_)lhb(_dot_)scot(_dot_)nhs(_dot_)uk] 
Sent: 21 January 2005 09:50
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com 
Subject: [xsl] Mistake in tokenizing under Saxon 8.2

Hello,

I presume that I have made a mistake somewhere in the stylesheet when
using the tokenize function under Saxon 8.2 - for some reason I am
losing the whitespace chars around the matched regular expression.

For example, the following pattern:
text text [link,alt,link_text] text

should be transformed to:

text text <a href="link" alt="alt">link text</a> text

BUT

I am losing the whitespace characters around the <a> as follows:

text text<a href="link" alt="alt">link text</a>text
         ^                                     ^
Why is this please? All the other whitespace chars are copied OK, even
though I am tokenising on whitespace.

If I use a &nbsp; in the stylesheet to compensate for the loss, it adds
two spaces, not one, which is weird, so this is not currently a viable
solution.

Any input appreciated!

Many thanks,
Nic.

Appendix: Stylesheet Snippet

  <xsl:template match="/html/body/P|p">
    <!-- copy node plus select contents -->
    <xsl:copy>
      <xsl:variable name="tokens" select="tokenize(.,'\s+')"/>

      <xsl:for-each select="$tokens">
        <xsl:choose>
          <xsl:when test='matches(.,"\[(.*),(.*),(.*)\]")'>
            <xsl:variable name="elValue" select="."/>
            <xsl:analyze-string select="$elValue"
                                regex="\[(.*),(.*),(.*)\]">
              <xsl:matching-substring>
                <!-- groups follow the [link,alt,link_text] order above -->
                <a href="{regex-group(1)}">
                  <xsl:attribute name='alt'>
                    <xsl:value-of select='replace(regex-group(2), "_"," ")'/>
                  </xsl:attribute>
                  <xsl:value-of select='replace(regex-group(3), "_"," ")'/>
                </a>
              </xsl:matching-substring>
            </xsl:analyze-string>
          </xsl:when>
          <xsl:otherwise>
            <!-- copies the token as an atomic string, not a text node -->
            <xsl:copy-of select="."/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>




**********************************************************************
The information contained in this message may be confidential 
or legally privileged and is intended for the addressee only, 
If you have received this message in error or there are any 
problems please notify the originator immediately. The 
unauthorised use, disclosure, copying or alteration of this 
message is strictly forbidden.

**********************************************************************



--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/ 
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--



