RE: Re: How to match a element + part of an immediate text sibling?

At 04:59 AM 1/14/2004, Mike wrote:

...
Secondly, there is nothing in XSLT 1.0 that allows you to split a string
into its component words. You can do it yourself using a recursive
template (there are examples in my book XSLT Programmer's Reference), or
you can use a vendor- or third-party extension function xx:tokenize().

This may seem obvious and gratuitous (for which I apologize), but I hasten
to add that this is only a *particular* notion of what a "word" is (a
substring delimited by white space), which may not be robust enough for all
purposes. For example, if your input reads


<p> the quick <em>brown</em> fox, ears up, jumps </p>

you may want your output to read not

<p> the <em>quick brown fox,</em> ears up, jumps </p>

but

<p> the <em>quick brown fox</em>, ears up, jumps </p>

which will require a more sophisticated definition of the concept of a
"word", and which will not be so tractable using basic substringing around
whitespace (or a simple tokenize function either, FTM).
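
To make the difference concrete, here is a rough sketch of the two notions
of a "word". It is written in Python purely for illustration (it is not
XSLT, and both function names are made up for this example); in XSLT 1.0
the same logic would have to be spelled out in a recursive template.

```python
import re


def split_on_whitespace(text):
    """Naive notion of a word: any run of non-whitespace characters."""
    return text.split()


def split_words_and_punctuation(text):
    """Treat each punctuation mark as a token of its own, apart from words."""
    # \w+ matches a run of word characters; [^\w\s] matches one character
    # that is neither a word character nor whitespace (e.g. a comma).
    return re.findall(r"\w+|[^\w\s]", text)


sample = " fox, ears up, jumps "
print(split_on_whitespace(sample))          # ['fox,', 'ears', 'up,', 'jumps']
print(split_words_and_punctuation(sample))  # ['fox', ',', 'ears', 'up', ',', 'jumps']
```

With the first definition the comma travels with "fox" and ends up inside
the <em>; with the second it is a separate token and can be left outside.
Even the second is only an assumption about what counts as a word: it
would split "don't" or "real-world" into pieces, for instance.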

This kind of thing is not impossible to work around in most real-world
cases, but since XSLT 1.0 is not designed for up-conversion, it can get
pretty hairy.

But it all depends on the actual processing requirements for the data.
Caveat lector.


Cheers,
Wendell


======================================================================
Wendell Piez                            
mailto:wapiez(_at_)mulberrytech(_dot_)com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================


XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list