At 04:59 AM 1/14/2004, Mike wrote:
...
Secondly, there is nothing in XSLT 1.0 that allows you to split a string
into its component words. You can do it yourself using a recursive
template (there are examples in my book XSLT Programmer's Reference), or
you can use a vendor- or third-party extension function xx:tokenize().
This may seem obvious and gratuitous (for which I apologize), but I hasten
to add that this is only a *particular* notion of what a "word" is (a
substring delimited by white space), which may not be robust enough for all
purposes. For example, if your input reads
<p> the quick <em>brown</em> fox, ears up, jumps </p>
you may want your output to read not
<p> the <em>quick brown fox,</em> ears up, jumps </p>
but
<p> the <em>quick brown fox</em>, ears up, jumps </p>
which will require a more sophisticated definition of the concept of a
"word", and which will not be so tractable using basic substringing around
whitespace (or a simple tokenize function either, for that matter).
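To make the limitation concrete, here is a minimal sketch of the basic whitespace-splitting approach as a recursive named template in XSLT 1.0. It is only an illustration, not the code from the book mentioned above: the template name "tokenize", the "text" parameter, and the <words>/<word> output elements are all names made up for this example, and it assumes the sample <p> is the document element.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical driver: split the string value of each p.
       normalize-space() trims the ends and collapses runs of
       whitespace to single spaces, so the splitter only has to
       look for ' '. Taking the string value also discards the
       em markup inside the paragraph. -->
  <xsl:template match="p">
    <words>
      <xsl:call-template name="tokenize">
        <xsl:with-param name="text" select="normalize-space(.)"/>
      </xsl:call-template>
    </words>
  </xsl:template>

  <!-- Recursive splitter (illustrative name): emit everything
       before the first space as a word, then recurse on the rest. -->
  <xsl:template name="tokenize">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, ' ')">
        <word>
          <xsl:value-of select="substring-before($text, ' ')"/>
        </word>
        <xsl:call-template name="tokenize">
          <xsl:with-param name="text"
                          select="substring-after($text, ' ')"/>
        </xsl:call-template>
      </xsl:when>
      <!-- Last token; an empty string produces nothing -->
      <xsl:when test="string($text)">
        <word><xsl:value-of select="$text"/></word>
      </xsl:when>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Run against the sample paragraph above, this yields <word>fox,</word> and <word>up,</word> with the commas still attached, and no trace of where the <em> began or ended. Separating trailing punctuation, or tracking element boundaries, takes further logic on top of this.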
This kind of thing is not impossible to work around in most real-world
cases, but since XSLT 1.0 is not designed for up-conversion, it can get
pretty hairy.
But it all depends on the actual processing requirements for the data.
Caveat lector.
Cheers,
Wendell
======================================================================
Wendell Piez
mailto:wapiez(_at_)mulberrytech(_dot_)com
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Direct Phone: 301/315-9635
Suite 207 Phone: 301/315-9631
Rockville, MD 20850 Fax: 301/315-8285
----------------------------------------------------------------------
Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================