xsl-list

RE: Re: text() word lists

2004-02-09 04:27:37

Not that I understand it,
but ( and ) seem to be included in the words, Michael?
 <word>)   -   71</word>
 <word>(this   -   11</word>
  

Is the fix to update this line?
     for $w in tokenize(string(.), '[\s.?!,]+')[.] return 

for $w in tokenize(string(.), '[\s.?!, )(]+')[.] return 
seems to work.

I only spent five minutes on this: producing a decent natural language
tokenizer takes a little bit longer than that! Obviously it's easy to
write a more intelligent regex; I was only trying to illustrate the
principles.

Michael Kay
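
For readers following the thread, here is a minimal sketch of one way the separator regex could be widened. It is not the stylesheet from the earlier message, which is not quoted here. The surrounding template, the <words> wrapper element, and the lower-case() call are all assumptions made for illustration. Only the tokenize(...)[.] idiom and the <word> output shape come from the thread. The sketch splits on whitespace plus any Unicode punctuation (\p{P}) rather than listing characters one by one.

```xml
<?xml version="1.0"?>
<!-- Sketch only: counts words across all text nodes of the input. -->
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- <words> is a hypothetical wrapper element -->
    <words>
      <!-- Split on runs of whitespace or Unicode punctuation (\p{P}).
           The [.] predicate drops the zero-length tokens that
           tokenize() returns when the string starts or ends with a
           separator. lower-case() is an added assumption, so that
           "The" and "the" are counted together. -->
      <xsl:for-each-group
          select="for $t in //text()
                  return tokenize(lower-case($t), '[\s\p{P}]+')[.]"
          group-by=".">
        <!-- most frequent words first -->
        <xsl:sort select="count(current-group())"
                  data-type="number" order="descending"/>
        <word>
          <xsl:value-of select="current-grouping-key()"/>
          <xsl:text>   -   </xsl:text>
          <xsl:value-of select="count(current-group())"/>
        </word>
      </xsl:for-each-group>
    </words>
  </xsl:template>

</xsl:stylesheet>
```

The trade-off is that \p{P} also covers apostrophes and hyphens, so "don't" becomes "don" and "t". If that matters, an explicit character class such as '[\s.?!,()]+' may be the better choice.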


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


