xsl-list

RE: [xsl] Unicode character blocks in strings

2009-05-26 09:18:08

Try:

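<!-- select="." assumes the context item is the string to split;
     the trailing + groups consecutive ideographs into a single match -->
<xsl:analyze-string select="." regex="\p{{IsCJKUnifiedIdeographs}}+">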
<xsl:matching-substring>
  <out><xsl:value-of select="."/></out>
</xsl:matching-substring>
<xsl:non-matching-substring>
  <out><xsl:value-of select="."/></out>
</xsl:non-matching-substring>
</xsl:analyze-string>
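
For anyone who wants to try this in isolation, here is a minimal self-contained sketch of the same approach. The parameter name `text`, the wrapper element `runs`, the element `run` and its `block` attribute are illustrative assumptions, not names from the question. The regex attribute is an attribute value template, which is why the curly braces are doubled.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical input; in practice the string would come from the source document. -->
  <xsl:param name="text" select="'阿根廷甲型H1N1流感确'"/>

  <!-- Named entry point, so no source document is required. -->
  <xsl:template name="main">
    <runs>
      <!-- The + makes each run of consecutive ideographs one match. -->
      <xsl:analyze-string select="$text" regex="\p{{IsCJKUnifiedIdeographs}}+">
        <xsl:matching-substring>
          <run block="CJKUnifiedIdeographs"><xsl:value-of select="."/></run>
        </xsl:matching-substring>
        <!-- Everything else lands here; it is not checked against BasicLatin. -->
        <xsl:non-matching-substring>
          <run block="other"><xsl:value-of select="."/></run>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </runs>
  </xsl:template>

</xsl:stylesheet>
```

With an XSLT 2.0 processor that supports a named initial template (in Saxon, the -it:main command-line option), this should yield three run elements containing 阿根廷甲型, H1N1 and 流感确 in that order. If the non-CJK text needs to be classified more precisely, a nested xsl:analyze-string using \p{{IsBasicLatin}}+ inside the non-matching branch is one option.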

Regards,

Michael Kay
http://www.saxonica.com/
http://twitter.com/michaelhkay  

-----Original Message-----
From: tom tom [mailto:tomxsllist(_at_)hotmail(_dot_)com] 
Sent: 26 May 2009 14:08
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] Unicode character blocks in strings


I have a string containing a mix of Chinese and Latin
characters, e.g. 阿根廷甲型H1N1流感确.
I wish to return a nodeset containing the following kind of structure:

  阿根廷甲型
  H1N1
  流感确

where H1N1 falls into the BasicLatin Unicode character block
and the other two strings can be categorized as CJKUnifiedIdeographs.

Can anyone suggest the cleanest way to do this using XSLT 2?
 
Tom
 

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--


