RE: [xsl] Unicode character blocks in strings


Try:

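<!-- select="." assumes the context item is the string; the + matches a whole run of ideographs -->
<xsl:analyze-string select="." regex="\p{{IsCJKUnifiedIdeographs}}+">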
<xsl:matching-substring>
  <out><xsl:value-of select="."/></out>
</xsl:matching-substring>
<xsl:non-matching-substring>
  <out><xsl:value-of select="."/></out>
</xsl:non-matching-substring>
</xsl:analyze-string>
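
For reference, a complete stylesheet built around this approach might look like the sketch below. It is only an illustration under a few assumptions: the sample string is hard-coded from the question, the template name "main", the <result> wrapper and the block attribute are made up for the example, and it is meant to be run with a named initial template (for instance Saxon's -it:main option). The select attribute is required by xsl:analyze-string, the + makes each run of ideographs a single match, and the braces are doubled because regex is an attribute value template.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical entry point: invoked as a named initial template,
       so no source document is needed. -->
  <xsl:template name="main">
    <result>
      <!-- Sample string taken from the question; replace select with
           an expression that points at your own data. -->
      <xsl:analyze-string select="'阿根廷甲型H1N1流感确'"
                          regex="\p{{IsCJKUnifiedIdeographs}}+">
        <!-- Each maximal run of CJK Unified Ideographs -->
        <xsl:matching-substring>
          <out block="CJKUnifiedIdeographs"><xsl:value-of select="."/></out>
        </xsl:matching-substring>
        <!-- Everything between those runs (here, the Latin text) -->
        <xsl:non-matching-substring>
          <out block="other"><xsl:value-of select="."/></out>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

With the sample string this should yield three <out> elements, for 阿根廷甲型, H1N1 and 流感确 in that order. Note that the non-matching branch catches anything outside the CJK block, not only BasicLatin, so a stricter split would need a second test on those substrings.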

Regards,

Michael Kay
http://www.saxonica.com/
http://twitter.com/michaelhkay

-----Original Message-----
From: tom tom [mailto:tomxsllist(_at_)hotmail(_dot_)com] 
Sent: 26 May 2009 14:08
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] Unicode character blocks in strings

I have a string containing a mix of Chinese and Latin 
characters, e.g. 阿根廷甲型H1N1流感确. 
I wish to return a nodeset containing the following kind of structure:

  阿根廷甲型
  H1N1
  流感确

Where H1N1 falls into the BasicLatin Unicode character block 
and the other two strings can be categorized as CJKUnifiedIdeographs.

Can anyone suggest the cleanest way to do this using XSLT 2? 

Tom


--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--


