Re: [xsl] Getting the Base Character of Character with Diacritic

2006-09-19 01:57:13
Michael Kay wrote:
Following up on suggestions from others, if NFKD is supported then the
following should work reasonably well for European languages:

replace(normalize-unicode($in, 'NFKD'), '[&#x0300;-&#x036F;]', '')

You are right. I had thought that the modifier characters (U+02B0 to U+02FF, the Spacing Modifier Letters block) could also be used for "modifying" a certain character. (In an earlier post I mentioned the macron and circumflex as 0x02C9 and 0x02C6, but these were wrong.) That range does include macron, diaeresis, circumflex etc., but as I understand it now, these are not combining marks that attach to a letter. They are spacing characters in their own right (quote-like marks and so on), and as a result they do not come into play in the normalization.
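
For anyone who wants to check the distinction outside an XSLT processor, here is a minimal sketch in Python rather than XPath. It assumes the standard `unicodedata` module behaves like `normalize-unicode` for NFKD. The function name `strip_combining_marks` is made up for illustration. It mirrors the expression quoted above: decompose, then drop everything in U+0300 to U+036F.

```python
import unicodedata


def strip_combining_marks(text):
    """Decompose with NFKD, then drop code points in U+0300..U+036F.

    Hypothetical helper: a Python stand-in for the XPath expression
    replace(normalize-unicode($in, 'NFKD'), '[U+0300-U+036F]', '').
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not ("\u0300" <= ch <= "\u036f"))


# Precomposed e-acute, u-diaeresis and a-macron each decompose into a
# base letter plus a combining mark, and the mark is then removed.
print(strip_combining_marks("\u00e9\u00fc\u0101"))  # prints "eua"

# U+02C6 and U+02C9 are spacing modifier letters, not combining marks:
# their canonical combining class is 0, so they stay in the string.
for cp in (0x02C6, 0x02C9, 0x0302, 0x0304):
    ch = chr(cp)
    print(f"U+{cp:04X}", unicodedata.name(ch), unicodedata.combining(ch))
```

The loop prints the character name and combining class for the two modifier letters next to their combining counterparts (U+0302 and U+0304), which is the difference described above.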

-- Abel Braaksma

