Now I have to build a name containing only [A-Za-z0-9].
My problem is that I often see characters with modifiers, such as:
00E0 à LATIN SMALL LETTER A WITH GRAVE
00E1 á LATIN SMALL LETTER A WITH ACUTE
00E2 â LATIN SMALL LETTER A WITH CIRCUMFLEX
00E3 ã LATIN SMALL LETTER A WITH TILDE
00E4 ä LATIN SMALL LETTER A WITH DIAERESIS ...
My questions:
1. Is it acceptable, from the perspective of a Western
   language, to replace those characters with the same character
   without the modifier?
2. Is there a way to do this in XSLT?
You can use normalize-unicode($input, 'NFD') to convert the string to
decomposed normal form; the diacritics will then be present as separate
characters, which you can detect and remove using a regular expression -
probably the same regex that removes other unwanted characters.
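
As a rough illustration of the above, here is a minimal XSLT 2.0 sketch. The function name my:ascii-name, the my namespace URI, the "main" template and the sample string are all made up for this example; only normalize-unicode() and replace() are standard functions. It assumes an XSLT 2.0 processor that supports the NFD normalization form.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="xs my">

  <xsl:output method="text"/>

  <!-- Hypothetical helper: decompose to NFD, then drop every character
       that is not in [A-Za-z0-9]. After decomposition the combining
       diacritics are separate characters, so the same regex that removes
       other unwanted characters removes them too. -->
  <xsl:function name="my:ascii-name" as="xs:string">
    <xsl:param name="input" as="xs:string"/>
    <xsl:sequence
        select="replace(normalize-unicode($input, 'NFD'), '[^A-Za-z0-9]', '')"/>
  </xsl:function>

  <!-- Hypothetical entry point, for trying the function on a sample string. -->
  <xsl:template name="main">
    <xsl:value-of select="my:ascii-name('àáâãä café 42')"/>
    <!-- yields: aaaaacafe42 -->
  </xsl:template>

</xsl:stylesheet>
```

Note that letters with no canonical decomposition (for example ø, æ or ß) are not split into a base letter plus a mark, so this regex simply drops them. If only the diacritics should go and other characters should be kept, the pattern '\p{M}' (Unicode combining marks) could be used in place of '[^A-Za-z0-9]'.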
Regards,
Michael Kay
http://www.saxonica.com/
http://twitter.com/michaelhkay