xsl-list

RE: [xsl] Generate identifier

2009-12-29 16:10:55

Now I have to build a name containing only the characters [A-Za-z0-9].
My problem is that I often see characters with modifiers, such as:
00E0 à LATIN SMALL LETTER A WITH GRAVE
00E1 á LATIN SMALL LETTER A WITH ACUTE
00E2 â LATIN SMALL LETTER A WITH CIRCUMFLEX
00E3 ã LATIN SMALL LETTER A WITH TILDE
00E4 ä LATIN SMALL LETTER A WITH DIAERESIS ...

My questions:
  Is it acceptable, from the perspective of a Western
language, to replace those characters with the corresponding
character without a modifier?
  Is there a way to do this in XSLT?

You can use normalize-unicode($input, 'NFD') to convert the string to
decomposed normal form; the diacritics will then be present as separate
characters, which you can detect and remove using a regular expression -
probably the same regex that removes other unwanted characters.
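
A minimal sketch of that approach, assuming an XSLT 2.0 processor whose
normalize-unicode() supports 'NFD' (the specification only requires 'NFC',
so this is processor-dependent). The function name my:make-id, its
namespace URI, the "main" template, and the sample string are illustrative
assumptions, not part of the original question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my-functions"
    exclude-result-prefixes="xs my">

  <xsl:output method="text"/>

  <!-- Hypothetical helper (name and namespace are assumptions):
       reduces a string to the characters A-Z, a-z and 0-9. -->
  <xsl:function name="my:make-id" as="xs:string">
    <xsl:param name="input" as="xs:string"/>
    <!-- NFD splits e.g. U+00E0 (a with grave) into "a" (U+0061)
         followed by the combining grave accent (U+0300). -->
    <xsl:variable name="decomposed"
                  select="normalize-unicode($input, 'NFD')"/>
    <!-- One regex removes the now-separate combining marks together
         with spaces, punctuation and any other unwanted characters. -->
    <xsl:sequence select="replace($decomposed, '[^A-Za-z0-9]', '')"/>
  </xsl:function>

  <!-- Named entry template so the sketch can run without a source document. -->
  <xsl:template name="main">
    <xsl:value-of select="my:make-id('Château à la carte 2009')"/>
  </xsl:template>

</xsl:stylesheet>
```

With Saxon this could be run using the initial-template option (for
example -it:main). Note that letters without a canonical decomposition,
such as ø, ß or æ, are not split into a base letter plus a mark, so the
regex above drops them entirely; they would need an explicit mapping
(for instance with translate()) before the replace() step.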

Regards,

Michael Kay
http://www.saxonica.com/
http://twitter.com/michaelhkay 

