xsl-list

Re: [xsl] re: Generate identifier

2010-01-07 10:51:36
Vladimir,

I missed the original post, but the effect of diacritics differs quite a 
lot from language to language. Sometimes it is "only" an accent; in other 
languages it changes the sound, and sometimes the meaning, of a character. 
What is a "Western" language? If you mean European languages, some of them 
are written in scripts that use no ASCII letters at all (Cyrillic, Greek), 
and your method will not work for them. 

So I would just drop them or replace them with an underscore. Saves a lot of 
energy :-)
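
A minimal sketch of what I mean, assuming an XSLT 2.0 processor. The
function name f:to-identifier and its namespace URI are made up for
illustration:

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:functions"
    exclude-result-prefixes="xs f">

  <!-- Hypothetical helper: every run of characters outside
       [A-Za-z0-9] becomes a single underscore. -->
  <xsl:function name="f:to-identifier" as="xs:string">
    <xsl:param name="value" as="xs:string"/>
    <xsl:sequence select="replace($value, '[^A-Za-z0-9]+', '_')"/>
  </xsl:function>

</xsl:stylesheet>
```

With this, f:to-identifier('Müller-Hillebrand') gives 'M_ller_Hillebrand'.
To drop the characters instead, pass '' as the replacement string.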

- Michael Müller-Hillebrand

On 07.01.2010 at 14:27, Vladimir Nesterovsky wrote:

Hello!

Proceeding with my original question.

Is there a way to decompose characters like:
æ 'LATIN SMALL LETTER AE' (U+00E6)

into separate letters?
Are there many such characters derived from Latin (I'll be calling replace() 
if it's only one or two)?
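
For context: U+00E6 has no canonical or compatibility decomposition in
Unicode, so normalization leaves it untouched; the same is true of a few
other Latin letters such as U+0153 (oe ligature), U+00F8 (o with stroke)
and U+00DF (sharp s). A sketch of the replace() route, assuming XSLT 2.0.
The function name f:expand-letters, its namespace URI, and the short
mapping list are illustrative only and certainly not exhaustive:

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:functions"
    exclude-result-prefixes="xs f">

  <!-- Hypothetical helper: spell out a few lowercase letters that
       Unicode normalization does not decompose.
       U+00E6 to ae, U+0153 to oe, U+00F8 to o, U+00DF to ss.
       Uppercase forms would need their own entries. -->
  <xsl:function name="f:expand-letters" as="xs:string">
    <xsl:param name="value" as="xs:string"/>
    <xsl:sequence select="
      replace(
        replace(
          replace(
            replace($value, '&#xE6;', 'ae'),
            '&#x153;', 'oe'),
          '&#xF8;', 'o'),
        '&#xDF;', 'ss')"/>
  </xsl:function>

</xsl:stylesheet>
```

For example, f:expand-letters('Straße') gives 'Strasse'.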

Thanks.
--
Vladimir Nesterovsky
http://www.nesterovsky-bros.com/


I need to convert a string into an identifier.
Earlier I was using the following function:


Now I have to build a name containing only [A-Za-z0-9].
My problem is that I often see characters with modifiers like
00E0 à LATIN SMALL LETTER A WITH GRAVE
00E1 á LATIN SMALL LETTER A WITH ACUTE
00E2 â LATIN SMALL LETTER A WITH CIRCUMFLEX
00E3 ã LATIN SMALL LETTER A WITH TILDE
00E4 ä LATIN SMALL LETTER A WITH DIAERESIS
...

My questions:
 is it acceptable, from the perspective of a Western language, to replace 
those characters with the same character without the modifier;
 is there a way to do this in XSLT;
 is there any better option?
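
One possible answer to the second question, as a sketch assuming XSLT 2.0:
decompose the string with normalize-unicode() and then delete the combining
marks. Note that processors are only required to support the NFC form, so
NFD support is an assumption to check for your processor. The function name
f:strip-diacritics and its namespace URI are made up for illustration:

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:functions"
    exclude-result-prefixes="xs f">

  <!-- Hypothetical helper: NFD splits a letter such as a-grave into
       the base letter a plus the combining mark U+0300; the regex
       category \p{M} then matches the combining mark so it can be
       removed. -->
  <xsl:function name="f:strip-diacritics" as="xs:string">
    <xsl:param name="value" as="xs:string"/>
    <xsl:sequence
        select="replace(normalize-unicode($value, 'NFD'), '\p{M}', '')"/>
  </xsl:function>

</xsl:stylesheet>
```

For the five characters listed above, f:strip-diacritics('àáâãä') gives
'aaaaa'. In XSLT 1.0, which has no normalize-unicode(), an explicit
translate($value, 'àáâãä', 'aaaaa') covers the same five characters.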




