 Re: [xsl] recognize character entities
2006-08-30 02:41:39
 
Florent Georges wrote:
 
    <xsl:variable name="entity.values"
                  select="('�...', '�...', ...)"/>
    
  Perhaps it is easier, if I may suggest so, to use regular expressions. I 
think they would require a lot less work to create, because the 
character entities used for MathML often fall inside contiguous ranges. 
Looking around at the entity tables on 
http://www.w3.org/TR/2003/REC-MathML2-20031021/chapter6.html#chars.entity.tables, 
I found that most sets are more or less complete subsets of the Unicode 
4.0 specification.
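 For instance, almost all characters in the range 0x02200 - 0x022FF are 
included (the Mathematical Operators block in Unicode). The regular 
expression for this is: [&#x2200;-&#x22FF;] (the XSLT 2.0 regex syntax 
has no \x escape, so the range is written with character references). 
I'm not sure whether every processor handles this too, but mathematical 
symbols ought to be matched with the simple expression: \p{Sm}.
 Similar constructs are available for Greek and Cyrillic: \p{IsGreek} and 
\p{IsCyrillic}. Note the lowercase p: \P{...} matches the complement, 
that is, every character outside the named category or block.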
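
 To make the idea concrete, here is a minimal sketch of such a pattern 
in use. It is not a complete MathML character list. The variable name 
math.chars and the output element char are made up for illustration, and 
I am assuming a processor that knows the block name IsGreek. The range 
is written with character references and the category escapes use a 
lowercase \p (an uppercase \P would select the complement).

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical pattern: the Mathematical Operators block, the Sm
       (math symbol) category, and the Greek and Cyrillic blocks.
       Extend it with whatever other ranges your input needs. -->
  <xsl:variable name="math.chars"
                select="'[&#x2200;-&#x22FF;\p{Sm}\p{IsGreek}\p{IsCyrillic}]'"/>

  <!-- Wrap every matching character in a (made-up) char element that
       carries its code point; copy all other text through unchanged. -->
  <xsl:template match="text()">
    <xsl:analyze-string select="." regex="{$math.chars}">
      <xsl:matching-substring>
        <char code="{string-to-codepoints(.)}">
          <xsl:value-of select="."/>
        </char>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

 Keeping the pattern in a variable means the braces of \p{...} never 
appear literally inside the regex attribute, which is an attribute value 
template and would otherwise need them doubled.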
 Some ranges may be too wide, but there is probably little chance that 
your code contains symbols that exist in Unicode but are not used by 
MathML.
 Some characters are specified by MathML with a combining diacritical 
mark. I think you will have to list them separately in your regular 
expression. The same is true for the "normal" Latin-1 characters that 
are part of MathML, like &, á, Â etc.
 Using this approach you do not have to worry about whether a character 
is written as a decimal numeric reference, in hexadecimal notation, or 
as a named entity: the XML parser expands all three to the same 
character before the stylesheet sees it.
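
 A small sketch of that last point, meant to be run from the named 
template main (with Saxon, for example, via -it:main). The entity forall 
is declared locally here only to keep the example self-contained. 
Normally the MathML DTD would supply it.

```xml
<!DOCTYPE xsl:stylesheet [
  <!-- Local declaration for illustration only; normally from the MathML DTD. -->
  <!ENTITY forall "&#x2200;">
]>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- The same character (U+2200, FOR ALL) written as a decimal
       reference, a hexadecimal reference and a named entity. -->
  <xsl:template name="main">
    <xsl:for-each select="('&#8704;', '&#x2200;', '&forall;')">
      <!-- Print the code point and whether the regex matches it. -->
      <xsl:value-of select="string-to-codepoints(.), matches(., '^\p{Sm}$')"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

 All three items reach the regex as the same single character, so each 
line of output should show the same code point and the same match 
result.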
 Of course, it will take a few hours to construct your regex, but I think 
it will be much easier to maintain than a list of all entity values. 
And, I forgot to say, you can only use it with XSLT 2.0-capable processors.
Hope this helps,
Cheers,
Abel Braaksma
http://abelleba.metacarpus.com
 
 