xsl-list

Re: [xsl] Some entities output as entities and others as chars not looking good in the html.. why?

2011-12-13 16:05:54
On 13/12/2011 18:02, Alex Muir wrote:
Hi,

I've got some currently inefficient code that is meant to convert some
characters to entities for HTML output, like so. I will move this into an
analyze-string function with a lookup later, but the problem, other
than speed, is that some of the characters are not being replaced with
entities.

...replace(replace(replace(replace(replace(replace(replace(replace(
       
$arg,'’','’'),'“','“'),'”','”'),'®','®'),'™','™')
       ,'—','—'),'·','·'),'',''),'–','–'),'é','é')
       ,'¾','¾'),'þ','þ'),'o','o'),'•','•'),'â','â')
       ,'ü','ü'),'Ü','Ü'),'€','€'),'†','†'),'x','x')
       ,'¨','¨'),'§','§'),'—','–'),'ã','ã'),'━','—')
       ,'ă','ă'),'©','©'),'í','í'),'è','è'),'ç','ç')
       ,'á','á'),'‘','‘'),'ý','ý'),'‡','‡'),'£','£')
       ,'⅞','⅞'),'¼','¼'),'½','½'),'¾','¾'),'´','´')
       ,'ú','ú'),'ñ','ñ'),'ð','ð'),'ø','ø'),'ó','ó')
       ,'ê','ê'),'à','à')"/>

If I copy the above into some text to use as test data one sees good
output as a pair of entities like’','’ and undesired output
which looks bad in the html as '','' a pair of characters with the
entity converted to a char.

select="$arg,'’','’'),'“','“'),'”','”'),'®','®')
     ,'™','™'),'—','—'),'·','·'),'',''),'–','–')
     
,'é','é'),'¾','¾'),'þ','þ'),'o','o'),'•','•'),'â','â'),'ü','ü'),'Ü','Ü')
     
,'€','€'),'†','†'),'x','x'),'¨','¨'),'§','§'),'—','–')
     
,'ã','ã'),'—','—'),'ă','ă'),'©','©'),'í','í'),'è','è'),'ç','ç'),'á','á')
     ,'‘','‘'),'ý','ý'),'‡','‡'),'’','’'),'£','£')
     ,'“','“'),'”','”'),'•','•'),'⅞','⅞'),'¼','¼')
     ,'½','½'),'¾','¾'),'´','´')"

So why is this happening?

Btw if anyone can come teach some software engineering courses next
term at the university of gambia it would be welcome. I've injured my
ankle and won't be able to lecture so there is a gap that needs
filling. It's fairly easy to work remotely here on contracts as long
as you bring about 3 laptop batteries with you.

Thanks

Note that numeric character references are not entity references in XML (and in some circumstances they are expanded at different times from entity references). Character references always refer to Unicode code points, so for example &#147; refers to the control character
SET TRANSMIT STATE (U+0093),
which is probably not the character you intended.
Some Microsoft encodings use characters in that range, but (if the encoding has been correctly declared in the input file) such characters will be converted to their Unicode code points before XSLT sees the input data.
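
To illustrate the difference, here is a small made-up input document (the element names are purely hypothetical, not taken from your data). It assumes the file really is saved as windows-1252 and declares it, so the parser does the byte-to-Unicode conversion; numeric character references bypass the encoding entirely and always name code points:

```xml
<?xml version="1.0" encoding="windows-1252"?>
<!-- Hypothetical test document. Because the encoding is declared, a literal
     curly quote stored as the single byte 0x93 in this file would be decoded
     by the parser to U+201C before XSLT sees it. -->
<sample>
  <!-- Code points 8220 and 8221 are U+201C and U+201D, the curly double quotes. -->
  <intended>&#8220;quoted&#8221;</intended>
  <!-- Code points 147 and 148 are U+0093 and U+0094, which are C1 control
       characters. The windows-1252 byte values do not apply to references. -->
  <mistaken>&#147;quoted&#148;</mistaken>
</sample>
```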

The question is really: what encoding is your input in, and what encoding do you wish your output to be in? If these encodings are declared then you probably don't need to do any character transformations at all. (If you do, then using translate() or character maps is probably simpler than using regular expressions for this.)
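
As a rough sketch of the character-map approach, assuming an XSLT 2.0 processor that serializes its own output: the map name "html-names", the five characters chosen and the identity template are only illustrative, not your actual stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- With a US-ASCII output encoding the serializer already escapes any
       character it cannot write directly. The character map only controls
       which spelling is used for the characters listed in it. -->
  <xsl:output method="html" encoding="US-ASCII"
              use-character-maps="html-names"/>

  <!-- Hypothetical map: each character is written as a named HTML entity. -->
  <xsl:character-map name="html-names">
    <xsl:output-character character="&#8216;" string="&amp;lsquo;"/>
    <xsl:output-character character="&#8217;" string="&amp;rsquo;"/>
    <xsl:output-character character="&#8220;" string="&amp;ldquo;"/>
    <xsl:output-character character="&#8221;" string="&amp;rdquo;"/>
    <xsl:output-character character="&#8212;" string="&amp;mdash;"/>
  </xsl:character-map>

  <!-- Identity template, standing in for the real transformation. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Character maps take effect only at serialization, so they do nothing if the result tree is passed on to another process without being serialized. translate() maps single characters to single characters, so it cannot produce a multi-character string such as an entity reference.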

David





--
google plus: https:/profiles.google.com/d.p.carlisle

