Re: [xsl] Confused about entities

Gary,

At 09:56 AM 3/14/2006, you wrote:

On 14/03/06, andrew welch <andrew(_dot_)j(_dot_)welch(_at_)gmail(_dot_)com> 
wrote:

> If you are outputting as XML where are the &nbsp;'s coming from?
>
> Are you writing them out manually in the stylesheet - as in &amp;nbsp;?

Nope but thanks for the clue. The *input* is double escaped (I must
stop trusting a browser when viewing the output). That is I've
actually got &amp;nbsp; as the input which explains why there is no
conversion going on. Later on in a different transform I do a
saxon:parse which then obviously can't find the reference to a &nbsp;
and therefore throws an exception.


Oh the input is *double* escaped? Fun.

Two available options:

1. Isolate a transformation step to write a file in which the
double-escaping is removed, in effect by resolving the "&amp;"
entity to "&" so the file presents an honest "&nbsp;" -- then parse
as normal. But this step in the pipeline has either to write a file
or pass the data through, say, a SAX filter -- it has to serialize
the data somehow, for reparsing: it can't work completely within
XSLT's world of trees (the logical view). As Mike just observed,
you're having to work with the lexical layer of the markup before it
represents what it's supposed to represent (what you, but not the
computer, know it "actually" represents through the double-escaped
entities).

2. Use string processing. Since you're using XSLT 2.0 this is a
reasonable option. A regular expression could be used to match the
fake entities and turn them into something more useful. Probably this
process would have to write a file too, to be parsed again, unless
you used some kind of internal lookup table to take the place of the
set of entity declarations (which are only available to a parser).
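
To make the lookup-table idea in option 2 concrete, here is a minimal sketch in Python rather than XSLT, purely as an illustration of the technique. The table contents, the pattern, and the function name are all assumptions of mine, not anything from your stylesheet; a real table would need one entry per entity your DTD declares.

```python
import re

# Hypothetical lookup table standing in for the DTD's entity declarations.
# Only a few names are listed here for illustration.
ENTITY_TABLE = {
    "nbsp": "\u00a0",
    "copy": "\u00a9",
    "eacute": "\u00e9",
}

# Once the parser has resolved "&amp;", the text node holds a literal
# "&name;" string. This pattern matches those fake entity references.
FAKE_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def resolve_fake_entities(text):
    """Replace each known fake entity with its character; leave others alone."""
    def replace(match):
        name = match.group(1)
        # Unknown names are returned unchanged so nothing is silently lost.
        return ENTITY_TABLE.get(name, match.group(0))
    return FAKE_ENTITY.sub(replace, text)


if __name__ == "__main__":
    print(repr(resolve_fake_entities("one&nbsp;two &unknown; three")))
```

Because the replacement happens on strings already inside the tree, nothing has to be written out and reparsed, which is the point of using a table in place of the declarations.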

I hesitate to say more, as XSLT 2.0 gives much better facilities for
handling such things than 1.0 did. (I could tell you about 1.0
tricks, but why?) But since I haven't tried them out myself, I can
only direct your attention to them.

Note that both these approaches assume that your files actually
parse. The error message you reported before suggests they don't.

But maybe you have the unescaping thing working and need to invoke
the entity declarations on the output to get it to parse properly --
that error message was upon parsing the *output*? (You're not the
only one confused now.)
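
If that is the situation, the following rough sketch shows what "invoking the entity declarations" amounts to. It is in Python with the standard-library ElementTree parser, not Saxon, and the wrapper element, the single nbsp declaration, and the function names are assumptions made up for the example. The unescaped string only parses once a DOCTYPE with an internal subset declares the entity.

```python
import xml.etree.ElementTree as ET

# Hypothetical internal subset declaring the one entity we care about.
# A real pipeline would reference or include the full entity set.
DOCTYPE = '<!DOCTYPE wrapper [<!ENTITY nbsp "&#160;">]>'


def parse_with_declared_entities(fragment):
    """Wrap an unescaped fragment, declare nbsp, and parse the result."""
    document = DOCTYPE + "<wrapper>" + fragment + "</wrapper>"
    return ET.fromstring(document)


def parses_without_declarations(fragment):
    """Report whether the same fragment parses with no declarations at all."""
    try:
        ET.fromstring("<wrapper>" + fragment + "</wrapper>")
        return True
    except ET.ParseError:
        # An undeclared entity reference is a well-formedness error here.
        return False


if __name__ == "__main__":
    fragment = "one&nbsp;two"  # what is left after removing one escaping level
    print(parses_without_declarations(fragment))
    print(repr(parse_with_declared_entities(fragment).text))
```

The same string fails in one case and parses in the other, which would match an exception thrown at the reparsing step rather than at the original input.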


Cheers,
Wendell




======================================================================
Wendell Piez                            
mailto:wapiez(_at_)mulberrytech(_dot_)com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================

