xsl-list

Re: [xsl] Encoding issues with document() function

2006-11-04 02:51:15
"Pankaj Bishnoi" <pankaj(_dot_)bishnoi(_at_)adeptia(_dot_)com> writes:

Hi All
        I have an XSL stylesheet in which I use the XSLT document() function. The
problem I am facing is that the XML file I am trying to read with the
document() function contains some Unicode characters, and the exception
thrown at transformation time is:

SystemId Unknown; Line #133;Column #104; Can not load requested doc: An
invalid XML character(Unicode: 0x2) was found in the element content of the
document

The source XML file is encoded as UTF-8. I searched the web for this issue,
and one suggested workaround is to replace those '0x2' characters. But other
characters, such as 0x1 or 0x13, might turn up in other scenarios. So my
question is: is there any encoding that supports all these characters?

Is there any way around this issue? Any help will be highly
appreciated.

You don't mention what processor you're using...

But document() can only do the simplest thing which is to presume that
the entity it's been asked to read will be encoded correctly.

It sounds like it's very likely that your entity is NOT utf-8 encoded
correctly. It happens. Even with big websites (I spent ages debugging
O'Reilly's XML RSS feeds once because they were full of encoding
bugs).
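
Before chasing anyone about it, it may help to find out which kind of
problem the file actually has. There are two candidates: the bytes may
not be valid UTF-8 at all, or they may decode fine but contain characters
that XML 1.0 does not allow in any encoding (U+0001, U+0002 and U+0013
all fall in that second group). Here is a rough Python sketch that
reports both. The function name diagnose() is made up for illustration
and is not part of any XSLT processor.

```python
import re

# Complement of the XML 1.0 "Char" production: anything matched here
# is not allowed in an XML 1.0 document, whatever the encoding.
_ILLEGAL_XML10 = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def diagnose(data: bytes):
    """Return (is_valid_utf8, [(char_index, 'U+XXXX'), ...]).

    Hypothetical helper: char_index is the position in the decoded text,
    not a byte offset.
    """
    try:
        text = data.decode("utf-8")
        valid = True
    except UnicodeDecodeError:
        # Decode leniently so we can still look for illegal characters.
        text = data.decode("utf-8", errors="replace")
        valid = False
    bad = [
        (m.start(), "U+%04X" % ord(m.group()))
        for m in _ILLEGAL_XML10.finditer(text)
    ]
    return valid, bad
```

If the first value comes back True and the list is not empty, the bytes
are fine as UTF-8 and the trouble is the control characters themselves.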

There are 2 alternatives:

1. ask the people who control the entity to fix the encoding.

2. write a new document() function which fixes arbitrary encoding
   problems and make it available in your processor.
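
A cheaper variant of option 2, if you can run a step before the
transformation, is to write a cleaned copy of the file and point
document() at that copy instead. The Python sketch below assumes the
file is meant to be UTF-8 and that simply dropping the illegal
characters is acceptable for your data; both are assumptions, and the
function names are hypothetical.

```python
import re

# Complement of the XML 1.0 "Char" production (characters XML 1.0 rejects).
_ILLEGAL_XML10 = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text: str, replacement: str = "") -> str:
    """Replace every character XML 1.0 forbids (default: drop it)."""
    return _ILLEGAL_XML10.sub(replacement, text)


def clean_copy(src_path: str, dst_path: str) -> None:
    """Write a sanitised UTF-8 copy of src_path to dst_path.

    Undecodable bytes become U+FFFD rather than raising, and
    newline="" keeps the original line endings untouched.
    """
    with open(src_path, encoding="utf-8", errors="replace", newline="") as src:
        cleaned = strip_illegal_xml_chars(src.read())
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(cleaned)
```

Note that this throws information away: whatever the 0x2 was supposed to
mean is gone, so option 1 is still the better fix where it is possible.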


-- 
Nic Ferrier
http://www.tapsellferrier.co.uk   for all your tapsell ferrier needs

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--