xsl-list

Re: [xsl] Global parameters with UTF-8 characters and ???s <Disregard Previous>

2006-08-03 02:24:35
On 8/2/06, Michael Kay <mike(_at_)saxonica(_dot_)com> wrote:
> Is this the right solution?  Or does it just point out what
> the issue is?

It's a viable workaround. But it suggests that there is some kind of
configuration problem somewhere, perhaps with the web server.

It effectively takes encoding out of the equation: the ASCII
characters &#nnn; are written to disk instead of a single Unicode
character, and the browser reads ASCII instead of the single Unicode
character.
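
To make that concrete, here is a minimal sketch of the same idea in
Java. The class and method names are made up for illustration, and it
only handles non-ASCII characters (it does not escape & or <), so
treat it as a demonstration rather than a serializer:

```java
public class NumericCharRefs {
    // Hypothetical helper: replace every code point above 127 with a
    // decimal numeric character reference such as &#233;
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 128) {
                out.append((char) cp);          // plain ASCII passes through
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // e-acute is U+00E9, i.e. decimal 233
        System.out.println(escapeNonAscii("caf\u00e9")); // prints caf&#233;
    }
}
```

The result is pure ASCII, so it survives being written and read in
almost any encoding.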

If you can now see the correct characters in the browser, that
suggests the font in use contains them, and that the problem lies
with the file being written in one encoding and read in another.
When an encoding has no mapping for a given character or byte
sequence, a question mark (?) is substituted to mean "no mapping".
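
Both failure modes are easy to reproduce. The sketch below (names are
mine, not from any library) encodes a string with one charset and
decodes it with another. Going through US-ASCII turns the unmappable
character into a literal ?, while writing UTF-8 and reading
ISO-8859-1 gives two wrong characters instead:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatch {
    // Hypothetical helper: encode with one charset, decode with another
    static String writeThenRead(String text, Charset written, Charset read) {
        byte[] bytes = text.getBytes(written); // unmappable chars become '?'
        return new String(bytes, read);
    }

    public static void main(String[] args) {
        String s = "caf\u00e9";
        // US-ASCII has no mapping for U+00E9, so it is replaced by '?'
        System.out.println(writeThenRead(s, StandardCharsets.US_ASCII,
                                            StandardCharsets.US_ASCII));
        // UTF-8 bytes C3 A9 read as ISO-8859-1 become two characters
        System.out.println(writeThenRead(s, StandardCharsets.UTF_8,
                                            StandardCharsets.ISO_8859_1));
    }
}
```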

If you use a hex editor at every stage of the process to find the
first point at which the byte for the character is 0x3F (meaning the
? really is a ?, and it's not just your viewer), then you'll know
that the stage which produced that output was the culprit.
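
If you don't have a hex editor to hand, a few lines of Java will do
the same job. This is only a sketch with made-up names: it prints the
bytes of a file given on the command line, or of a sample string if
no file is given.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class HexCheck {
    // Hypothetical helper: format bytes as space-separated hex pairs
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length > 0) {
            // Dump the raw bytes of an intermediate file
            System.out.println(toHex(Files.readAllBytes(Paths.get(args[0]))));
        } else {
            // e-acute encoded as UTF-8 is the two bytes C3 A9
            System.out.println(toHex("\u00e9".getBytes(StandardCharsets.UTF_8)));
        }
    }
}
```

A real e-acute shows up as C3 A9 (UTF-8) or E9 (ISO-8859-1); if you
see 3F the character has already been lost.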

If you are using Java then it's often a case of setting the default
platform encoding to UTF-8:

System.setProperty("file.encoding", "UTF-8");

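This is meant to ensure that any operations that involve encodings
(where an optional encoding argument hasn't been specified, e.g.
getBytes()) will use UTF-8.  The JVM reads file.encoding at startup,
so passing -Dfile.encoding=UTF-8 on the command line is more reliable
than setting it from code.  If you don't specify it then the platform
default is used (windows-1252, which is close to ISO-8859-1, on
Western Windows platforms anyway, afaik).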
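
An alternative that doesn't depend on the default at all is to pass
the encoding explicitly wherever bytes and characters meet. A minimal
sketch, assuming a Java version that has StandardCharsets (Java 7 or
later), with a made-up class name and an in-memory stream standing in
for the real output file:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class ExplicitEncoding {
    // Hypothetical helper: write text through a Writer that is told
    // to use UTF-8, regardless of the platform default
    static byte[] writeUtf8(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] viaWriter = writeUtf8("caf\u00e9");
        // getBytes with an explicit charset gives the same bytes
        byte[] viaGetBytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(java.util.Arrays.equals(viaWriter, viaGetBytes));
    }
}
```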

cheers
andrew

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--