Hi,
I'm having some trouble using the Xalan 2 Java parser to parse some XML
files that include non-ASCII characters, such as the right tick ('), left
tick, long dash, etc. When I use Xalan with UTF-8 encoding, it turns these
characters into a garbled mess, like Â$(A for the right tick, etc.
You are viewing the result document using an editor/viewer that supports
UTF-8, right? I can't stress this enough...
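
To illustrate why the viewer matters, here is a minimal sketch (the class and
method names are made up for this example) that encodes a right single
quotation mark, U+2019, as UTF-8 and then decodes the same bytes as
ISO-8859-1, which is roughly what a non-UTF-8 viewer does. The exact garbage
shown depends on which encoding the viewer assumes; ISO-8859-1 is only an
assumption here.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Encode as UTF-8, then decode the same bytes as ISO-8859-1,
    // the way a viewer with the wrong encoding setting would.
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Undo the misreading: recover the original bytes and decode as UTF-8.
    static String reread(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u2019";          // right single quotation mark
        String garbled = misread(original);  // three characters, starting with U+00E2
        System.out.println("chars after misreading: " + garbled.length());
        System.out.println("round trip ok: " + reread(garbled).equals(original));
    }
}
```

The bytes in the file are the same in both cases; only the interpretation
differs, so a file that looks garbled in one viewer can still be valid UTF-8.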
The SAX2SAX.java example provided with Xalan doesn't help much either. It
seems to always produce supposedly 'UTF-8' output, even when I change the
stylesheet to <xsl:output encoding="ISO-8859-1"/>, etc.
Does anyone know how to get the Xalan parser to properly transform these
characters to their proper hex value?
Xalan just generates the result tree; it's the serializer that writes the
actual output stream. Thus, if the default Xalan serializer doesn't output
some characters as character references and that doesn't suit you, use your
own serializer.
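
Before writing a custom serializer, it may be worth trying the standard JAXP
output properties. The sketch below is not taken from the Xalan samples; the
class and method names are made up, and it assumes the result is written
through a StreamResult wrapping an OutputStream. It runs an identity
transform with the output encoding set to US-ASCII, in which case the
serializer has to write characters outside ASCII as numeric character
references. The exact form of the references (decimal or hex) is up to the
serializer in use.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class AsciiOutputDemo {
    // Serialize the given XML with an identity transform, asking the
    // serializer for US-ASCII output.
    static byte[] serializeAsAscii(String xml) throws Exception {
        // newTransformer() without a stylesheet gives an identity transform.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        // Takes precedence over any encoding given in xsl:output.
        t.setOutputProperty(OutputKeys.ENCODING, "US-ASCII");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        t.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = serializeAsAscii("<p>It\u2019s</p>");
        // The right tick should appear as a character reference, not raw bytes.
        System.out.println(new String(bytes, "US-ASCII"));
    }
}
```

If the result is sent to a SAX handler instead of a stream, no serialization
takes place inside the transformer, so the encoding has to be configured on
whatever serializer finally writes the bytes.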
Jarno
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list