Hi Joerg,
If you are outputting UTF-8 then your a-umlaut will be written as
a two-byte sequence. If your output is serialised XML or HTML then
this is fine, as there are headers which can declare that the
content is UTF-8 encoded. If, however, you are writing a plain text
file (as you say you are), there is no way for the process which
reads it in to determine whether it is UTF-8, ASCII, iso-8859-1 or
whatever.
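To make the byte counts concrete, here is a minimal sketch in Python
(chosen purely for illustration; it is not part of your XSLT setup, and
the variable names are my own). It encodes a-umlaut both ways and prints
the raw bytes:

```python
# Encode a-umlaut (U+00E4) in both encodings and compare the raw bytes.
a_umlaut = "\u00e4"

utf8_bytes = a_umlaut.encode("utf-8")         # two bytes: 0xC3 0xA4
latin1_bytes = a_umlaut.encode("iso-8859-1")  # one byte: 0xE4

print(utf8_bytes.hex(), len(utf8_bytes))
print(latin1_bytes.hex(), len(latin1_bytes))
```

Nothing in either byte sequence says which encoding produced it, which
is why a plain text reader has to be told or has to guess.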
The first string you give would appear to indicate that there are,
as expected, two bytes in the output stream where you expect your
a-umlaut character to appear, and the program you are using to
view this file doesn't understand this.
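As a rough illustration of what I think your viewer is doing (this is
an assumption about its behaviour, sketched in Python rather than taken
from Netbeans itself), decoding UTF-8 bytes as if they were iso-8859-1
turns each a-umlaut into two unrelated characters:

```python
# Encode the word as UTF-8, then decode the same bytes as ISO-8859-1,
# which is what a viewer effectively does if it assumes the wrong encoding.
word = "ge\u00e4ndert"
raw = word.encode("utf-8")

misread = raw.decode("iso-8859-1")

# The single a-umlaut has become two characters, U+00C3 and U+00A4,
# so the misread string is one character longer than the original.
print(len(word), len(misread))
print(misread)
```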
When you ask XSLT to output using iso-8859-1, it knows that in this
encoding there is a single-byte representation of a-umlaut, so it
uses that, and your viewing program interprets it correctly.
So, if you must write out UTF-8 (and it's quite possible that you
may be able to survive with iso-8859-1 if you're just using a few
simple accented characters, such as those used in French and German), then you
need to tell your viewing program that the byte stream you are
feeding it is a UTF-8 encoded character stream.
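How you tell a given viewer this depends on the viewer, so I can't give
you the Netbeans setting. The general idea, though, is that the reader
states the encoding explicitly rather than guessing. A small Python
sketch, where the temporary file and its name are hypothetical stand-ins
for your generated output:

```python
import os
import tempfile

# Write a small UTF-8 text file, standing in for the generated output.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")  # hypothetical file name
with open(path, "w", encoding="utf-8") as f:
    f.write("ge\u00e4ndert\n")

# Reading with the encoding stated explicitly recovers the original text.
with open(path, encoding="utf-8") as f:
    as_utf8 = f.read()

# Reading the same bytes as ISO-8859-1 reproduces the garbled form.
with open(path, encoding="iso-8859-1") as f:
    as_latin1 = f.read()

print(as_utf8 == "ge\u00e4ndert\n")
print(as_latin1 == "ge\u00e4ndert\n")
```

The file on disk is identical in both reads; only the encoding the
reader assumes differs.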
Regards,
Dan.
--
Danny Yates
Technical Architect
Abbey National Treasury Services
E-mail: Danny(_dot_)Yates(_at_)ants(_dot_)co(_dot_)uk
Phone: +44 20 7756 5012
Fax: +44 20 7612 4342
-----Original Message-----
From: Joerg Heinicke [mailto:joerg(_dot_)heinicke(_at_)gmx(_dot_)de]
Sent: 14 November 2002 10:03
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] encoding of text files
Hello,
I have a problem with generated java/text files and their encoding.
From an autotest description a java file is generated. If I use the default
output encoding (UTF-8), the German umlauts in the output look like this
one:
"geÃ¤ndert"
If I use ISO-8859-1 it's correct:
"geändert"
I use Netbeans, which in general understands UTF-8 (with XML), but I don't know
whether it understands UTF-8 in text files. At least the output of the java file is
also wrong.
It should be possible to have text files in UTF-8, shouldn't it? What can the
problem be, then? How are text files marked as UTF-8?
With pure XML, encoding seems simple, but what about text files? Can somebody
enlighten me or point me to some resources?
Joerg
--
System Development
VIRBUS AG
Fon +49(0)341-979-7419
Fax +49(0)341-979-7409
joerg(_dot_)heinicke(_at_)virbus(_dot_)de
www.virbus.de
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list