xsl-list

RE: encoding of text files

2002-11-14 04:06:55
Just a thought: it may make sense to prefix the text file with a UTF-8 BOM
(as far as I remember, at least Notepad on Windows honors this).
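
To make the BOM idea concrete, here is a minimal sketch in Python (not XSLT; the sample string is made up for illustration). It only shows which bytes end up at the front of the file; whether a given viewer honors them is something to check separately.

```python
import codecs

# Hypothetical sample text containing an a-umlaut (U+00E4).
text = "B\u00e4r"

# The UTF-8 BOM is the three bytes EF BB BF, placed before the encoded text.
with_bom = codecs.BOM_UTF8 + text.encode("utf-8")

# Python's "utf-8-sig" codec writes the same BOM on encode
# and strips it again on decode.
assert text.encode("utf-8-sig") == with_bom
assert with_bom.decode("utf-8-sig") == text
```

A reader that does not know about the BOM will see those three bytes as stray characters at the start of the file, so this only helps with tools that look for it.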

--
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760

-----Original Message-----
From: owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
[mailto:owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com]On Behalf Of Yates, Danny (ANTS)
Sent: Thursday, November 14, 2002 11:49 AM
To: 'xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com'
Subject: RE: [xsl] encoding of text files


Hi Joerg,

If you are outputting UTF-8 then your a-umlaut will be written as
a two-byte sequence. If your output is serialised XML or HTML then
this is fine, as there are headers which can declare that the
content is UTF-8 encoded. If, however, you are writing a plain text
file (as you say you are), there is no way for the process which
reads it in to determine whether it is UTF-8, ASCII, iso-8859-1 or
whatever.
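
As a small illustration of the two-byte sequence (sketched in Python rather than XSLT, purely to show the bytes):

```python
# a-umlaut is the code point U+00E4.
a_umlaut = "\u00e4"

# In UTF-8 it is serialised as the two bytes C3 A4.
utf8_bytes = a_umlaut.encode("utf-8")
assert utf8_bytes == b"\xc3\xa4"
assert len(utf8_bytes) == 2

# Nothing in those bytes labels them as UTF-8: the same two bytes
# are also a perfectly valid (but different) iso-8859-1 string.
as_latin1 = utf8_bytes.decode("iso-8859-1")
assert len(as_latin1) == 2
```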

The first string you give would appear to indicate that there are,
as expected, two bytes in the output stream where you expect your
a-umlaut character to appear, and the program you are using to
view this file doesn't understand this.
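
I am assuming here that the viewer reads the file as iso-8859-1 (or something close to it); the string you saw is not reproduced in this message, so treat this Python sketch as a guess at what happened rather than a diagnosis:

```python
# The bytes an XSLT processor would write for a-umlaut in UTF-8.
written = "\u00e4".encode("utf-8")  # C3 A4

# A viewer that assumes iso-8859-1 turns each byte into its own character:
# C3 becomes A-tilde (U+00C3) and A4 becomes the currency sign (U+00A4).
shown = written.decode("iso-8859-1")
assert shown == "\u00c3\u00a4"

# Reading the very same bytes as UTF-8 recovers the single a-umlaut.
assert written.decode("utf-8") == "\u00e4"
```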

When you ask XSLT to output using iso-8859-1, it knows that in this
encoding there is a single-byte representation of a-umlaut; it uses
that byte, and your viewing program interprets it correctly.
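
For comparison, the single-byte form, again sketched in Python just to show the bytes:

```python
# In iso-8859-1 the a-umlaut (U+00E4) is the single byte E4.
latin1_bytes = "\u00e4".encode("iso-8859-1")
assert latin1_bytes == b"\xe4"
assert len(latin1_bytes) == 1

# A viewer that assumes iso-8859-1 maps that byte straight back
# to the intended character.
assert latin1_bytes.decode("iso-8859-1") == "\u00e4"
```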

So, if you must write out UTF-8 (and it's quite possible that you
may be able to survive with iso-8859-1 if you're just using a few
simple accented characters, such as those used in French and German),
then you need to tell your viewing program that the byte stream you
are feeding it is a UTF-8 encoded character stream.
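
One rough way to check whether iso-8859-1 would be enough for your data is to try encoding it. The helper name below is hypothetical and the sketch is in Python, not XSLT:

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character in text has an iso-8859-1 byte."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        # At least one character lies outside the iso-8859-1 repertoire.
        return False

# a-umlaut (U+00E4) and e-acute (U+00E9) are both in iso-8859-1.
assert fits_latin1("B\u00e4r caf\u00e9")

# The euro sign (U+20AC) is not, so such text would need UTF-8
# (or another encoding that covers it).
assert not fits_latin1("\u20ac")
```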

Regards,

Dan.


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


