Michael Kay wrote:
I don't know how good Java is at getting the encoding right, for example
whether it will use a different encoding if you use configuration options
such as "cmd /u" identified by Abel. I'll do some experiments.
Java will choose the default encoding of the underlying system, which
is, in the case of Windows, the codepage set in International and
Regional settings. This codepage is never the same as IBM-437 (or
CP437), the codepage used for the command window, which is age old
(1981). When the Regional settings are set to US or some Western
European country, the codepage will default to CP1252 (windows-1252),
which is, like I said, incompatible with the codepage of the console.
That mismatch is what gives the weird characters for everything above
code point 127, i.e. outside ASCII.
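To make the mismatch concrete, here is a minimal sketch that encodes a character with one codepage and decodes the resulting bytes with another, the way a console with a different codepage would. The class and method names are made up for illustration, and it assumes the JVM ships the IBM437 and windows-1252 charsets (full JDK builds normally do).

```java
import java.nio.charset.Charset;

public class CodepageMismatch {
    // Encode text with one charset, then decode the same bytes with another,
    // which is what happens when writer and console disagree on the codepage.
    static String reinterpret(String text, String writtenAs, String shownAs) {
        byte[] bytes = text.getBytes(Charset.forName(writtenAs));
        return new String(bytes, Charset.forName(shownAs));
    }

    public static void main(String[] args) {
        // The charset Java picked up from the underlying system.
        System.out.println("JVM default charset: " + Charset.defaultCharset());

        // U+00E9 (e with acute) is the single byte 0xE9 in windows-1252.
        String shown = reinterpret("\u00e9", "windows-1252", "IBM437");
        System.out.printf("written as windows-1252, shown as IBM437: U+%04X%n",
                (int) shown.charAt(0));
    }
}
```

Byte 0xE9 is the Greek capital theta (U+0398) in CP437, so the accented letter shows up as a completely different character, while plain ASCII text passes through unchanged.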
It is very awkward that Microsoft never chose to upgrade the default
codepage of the DOS console to be the same as that of Windows. You can
change the default in the registry or in some system *.cmd file (I
forgot the name), but then again, you can't make it follow whatever is
in your Regional Settings.
In Saxon, xsl:message by default uses a Java Writer, whereas "normal" result
documents use a Java OutputStream.
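The distinction matters because a Writer deals in characters and something has to pick the charset when they become bytes, whereas an OutputStream just passes along bytes that were encoded earlier. The sketch below shows that general java.io behaviour only; it is not Saxon's code, and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WriterVersusStream {
    // Characters go in; the charset given to the bridge decides the bytes.
    static byte[] viaWriter(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(sink, charset)) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Bytes go in, the same bytes come out; no charset is consulted here.
    static byte[] viaStream(byte[] alreadyEncoded) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        sink.write(alreadyEncoded, 0, alreadyEncoded.length);
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u00e9";
        System.out.println(viaWriter(text, StandardCharsets.UTF_8).length);      // two bytes
        System.out.println(viaWriter(text, StandardCharsets.ISO_8859_1).length); // one byte
        System.out.println(viaStream(text.getBytes(StandardCharsets.UTF_8)).length);
    }
}
```

If the Writer is built without an explicit charset, the platform default is used, which is where the system codepage sneaks in.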
I'd like to argue in favor of defaulting to a particular encoding
instead (i.e., UTF-8), because right now it is a lottery which codepage
the underlying system ends up choosing (and "build once, run
everywhere" no longer means "run everywhere and behave the same", which
I consider a pity). But such a discussion would be better suited to the
Saxon list, I believe.
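As a rough illustration of what pinning the encoding could look like on the Java side, the sketch below wraps a byte stream in a PrintStream with an explicit UTF-8 charset instead of relying on the platform default. The class and helper names are hypothetical, and this is not how Saxon is actually configured.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Console {
    // Wrap a byte stream so printed text is always encoded as UTF-8,
    // whatever codepage the underlying system reports.
    static PrintStream utf8(OutputStream out) {
        try {
            return new PrintStream(out, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every JVM is required to support UTF-8, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Replace the default stderr with a UTF-8 encoding one.
        System.setErr(utf8(System.err));
        System.err.println("caf\u00e9");
    }
}
```

The bytes are then the same on every system, but the console still has to interpret them as UTF-8 (on Windows, for instance, by switching the window to codepage 65001 with chcp) for the characters to display correctly.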