xsl-list

Re: xml -> htmlhelp and character 8220

2004-11-12 14:59:03
Allin Cottrell wrote:

There isn't. HTML output is implicit in context, and up till now I have let the output encoding be implicit too (given that we're generating HTML help for Windoze). The input encoding is specified in the input xml files.

The output encoding for HTML Help cannot be implicit, because HTML Help is a very buggy piece of software. It doesn't support UTF-8 or character entity references, so all characters must be written raw in some single-byte encoding. For Western European languages the most appropriate encoding is windows-1252, because it contains both the dashes and the typographic quotes. So in your case you should have the following settings:

<xsl:param name="htmlhelp.encoding" select="'windows-1252'"/>
<xsl:param name="chunker.output.encoding" select="'windows-1252'"/>
<xsl:param name="saxon.character.representation" select="'native'"/>

In most cases the documents themselves (the generated HTML pages) can be in UTF-8, so you can change the second line to:

<xsl:param name="chunker.output.encoding" select="'UTF-8'"/>
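
For context, these parameters would normally live in a small customization layer that imports the stock HTML Help stylesheet. The following is only a minimal sketch: the import URI is an assumption about where the DocBook XSL distribution is reachable (adjust it to your local copy or catalog), and the parameter values are simply the windows-1252 settings given above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Assumed location of the stock HTML Help stylesheet;
       point this at your own copy of the DocBook XSL distribution. -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/htmlhelp/htmlhelp.xsl"/>

  <!-- Encoding of the project files (.hhc, .hhk, .hhp) -->
  <xsl:param name="htmlhelp.encoding" select="'windows-1252'"/>

  <!-- Encoding of the generated HTML chunks -->
  <xsl:param name="chunker.output.encoding" select="'windows-1252'"/>

  <!-- Ask Saxon to write characters raw instead of as character references -->
  <xsl:param name="saxon.character.representation" select="'native'"/>

</xsl:stylesheet>
```

You would then run your XSLT processor against this file instead of htmlhelp.xsl directly.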

Thanks very much for your help. But I'm coming to the conclusion this is a bug (or at least a feature regression) in the xsl stylesheets. It seems that any required character re-encoding should be implicit from the context

  iso-8859-1 xml input -> Windows html help output

and should be handled correctly without the user having to specify up to 4 encoding variables. And this did happen OK with the earlier release of the stylesheets.

This is not possible, because it would be very hard to select the proper encoding automatically (without hardwiring the character repertoire of each encoding into the stylesheets). You must use different single-byte encodings for different languages, depending on which fancy characters appear in the titles that go into the project files (.hhc, .hhk, .hhp).
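
As an illustration of why the choice depends on the language: a Czech document would presumably need windows-1250 rather than windows-1252, since that is the Windows code page covering Central European letters. This is a hypothetical sketch for that case, not something taken from your setup:

```xml
<!-- Hypothetical settings for a Czech (Central European) document;
     windows-1250 is assumed to be the right code page for its titles. -->
<xsl:param name="htmlhelp.encoding" select="'windows-1250'"/>
<xsl:param name="chunker.output.encoding" select="'windows-1250'"/>
<xsl:param name="saxon.character.representation" select="'native'"/>
```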

You should blame MS for not supporting Unicode in HTML Help. But that is a waste of time, because HTML Help was frozen a long time ago, MS Help 2 is only for Visual Studio .NET, and the next help system will only be available in Longhorn (whose release date is constantly shifting).

                                Jirka
        (author of HTML Help output in DocBook stylesheets)

--
------------------------------------------------------------------
  Jirka Kosek     e-mail: jirka(_at_)kosek(_dot_)cz     http://www.kosek.cz
------------------------------------------------------------------
  Professional training and consulting in XML technologies.
     Take a look at our newly launched website http://DocBook.cz
       Detailed overview of training courses http://xmlguru.cz/skoleni/
------------------------------------------------------------------
