xsl-list

Re: how to get an NCR in the output?

2003-01-05 10:55:26
Joerg Pietschmann wrote:
The problem is that an encoding is declared in multiple places:
- HTTP headers
- XML declaration
- HTML META header.
It seems browsers take the HTTP header generated by the web
server as the authoritative declaration in case of conflicts.

...as they are required to. Same goes for XML parsers.

  http://www.w3.org/TR/html401/charset.html#h-5.2.2
  http://www.w3.org/TR/REC-xml#sec-guessing-with-ext-info
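
To make that priority order concrete, here is a minimal Python sketch. The
function name effective_charset and its parameters are made up for
illustration and are not any browser's API. It models only the first two steps
of the HTML 4.01 priority list (HTTP charset parameter, then META) and ignores
the charset attribute on links as well as any sniffing heuristics.

```python
import re

def effective_charset(http_content_type, meta_charset=None):
    """Return the charset that wins for a text/html response, or None.

    Hypothetical helper: the HTTP Content-Type charset parameter is
    checked first, and the META declaration is only a fallback.
    """
    if http_content_type:
        # Look for a charset parameter such as: text/html; charset=utf-8
        m = re.search(r'charset\s*=\s*"?([^\s;"]+)', http_content_type,
                      re.IGNORECASE)
        if m:
            return m.group(1).lower()
    if meta_charset:
        return meta_charset.lower()
    # Neither source declared anything; a real browser would guess here.
    return None
```

With a server that sends "text/html; charset=ISO-8859-1", a META saying utf-8
is ignored, which is the conflict described above.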

Serving XHTML as text/html may add to the confusion. When you do that, most
browsers are not aware that they are processing XML (very few have true XHTML
support anyway), so they go into HTML tag-soup mode and treat the XML
declaration, including its encoding pseudo-attribute, as junk. That's why I
suggested the META.

Nothing prevents an XSLT processor from giving you the option of emitting NCRs
even when the output encoding can represent those characters directly. Most
don't offer such functionality, however. Saxon does
(saxon:character-representation="decimal" in xsl:output).

You're also free to output us-ascii, if your processor supports it. Since
us-ascii can't represent anything above U+007F, the serializer has to emit
every such character as an NCR. You can then rewrite the encoding declaration
afterward (or override it out-of-band, e.g. in the HTTP Content-Type header)
to say utf-8 or some other us-ascii superset. The XML spec does say that it's
an error to misdeclare the encoding, but it goes on to note that ASCII
entities don't strictly need a declaration, since ASCII is a subset of UTF-8
and the default UTF-8 assumption won't cause problems.
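
The same effect is easy to see outside of XSLT. The following is only a rough
Python analogy, not what any XSLT processor does internally, and the function
name to_ascii_with_ncrs is invented for the example. It encodes text as
us-ascii with decimal NCRs for everything else, then shows that the resulting
bytes read identically whether they are labeled us-ascii or utf-8.

```python
import html

def to_ascii_with_ncrs(text):
    """Encode text as pure ASCII bytes, using decimal NCRs for the rest."""
    # 'xmlcharrefreplace' is a standard codec error handler that swaps
    # each unencodable character for a &#NNN; reference.
    return text.encode("ascii", errors="xmlcharrefreplace")

original = "caf\u00e9 \u20ac5"          # e-acute and the euro sign
data = to_ascii_with_ncrs(original)     # b'caf&#233; &#8364;5'

# The bytes are valid under either label, so relabeling them as utf-8
# changes nothing about how they decode.
as_ascii = data.decode("ascii")
as_utf8 = data.decode("utf-8")

# Resolving the NCRs recovers the original characters.
round_trip = html.unescape(as_utf8)
```

Note that this naive version does not escape markup characters such as < or &,
which a real serializer would handle separately.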

Mike

-- 
  Mike J. Brown   |  http://skew.org/~mike/resume/
  Denver, CO, USA |  http://skew.org/xml/

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list