xsl-list

RE: [OT] charset (was: how to get an NCR in the output?)

2003-01-05 10:26:13
Tobias,

I think the RFC text you quoted supports what I've said: if there is no
charset parameter for "text/html", it defaults to ISO-8859-1 (no matter what
the entity body says). Basically this means that you either have to serve
ISO-8859-1 encoded content, or must set the charset parameter properly.
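
A minimal sketch of that rule, for illustration only: it follows the HTTP/1.1 wording quoted further down in this thread (Section 3.7.1), under which "text" subtypes without a charset parameter default to ISO-8859-1. The function name effective_charset is hypothetical, and real browsers do not necessarily behave this way.

```python
from email.message import Message
from typing import Optional


def effective_charset(content_type: str) -> Optional[str]:
    """Return the charset implied by a Content-Type header value.

    Hypothetical helper: applies the HTTP/1.1 default discussed in the
    thread and ignores whatever the entity body itself declares.
    """
    msg = Message()
    msg["Content-Type"] = content_type

    charset = msg.get_param("charset")
    if isinstance(charset, tuple):
        # RFC 2231 encoded parameter: (charset, language, value)
        charset = charset[2]
    if charset:
        # An explicit charset parameter always wins.
        return charset.lower()

    if msg.get_content_maintype() == "text":
        # No parameter on a text/* type: the HTTP/1.1 default applies.
        return "iso-8859-1"

    # Non-text types have no default charset under this rule.
    return None
```

So a response labelled just "text/html" is treated as ISO-8859-1 here, whatever a meta element or XML declaration in the body says.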

No, I don't like this either (and I think the standards bodies agree in
retrospect). But this is what it currently says.

Julian

--
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760

-----Original Message-----
From: owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
[mailto:owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com]On Behalf Of 
Tobias Reif
Sent: Sunday, January 05, 2003 3:18 PM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] [OT] charset (was: how to get an NCR in the output?)


Julian Reschke wrote:

 > Tobias,
 >
 > AFAIK, the default for content type "text/html" *is* ISO-8859-1.

I don't think that it's as simple as that: one IETF spec says "The
default character set, which must be assumed in the absence of a charset
parameter, is US-ASCII."

As I said, I'm sending XHTML as text/html (since it's "HTML compatible").
In this case, the IETF says the following about the charset parameter:

http://ietf.org/rfc/rfc2854.txt
The 'text/html' Media Type
"
  charset
          The optional parameter "charset" refers to the character
          encoding used to represent the HTML document as a sequence of
          bytes. Any registered IANA charset may be used, but UTF-8 is
          preferred.  Although this parameter is optional, it is strongly
          recommended that it always be present. See Section 6 below for
          a discussion of charset default rules.
[...]
6. Charset default rules

    The use of an explicit charset parameter is strongly recommended.
    While [MIME] specifies "The default character set, which must be
    assumed in the absence of a charset parameter, is US-ASCII."  [HTTP]
    Section 3.7.1, defines that "media subtypes of the 'text' type are
    defined to have a default charset value of 'ISO-8859-1'".  Section
    19.3 of [HTTP] gives additional guidelines.  Using an explicit
    charset parameter will help avoid confusion.

    Using an explicit charset parameter also takes into account that the
    overwhelming majority of deployed browsers are set to use something
    else than 'ISO-8859-1' as the default; the actual default is either a
    corporate character encoding or character encodings widely deployed
    in a certain national or regional community. For further
    considerations, please also see Section 5.2 of [HTML40].
"

Personally, for XML sent as XML (e.g. SVG or XHTML), I think I'd prefer
that the encoding="" in the XML prolog always overrule the charset param
if both are present, and that the charset param never be required, only
the encoding="" in the XML prolog.
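
A rough sketch of that preference, not of what the specs currently require: the encoding named in the XML declaration wins, the charset param is only a fallback, and UTF-8 is assumed when neither is present. The function names are hypothetical, and the sketch only handles ASCII-compatible encodings (no BOM or UTF-16 detection).

```python
import re
from typing import Optional

# Matches an XML declaration at the very start of the body and captures
# the value of its encoding pseudo-attribute.
_XML_DECL_ENCODING = re.compile(
    rb'^<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def declared_encoding(body: bytes) -> Optional[str]:
    """Return the encoding named in the XML declaration, if any."""
    match = _XML_DECL_ENCODING.match(body)
    return match.group(1).decode("ascii").lower() if match else None


def preferred_encoding(body: bytes, charset_param: Optional[str]) -> str:
    """Assumed precedence: XML declaration, then charset param, then UTF-8."""
    return declared_encoding(body) or (charset_param or "").lower() or "utf-8"
```

For a body starting with <?xml version="1.0" encoding="ISO-8859-1"?>, this picks ISO-8859-1 even if the server sent charset=UTF-8.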

Tobi

--

Vim users               donate.
http://iccf-holland.org/donate.html

Web developers           check.
http://www.pinkjuice.com/check/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
