Re: Charset mandatory in unix/linux

2006-03-13 02:09:53

At 00:10 06/03/13, Ned Freed wrote:
>(cc'ing the ietf-types list since this doesn't seem like an appropriate topic
>for ietf-822)
>> The charset parameter is mandatory in the MIME content-type
>> attribute.
>Actually, I don't know of a single case where this is true.

I don't, either.

>All media type
>parameters are either type or subtype specific, so there is no general rule
>that applies to all charset parameters. Nevertheless, the charset parameters
>that attach to the text top-level type are optional, as is the charset
>parameter on application/xml. And making the parameter optional doesn't even
>imply that there's a default. For example, in the case of XML the allowed
>charsets for unlabelled material are intentionally limited so they can be
>determined by inspection.

Almost, but not quite: in the case of application/xml, if there
is no 'charset' parameter on the MIME type, information inside the
XML document is used to determine the character encoding according
to a clearly defined bootstrap algorithm. If you start a document with
    <?xml version='1.0' encoding='foobar'?>
then it's in the "foobar" encoding. That doesn't mean that your
parser will be able to understand the "foobar" encoding; XML
parsers are only required to understand UTF-8 and UTF-16.
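
To make the bootstrap idea concrete, here is a rough Python sketch of
how a consumer might pick the encoding of an unlabelled XML entity.
It is a simplification, not the full detection procedure: it ignores
UTF-32 and EBCDIC, and the function name sniff_xml_encoding is made up
for illustration. It assumes the usual rule that a document with no
byte order mark and no encoding declaration is treated as UTF-8.

```python
import re

# Byte order marks recognized by this sketch (UTF-32 is deliberately ignored).
_BOMS = [
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xff\xfe", "utf-16-le"),
    (b"\xfe\xff", "utf-16-be"),
]

# Matches the encoding pseudo-attribute of an XML declaration written in
# an ASCII-compatible encoding.
_DECL = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def sniff_xml_encoding(data: bytes) -> str:
    """Guess the encoding of an XML entity that has no external label."""
    # 1. A byte order mark wins.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # 2. UTF-16 without a BOM: "<?" shows up with interleaved zero bytes.
    if data.startswith(b"\x00<\x00?"):
        return "utf-16-be"
    if data.startswith(b"<\x00?\x00"):
        return "utf-16-le"
    # 3. Otherwise read the encoding declaration, if there is one.
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # 4. No BOM and no declaration: assume UTF-8.
    return "utf-8"


print(sniff_xml_encoding(b"<?xml version='1.0' encoding='foobar'?><a/>"))
```

Note that the sketch happily returns "foobar" for the example above;
whether a given parser can then decode that encoding is a separate
question.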

>And fifth, you don't seem to appreciate the difficulty of getting everyone to
>agree to actually use filesystem metadata to solve any of these problems. This
>last is a complete showstopper and I despair of there ever being significant
>progress in this area because of it.

I fully agree with this at the present stage. But there is still
the hope that this problem will solve itself in 5 or 10 years,
like other, similar problems have been solved or are being solved
(ASCII for very basic text exchange, Unicode for worldwide text
exchange, XML for structured data exchange, and so on).

Regards, Martin.