MIME charset (not charset in general) and its properties

As the CJK disambiguation is necessary word-by-word (don't forget that
Harald proposes to handle multi-lingual documents) and in the header part,
and the disambiguation is necessary only for one specific character
set: ISO10646/UNICODE, a language tag is not a good mechanism for the
disambiguation. It's better to use ISO10646/UNICODE with the
charset names "iso-10646-<language tag>" for single language only.

                                                 ^^^^^^^^^^^^^^^^^^^^

I fail to see why

   Content-Type: Text/Plain; charset=iso-10646-chinese

would solve the problem of word-by-word distinction between
Chinese and Japanese in a multi-lingual text any better than


No, of course. I wrote "single language only".

But the inability is specific to ISO 10646. It is not an inability
of other encoding systems such as full ISO 2022.

Also, ISO 8859 for Arabic/Hebrew is ambiguous about directionality, so
additional information is necessary.

Thus, some encoding systems need profiling before they can be used.

The problem should, I think, be solved by registering charset names that
carry such profiling information.

   Content-Type: Text/Plain; charset=iso-10646
   Content-Language: zh (Chinese)

Neither of them does, I think, and the latter approach seems
cleaner to me, as it doesn't confuse language with coded
character set.


Don't confuse the language of the content and language of the script.

I can write Japanese with ASCII characters.

"Watashiha ASCII mojide nihongo wo kakemasu" is the Japanese translation
of the sentence above.
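
The point can be checked mechanically. The following is only a minimal sketch (the variable names are illustrative, not part of any proposal): the romanized sentence encodes cleanly as US-ASCII, so the coded character set alone reveals nothing about the language of the content.

```python
# The romanized Japanese sentence quoted above.
romaji = "Watashiha ASCII mojide nihongo wo kakemasu"

# Encoding as US-ASCII succeeds; a non-ASCII character would raise
# UnicodeEncodeError here.
encoded = romaji.encode("ascii")

# Every byte fits in 7 bits, although the content is Japanese.
is_ascii_only = all(byte < 0x80 for byte in encoded)
```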

Thus, your suggestion should have been:

   Content-Type: Text/Plain; charset=iso-10646
   Content-Script-Language: zh (Chinese)

or

   Content-Type: Text/Plain; charset=iso-10646; charset-language=zh
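
As a rough illustration of the second form, here is a sketch of how a generic MIME parser would see it, using Python's standard email package. It assumes the two parameters are separated by a semicolon, as MIME parameter syntax requires. Note that "charset-language" is only the hypothetical parameter name from this message, not a registered MIME parameter, and "iso-10646" is used as written here.

```python
from email.message import Message

# Build a message carrying the hypothetical header from the text.
msg = Message()
msg["Content-Type"] = "Text/Plain; charset=iso-10646; charset-language=zh"

# A generic parser returns unknown parameters as plain strings; it does
# not interpret "charset-language" in any way.
content_type = msg.get_content_type()               # lowercased type/subtype
charset = msg.get_param("charset")
script_language = msg.get_param("charset-language")
```

A reader that does not know the extra parameter simply ignores it, so the script language would be lost unless the software is taught about it.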


But, then, how can you encode a content-script-language header within a header?

                                                Masataka Ohta
