ietf-822

Re: restrictions when defining charsets

1993-02-05 14:07:24
Steve Summit writes:

For two immediate examples, I would still think that "the
interpretation of each octet cannot be questioned" was intended
to rule out things like ISO-2022-JP, except for having read
suggestions on this list that it's really supposed to avoid
things like undifferentiated ISO-646.  I still have no idea what
"the number of representable characters is limited" is supposed
to accomplish.  (It probably rules out mnemonic encoding schemes,
but for what seem to me to be the wrong reasons.)

To rule out iso-2022-jp would indeed be a most unfortunate move,
as this is the predominant encoding in Japan for email.
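
To make concrete why iso-2022-jp gets caught by "the interpretation of each octet cannot be questioned", here is a small sketch using Python's built-in iso2022_jp codec. The sample string and the variable names are my own, chosen only for illustration. It shows that the payload octets are ordinary 7-bit values whose meaning depends entirely on the preceding escape sequence.

```python
# Encode a short Japanese string with Python's built-in ISO-2022-JP codec.
text = "日本語"
encoded = text.encode("iso2022_jp")

ESC_TO_JISX0208 = b"\x1b$B"  # ESC $ B: following octet pairs are JIS X 0208
ESC_TO_ASCII = b"\x1b(B"     # ESC ( B: following octets are ASCII again

# Strip the leading and trailing escape sequences to get the bare payload.
payload = encoded[len(ESC_TO_JISX0208):-len(ESC_TO_ASCII)]

# Every payload octet is a 7-bit printable value, so without the escape
# sequences the very same octets also decode as plain ASCII text.
as_ascii = payload.decode("ascii")

print(encoded)                               # escape sequences plus payload
print(as_ascii)                              # the payload misread as ASCII
print(encoded.decode("iso2022_jp") == text)  # round trip with the right label
```

Three characters become six payload octets, and only the shift state tells a reader whether to take them as pairs or as single ASCII characters.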

I agree that national ISO 646 versions should be properly labelled,
as should the different parts of ISO 8859.
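
The labelling problem can be sketched as follows. The tables are deliberately partial (only three of the nationally assigned positions, for three variants), and the names VARIANT_TABLES and interpret are hypothetical helpers of mine, not part of any library. The point is only that the same octets read differently depending on which variant the receiver assumes.

```python
# Partial, illustrative tables: three nationally assigned ISO 646 positions.
# Real variants differ in more positions than the ones listed here.
VARIANT_TABLES = {
    "US-ASCII": {0x5B: "[", 0x5C: "\\", 0x5D: "]"},
    "DS 2089 (Danish)": {0x5B: "Æ", 0x5C: "Ø", 0x5D: "Å"},
    "DIN 66003 (German)": {0x5B: "Ä", 0x5C: "Ö", 0x5D: "Ü"},
}


def interpret(octets: bytes, variant: str) -> str:
    """Map 7-bit octets to characters under the named variant (hypothetical helper).

    Positions missing from the partial table fall back to ASCII, which is a
    simplification that holds for the letters used in this example.
    """
    table = VARIANT_TABLES[variant]
    return "".join(table.get(b, chr(b)) for b in octets)


octets = b"K\\LN"  # four octets: 0x4B 0x5C 0x4C 0x4E
for variant in VARIANT_TABLES:
    print(variant, "->", interpret(octets, variant))
```

Without a label, the receiver cannot tell whether octet 0x5C was meant as a backslash, an Ø or an Ö.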

Mnemonic encodings as per RFC 1345 have a limited set of characters:
some 24,000, but they are limited!

I believe "the number of representable characters is limited"
is meant to say that the combining sequences of Unicode cannot be used
to generate characters.
Anyway, this is subtle with respect to ISO 10646, as the repertoire of
10646 is fixed: you cannot use combining characters to generate new
characters. You can use them to generate combining sequences, but those
are not characters! A matter of definition.
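
The distinction can be illustrated with a short sketch using Python's standard unicodedata module. That module reflects present-day Unicode data rather than any particular edition of ISO 10646, and the sample letters are my own choice. A base letter followed by a combining mark stays a sequence of two coded characters; it does not add a member to the repertoire.

```python
import unicodedata

# "q" followed by COMBINING TILDE: a combining sequence, not a new character.
sequence = "q\u0303"
print(len(sequence))                        # number of coded characters
print([unicodedata.name(c) for c in sequence])

# The tilde is a combining character (non-zero combining class).
is_combining = unicodedata.combining("\u0303") != 0

# Composition cannot shrink the sequence, because the repertoire holds no
# single coded character for q-with-tilde.
composed = unicodedata.normalize("NFC", sequence)
stays_a_sequence = composed == sequence

# By contrast, e + COMBINING ACUTE ACCENT has a counterpart that is already
# in the repertoire, so composition yields that existing character.
e_acute = unicodedata.normalize("NFC", "e\u0301")
print(is_combining, stays_a_sequence, len(e_acute))
```

In both cases the set of coded characters is the same before and after; the combining mark only builds sequences out of characters that already exist.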

I believe the sentence on "not a set of characters" alludes
to the ISO terminology, where a "character set" is a repertoire
(without an applicable encoding). This could be explained
further for people not acquainted with ISO character terminology.

Keld