ietf-822

Re: printable multibyte encodings

1992-12-19 17:07:05
Does [UTF-2] encode ISO 8859-1 characters as themselves?

No; it encodes ASCII characters as themselves, but the high-page characters
in 8859-1 would get two-byte encodings.
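To make the one-byte versus two-byte point concrete, here is a minimal sketch. It uses Python's built-in UTF-8 codec as a stand-in for UTF-2; treating the two as equivalent for these code points is an assumption of the example, and the helper name encoded_length is hypothetical.

```python
# Assumption: Python's "utf-8" codec stands in for UTF-2 here.
def encoded_length(ch: str) -> int:
    """Return the number of octets one character occupies once encoded."""
    return len(ch.encode("utf-8"))


ascii_char = "A"         # U+0041, plain ASCII
latin1_char = "\u00e9"   # U+00E9, e with acute accent, upper half of ISO 8859-1

print(encoded_length(ascii_char))      # 1
print(encoded_length(latin1_char))     # 2

# The same character is a single octet in ISO 8859-1 itself.
print(latin1_char.encode("latin-1"))   # b'\xe9'
print(latin1_char.encode("utf-8"))     # b'\xc3\xa9'
```

Under that assumption, every code point from 0xA0 through 0xFF comes out as two octets, while 0x00 through 0x7F stay at one.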

What we need is an encoding that is "file-system safe" and represents the
printable characters in ISO 8859-1 and a few of the most common control
characters as themselves (one octet apiece)...

Unfortunately, such an encoding would be fairly inefficient for texts using
significant numbers of non-8859-1 characters, because it leaves relatively
few codes available for encoding them.  If there is to be *one* encoding
for multipurpose use, it has to avoid paying major penalties like that.

I can also see political arguments against such an encoding.  You'd surely
end up with not one encoding, but a family of them, one for each 8859-x
character set, because everybody would want their own set of printable
characters represented in one byte.  8859-1 is not the only 8-bit code
in wide use.  If the objective is to have one character set and one
standard encoding, it can't be too blatantly biased.  (Of course, it's
easy for an English-speaking North American to say this, since it won't
hurt *me*... but I think there's a valid argument there all the same.)
