ietf-822

Re: character sets

1991-04-29 00:30:15
Keld writes:
Well, could it not be allowed to display an approximation of the
character missing in the current charset, for example displaying
a c-cedilla in ASCII as "c," and then marking it as an approximation?

Sure, I have nothing against such a readable encoding. For example, we
might add a new Content-Encoding called Quoted-Readable, and use & as
the escape character, not \ since \ is an ISO 646 variant character.
And then we drop the & from the Quoted-Printable encoding, i.e. use
only \.
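
To make the idea concrete, here is a minimal sketch in Python of what a
Quoted-Readable encoder might look like. Nothing in this thread fixes the
actual format, so the two-character mnemonic table, the function name
quoted_readable_encode, and the rule that a literal & is written as && are
all assumptions made for illustration.

```python
# Hypothetical Quoted-Readable encoder: "&" introduces a two-character
# mnemonic. The table and the "&&" rule are illustrative assumptions,
# not part of any specification.
MNEMONICS = {
    "ç": "c,",   # c-cedilla, displayed as "c," as suggested above
    "é": "e'",
    "ä": "a:",
    "ø": "o/",
}


def quoted_readable_encode(text: str) -> str:
    """Return an ASCII-only, human-readable rendering of `text`."""
    out = []
    for ch in text:
        if ch == "&":
            out.append("&&")                  # escape the escape character
        elif ord(ch) < 128:
            out.append(ch)                    # plain ASCII passes through
        elif ch in MNEMONICS:
            out.append("&" + MNEMONICS[ch])   # readable approximation
        else:
            raise ValueError(f"no mnemonic for {ch!r}")
    return "".join(out)


print(quoted_readable_encode("garçon & café"))
```

A reader without Latin-1 support sees "gar&c,on && caf&e'", which is still
legible, and the & marks each approximation as such.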

But we should remember to keep the number of Content-Encodings small,
e.g. Quoted-Printable, Quoted-Readable and BASE85 *only*.

Quoted-Readable would be very good for Latin-1, but not for, say,
Japanese in 10646. For the latter, you might use Quoted-Printable if
there is a lot of interspersed ASCII, otherwise BASE85, since you
can't read it anyway.
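
The rule of thumb above could be sketched roughly as follows. The function
name choose_encoding, the 50% threshold for "a lot of interspersed ASCII",
and the use of code points below 256 as a stand-in for "Latin-1" are all
assumptions for illustration; the thread does not define any of them.

```python
def choose_encoding(text: str, ascii_threshold: float = 0.5) -> str:
    """Pick one of the three proposed encodings for `text`.

    Hypothetical heuristic: the threshold value is an assumption.
    """
    # Everything fits in Latin-1: the readable form is worthwhile.
    if all(ord(ch) < 256 for ch in text):
        return "Quoted-Readable"
    # Otherwise decide by how much of the text is plain ASCII.
    ascii_share = sum(ord(ch) < 128 for ch in text) / len(text)
    if ascii_share >= ascii_threshold:
        return "Quoted-Printable"
    return "BASE85"


print(choose_encoding("garçon"))
print(choose_encoding("See the README for 日本"))
print(choose_encoding("日本語のテキスト"))
```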

Note that I'm talking about the encoding that is carried over the
Internet, not necessarily what is displayed for the user, though,
clearly, the Quoted-Readable encoding's main purpose is to be
displayable. If the Latin-1 people in Europe form a large enclave,
then they could install gateways that automatically convert incoming
Quoted-Readable Latin-1 to straight Latin-1.
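
The conversion step in such a gateway might look something like the sketch
below. It assumes a hypothetical Quoted-Readable format in which & is
followed by a two-character mnemonic and && stands for a literal &; the
mnemonic table and the name quoted_readable_to_latin1 are made up for
illustration only.

```python
# Hypothetical mnemonic table (an assumption, not a specification).
MNEMONICS = {"c,": "ç", "e'": "é", "a:": "ä", "o/": "ø"}


def quoted_readable_to_latin1(encoded: str) -> bytes:
    """Convert hypothetical Quoted-Readable text to raw Latin-1 bytes."""
    out = []
    i = 0
    while i < len(encoded):
        if encoded[i] != "&":
            out.append(encoded[i])            # ordinary character
            i += 1
        elif encoded.startswith("&&", i):
            out.append("&")                   # escaped literal ampersand
            i += 2
        else:
            mnemonic = encoded[i + 1:i + 3]   # two characters after "&"
            if mnemonic not in MNEMONICS:
                raise ValueError(f"unknown escape at offset {i}: &{mnemonic}")
            out.append(MNEMONICS[mnemonic])
            i += 3
    return "".join(out).encode("latin-1")


print(quoted_readable_to_latin1("gar&c,on && caf&e'"))
```

Sites outside the enclave would simply pass the message through unchanged
and still show something readable.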


The missing characters could also be coded in the private use zones
of 10646. 10646 already has the mechanisms, so why not use them?

If we use a 10646 private zone for the unconvertible characters, what
happens when the next version of 10646 comes out and it contains the
previously unconvertible characters?

I think I would prefer to keep the unconvertible characters in a
separate codespace, i.e. their own codeset.


Erik

