ietf-822

Re: character sets

1991-04-29 22:54:27
There are ugly workarounds, but my guess is
that if 10646 is anointed the Internet Mail character set, the number
of lossy translations will be relatively minuscule.

Losing info is OK?


Internally, I can keep
the mail in x-unicode, or x-prime-character-set, and when it is
released to the net, it gets converted into the canonical Internet
character set and x-unicode gets replaced with "text".

OK. Then the next question would be: What *is* the canonical Internet
character set? (The following are in no particular order.)

1. ISO 2022, simplified, i.e. something like the Japanese usage. Messages
   start in Latin-1, with ASCII only in the headers. An ESC sequence
   switches to other character sets.

2. Unicode. Messages start in Latin-1, with headers in ASCII, some sort
   of escape sequence to switch to Unicode.

3. 10646, compaction method 5, code extension level 1. Messages start in
   Latin-1, with ASCII only in the headers. HOP to switch to other
   character subsets.
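
To make the escape-sequence switching in option 1 concrete, here is a
rough sketch. It uses Python's built-in "iso2022_jp" codec as a stand-in
for the Japanese usage mentioned above. That codec starts in ASCII, not
Latin-1, so it only illustrates the mechanism and is not the simplified
2022 proposed here. The function name is made up for the example.

```python
# Illustration only: Python's "iso2022_jp" codec stands in for the
# escape-switched scheme of option 1. Unlike the proposal, it starts
# in ASCII rather than Latin-1.

def encode_body(text):
    """Encode text, emitting ESC sequences only where a switch is needed."""
    return text.encode("iso2022_jp")

# Pure ASCII text needs no escape sequence at all.
ascii_only = encode_body("Hello")

# "\u65e5\u672c" is the two-character word for Japan. The encoder
# switches to JIS X 0208 with ESC $ B and back to ASCII with ESC ( B.
mixed = encode_body("Hello \u65e5\u672c")

print(ascii_only)
print(mixed)
```

An ASCII-only message comes out byte-for-byte unchanged, which is the
compatibility property argued for below.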

I prefer no. 1, the simplified ISO 2022. Why? Because it is compatible
with the current ASCII, Latin-1 and Japanese messages, i.e. the
majority of the world's RFC822 messages. Messages without Content-Type
headers comply with the new RFC if the default Content-Type is "text",
and the default character encoding is the simplified 2022. Messages
without Content-Encoding headers comply with the new RFC because the
default is NO encoding.
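
The defaulting rule can be sketched roughly as follows. The header names
come from the text above; the charset label "simplified-2022" and the
function name are placeholders I made up, since no exact tokens have
been agreed on.

```python
from email.parser import HeaderParser

# Hypothetical labels: the discussion does not fix exact tokens.
DEFAULT_TYPE = "text"
DEFAULT_CHARSET = "simplified-2022"
DEFAULT_ENCODING = None  # None means no encoding was applied


def effective_settings(raw_headers):
    """Return (content_type, charset, encoding) with defaults filled in."""
    msg = HeaderParser().parsestr(raw_headers)
    content_type = msg.get("Content-Type")
    if content_type is None:
        # Old-style message: fall back to text in the default character set.
        content_type, charset = DEFAULT_TYPE, DEFAULT_CHARSET
    else:
        # How an explicit header would name a charset is left open here.
        charset = None
    encoding = msg.get("Content-Encoding", DEFAULT_ENCODING)
    return content_type, charset, encoding


legacy = "From: erik@example.org\nSubject: character sets\n\n"
print(effective_settings(legacy))
```

A message with neither header gets the defaults, so existing mail needs
no changes to comply.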

Some people may react to this, saying that 2022 is not really
compatible with ASCII, and that RFC822-compliant programs are allowed
to interpret the data as strict ASCII. My answer is that the problems
this causes are sufficiently infrequent (near nil) that we may
ignore them.

There may also be the complaint that ISO 2022 allows for a non-fixed
number of character sets, since new sets are registered with ISO every
now and then. My answer to this is that we specify in the RFC which
character sets are allowed. A later RFC may update this.
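
As a sketch of what such a restriction could look like in a receiving
program: scan the body for ISO 2022 escape sequences and flag any that
are not on the allowed list. The list below (the four designations used
in Japanese mail) is only an example I picked, not a proposal for what
the RFC should contain, and the names are made up.

```python
import re

# General ISO 2022 escape sequence shape: ESC, zero or more intermediate
# bytes (0x20-0x2F), then one final byte (0x30-0x7E).
ESCAPE_SEQUENCE = re.compile(rb"\x1b[\x20-\x2f]*[\x30-\x7e]")

# Example allow-list only; the real list would be whatever the RFC names.
ALLOWED_DESIGNATIONS = {
    b"\x1b(B",  # ASCII
    b"\x1b(J",  # JIS X 0201 Roman
    b"\x1b$@",  # JIS C 6226-1978
    b"\x1b$B",  # JIS X 0208-1983
}


def disallowed_escapes(body):
    """Return every escape sequence in body that is not on the allow-list."""
    found = (m.group(0) for m in ESCAPE_SEQUENCE.finditer(body))
    return [seq for seq in found if seq not in ALLOWED_DESIGNATIONS]


ok_body = b"Hello \x1b$BF|K\\\x1b(B"
bad_body = b"Hello \x1b$(D\x22\x2f\x1b(B"  # ESC $ ( D is not on the list

print(disallowed_escapes(ok_body))
print(disallowed_escapes(bad_body))
```

A later RFC that permits more character sets would then only need to
extend the list.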

Comments?


Regards,
Erik

