
Re: Prohibition of EBCDIC in text/plain

1995-06-08 13:42:41
At 2:34 PM 6/8/95, Patrik Faltstrom wrote:
At 09.35 95-06-08, Harald(_dot_)T(_dot_)Alvestrand(_at_)uninett(_dot_)no wrote:
This forbids, among others, ISO 10646 UCS-2 and EBCDIC as text/plain
character sets.

I assumed that what is called UCS-2 is the same thing that
"The Unicode Standard, Version 1.1, Appendix F" calls
FSS-UTF, i.e. Filesystem Safe UCS Transformation Format.

If this is true, I read the encoding rules to mean that all
characters 0x00 to 0x7F are encoded as themselves (as one-byte
characters) and that all other characters are encoded as
two-, three-, four-, five- or six-byte sequences. All the bytes
in the multibyte sequences have their 8th bit set.

It seems to me that text in this encoding can actually be sent
as a text/plain message, because CR, LF and NUL are encoded
as themselves and those bit patterns do not occur in the
multibyte encoding of the other characters.
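
The property described above can be checked with a short Python sketch. It relies on Python's built-in "utf-8" codec, which implements the current form of UTF-8 (at most four bytes per character) rather than the six-byte FSS-UTF of Unicode 1.1; the function name and the sample string are illustrative assumptions, not part of the original message.

```python
def utf8_keeps_ascii_distinct(text: str) -> bool:
    """Check two things for every character in text:
    - ASCII characters (below 0x80) encode to themselves as one byte;
    - all other characters encode only to bytes with the high bit set.
    """
    for ch in text:
        encoded = ch.encode("utf-8")
        if ord(ch) < 0x80:
            if encoded != bytes([ord(ch)]):
                return False
        elif any(b < 0x80 for b in encoded):
            return False
    return True


# Hypothetical sample: CR, LF, NUL mixed with Latin-1 and CJK characters.
sample = "line one\r\nna\u00efve \u65e5\u672c\u8a9e\x00"
print(utf8_keeps_ascii_distinct(sample))
```

Because no byte of a multibyte sequence falls below 0x80, a CR, LF or NUL byte in the stream can only ever mean the character itself.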

If I am wrong, please let me know.

No, UCS-2 is the 16-bit form of Unicode. FSS-UTF is now called UTF-8, and
it's an official annex to ISO 10646. You are correct that UTF-8 is compatible
with the MIME text/plain content type. So is UTF-7 (see RFC 1642). However,
straight Unicode (UCS-2), EBCDIC, and other character sets that do not
contain US-ASCII as a subset are not compatible, unfortunately.
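
As a rough illustration of the difference, the Python sketch below encodes the same short ASCII text with several codecs and compares the bytes with the US-ASCII encoding. Two assumptions are mine: "utf-16-be" stands in for UCS-2 (the two agree for characters in the 16-bit range), and "cp037" stands in for EBCDIC, which exists in many code page variants. The helper name is hypothetical.

```python
def encodes_ascii_as_itself(codec: str, text: str = "Hi\r\n") -> bool:
    """Return True if the codec produces the same bytes as US-ASCII
    for a text consisting only of US-ASCII characters."""
    return text.encode(codec) == text.encode("ascii")


# Assumed stand-ins: utf-16-be for UCS-2, cp037 for one EBCDIC variant.
for codec in ("utf-8", "utf-16-be", "cp037"):
    print(codec, encodes_ascii_as_itself(codec))

# UCS-2 pads every ASCII character with a zero byte, so NUL bytes
# appear throughout ordinary text.
ucs2 = "Hi\r\n".encode("utf-16-be")
print(ucs2.count(0))

# UTF-7 output stays within 7 bits even for non-ASCII input.
utf7 = "na\u00efve".encode("utf-7")
print(all(b < 0x80 for b in utf7))
```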

David Goldsmith
Senior Scientist
Taligent, Inc.
10201 N. DeAnza Blvd.
Cupertino, CA  95014-2233