------- Blind-Carbon-Copy
Erik, while I personally prefer this approach for today's interchange,
I do not support it as an internal processing encoding. It is only valid
as a mail interchange mechanism.
Also, as a goal, I think that we should try to standardize 8-bit mail
rather than adjusting the standard to fit your conclusion that 7-bit is here
forever.
Frank,
---------------------------------- more comments -----------------------------
Multilingual Character Encoding for Internet Messages
Erik M. van der Poel, SRA
January 31, 1992
* Abstract
This document describes a multilingual character encoding for use in
Internet messages. This encoding is designed to be highly compatible
with existing electronic mail and network news handling software.
Erik, I have not gone through this in detail, but I wanted to just let
you know that the latest release of AIX (3.2) will provide this capability
via the converters. Effectively we provided a conversion from any
code set supported in a locale to a canonical format which we
named "fold7". As such, "ISO8859-1" to "fold7" will convert the Latin-1
characters into a 7 bit encoding using ISO 2022 escape sequences. For
both SJIS and eucJP, we follow the JUNET convention.
This is exactly what I've been proposing within the OSF SIG for OSF to
provide in their mail service, i.e., modify either sendmail or the
mailers themselves to use iconv to do this type of conversion. This
will solve the interoperability of mail between systems with different
code sets internally.
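
To make the idea concrete, here is a minimal sketch of a mailer-side
conversion between an internal code set and a 7-bit interchange form.
It is not the AIX "fold7" converter and it does not call iconv: Python's
built-in codecs stand in for the converters, and the "iso2022_jp" codec is
assumed to be a reasonable stand-in for the JUNET convention. The function
names are hypothetical.

```python
# Sketch only: Python codecs stand in for iconv-style converters.
# "iso2022_jp" is assumed here to approximate the JUNET convention.

def to_interchange(raw: bytes, internal_codeset: str) -> bytes:
    """Convert text held in a local code set to a 7-bit wire form."""
    text = raw.decode(internal_codeset)
    wire = text.encode("iso2022_jp")
    # Every byte on the wire must fit in 7 bits.
    if any(b >= 0x80 for b in wire):
        raise ValueError("wire form is not 7-bit clean")
    return wire


def from_interchange(wire: bytes, internal_codeset: str) -> bytes:
    """Convert the 7-bit wire form back to the receiver's code set."""
    return wire.decode("iso2022_jp").encode(internal_codeset)


# A sender that uses SJIS internally and a receiver that uses eucJP.
sjis_message = "日本語".encode("shift_jis")
wire_message = to_interchange(sjis_message, "shift_jis")
euc_message = from_interchange(wire_message, "euc_jp")
```

The two systems never need to know each other's internal code set; each
only converts to and from the common wire form.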
We don't do mnemonics as described in your paper.
*** Full ISO 2022
ISO 2022 has mechanisms for encoding text in either 7 or 8 bits.
Taking the 7-bit subset, then, may seem to be a feasible approach, but
ISO 2022 has very many different ways of encoding the same
information. This is rather complex and therefore likely to be
implemented wrongly. So full ISO 2022 is rejected.
Not when you combine it with the Compound Text rules. The problem with
CT is that it is restricted to graphic characters only. Control characters
cannot be encoded using CT. As such, "fold7" uses the rules of CT but
is not limited to graphic characters.
*** Compound Text
Compound Text [CTEXT] is an MIT X Consortium standard intended to be
used in inter-application communications. It uses a subset of ISO
2022, and is therefore relatively simple, but it sets the 8th bit.
Not necessarily. It does allow either 7 or 8 bits.
I think you are thinking of implementations of CT that only use 8 bits.
Yet the spec does allow 7 bits.
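
To illustrate the difference between the 8-bit and 7-bit forms of the same
text, here is a sketch in plain ISO 2022 terms. It designates the right
half of ISO 8859-1 as G1 (ESC - A) and uses the locking shifts SO and SI.
Whether CT or "fold7" emits exactly these sequences is not confirmed here;
the function names are hypothetical, and C1 controls, ESC, SO and SI in the
input are simply rejected.

```python
# Sketch only: a 7-bit folding of Latin-1 using ISO 2022 locking shifts.
ESC_G1_LATIN1 = b"\x1b-A"   # designate ISO 8859-1 right half as G1
SO, SI = 0x0E, 0x0F         # shift out (use G1) / shift in (use G0)


def fold_latin1_to_7bit(text: str) -> bytes:
    out = bytearray(ESC_G1_LATIN1)
    shifted = False
    for byte in text.encode("latin-1"):
        if byte in (0x1B, SO, SI) or 0x80 <= byte < 0xA0:
            raise ValueError("control byte not handled in this sketch")
        high = byte >= 0xA0
        if high != shifted:          # change shift state only when needed
            out.append(SO if high else SI)
            shifted = high
        out.append(byte & 0x7F)      # drop the 8th bit
    if shifted:
        out.append(SI)               # always end in the G0 (ASCII) state
    return bytes(out)


def unfold_7bit_to_latin1(data: bytes) -> str:
    if not data.startswith(ESC_G1_LATIN1):
        raise ValueError("missing G1 designation")
    out = bytearray()
    shifted = False
    for byte in data[len(ESC_G1_LATIN1):]:
        if byte == SO:
            shifted = True
        elif byte == SI:
            shifted = False
        else:
            out.append(byte | 0x80 if shifted else byte)
    return out.decode("latin-1")


folded = fold_latin1_to_7bit("café")
```

In the 8-bit form the "é" is the single byte 0xE9; in the 7-bit form above
it becomes SO, 0x69, SI.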
* Conformance
Implementations that are claimed to conform to this standard need not
be able to display all of the character sets specified above, but they
must be able to parse this multilingual encoding to the extent of
being able to discriminate between character sets that the
implementation can and cannot display. That is, all displayable
portions must be displayed. Non-displayable portions should be "shown"
to the user in some fashion, unspecified by this standard. (One
possibility is to simply say e.g. "Undisplayable Greek appeared
here".)
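
The parsing requirement above can be sketched as follows. This is only an
illustration: the table of escape sequences is an assumption (the G0
designations commonly associated with ISO-2022-JP style encodings), the
function names are hypothetical, and the placeholder wording is just one
possibility.

```python
import re

# Assumed designation table; the real list comes from the proposal itself.
DESIGNATIONS = {
    b"\x1b(B": "ASCII",
    b"\x1b$B": "JIS X 0208",
    b"\x1b$A": "GB 2312",
    b"\x1b$(C": "KS C 5601",
}
ESCAPE_RE = re.compile(b"(" + b"|".join(re.escape(e) for e in DESIGNATIONS) + b")")


def split_segments(data: bytes):
    """Split a 7-bit stream into (character set name, raw bytes) pairs."""
    segments = []
    charset = "ASCII"                      # assumed initial state
    for part in ESCAPE_RE.split(data):
        if part in DESIGNATIONS:
            charset = DESIGNATIONS[part]   # escape sequence: switch set
        elif part:
            segments.append((charset, part))
    return segments


def render(data: bytes, displayable=("ASCII", "JIS X 0208")) -> str:
    """Show displayable portions; replace the rest with a placeholder."""
    out = []
    for charset, raw in split_segments(data):
        if charset not in displayable:
            out.append(f"[Undisplayable {charset} appeared here]")
        elif charset == "ASCII":
            out.append(raw.decode("ascii"))
        else:
            # Re-wrap the JIS X 0208 bytes so the stock codec can decode them.
            out.append((b"\x1b$B" + raw + b"\x1b(B").decode("iso2022_jp"))
    return "".join(out)


message = b"Hello " + "日本語".encode("iso2022_jp") + b" and \x1b$A1>\x1b(B."
shown = render(message)
```

The implementation only has to recognize where each character set starts
and ends; it does not have to understand the bytes of a set it cannot show.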
Display of the characters should not be considered in this standard.
It should just address the interchange of characters.
* Appendix - Processing Code
Strictly speaking, this appendix is not a part of this standard.
Since this is a mail interchange proposal, this appendix should definitely be removed.
Frank Rojas