On May 9, 11:26pm, Einar Stefferud wrote:
Subject: Re: EBCDIC & uuencode/uudecode
% To me, our objective is to make our mail systems reach out as far as
% possible without making trouble for anyone. This means that we must
% not take a provincial position with regard to "INTERNET USASCII IS ALL
% WE CARE ABOUT, AND TO HELL WITH THE REST OF THE WORLD!"
I think that there is clear consensus that US ASCII is not sufficient
for text mail messages. It is however supported by the current
installed base of RFC 822 compliant sites. This makes it a clear
common starting point. It is equally important to not take the
provincial position that "Western languages are all we care about and
to **** with the rest of the world."
% It seems clear to me, and I suggest that we conclude that we have
% consensus on this, that we need to use a base64 encoding that will
% pass without any translation ambiguity trouble through as many limited
% and troublesome character encodings as possible.
I agree that anyone devising such an encoding algorithm should make it
as portable as is practical. I disagree, however, with the idea that
the existing uuencode/uudecode should be changed in such a manner.
% The list of "character set environments" that we want to pass unscathed
% includes (USASCII, other non-USASCII 7bit codes, EBCDIC, 8859/n, and
% Printable String). Are there any others? Let's agree on this list!
I think that there are two or three questions here that we need to keep
clearly separated:
1) which character set encodings should be supported/required
for text mail traffic ?
2) should a replacement for uuencode/uudecode be devised and
included in the replacement for RFC 822 ?
3) if the answer to 2 is yes, which glyphs should be used in
the new scheme ?
My views follow:
1) I am not persuaded that explicit support for text mail traffic
using "other non-USASCII 7bit codes" belongs in the RFC, especially
since these (the ISO 646 family) have been SUPERSEDED by 8-bit
character set standards (the ISO 8859 family). It would be very
simple for a local gateway to handle local 7bit <--> ISO 8859/n
conversion if the only local need were the 7bit character set (which
is implicit in the suggestion that the 7bit sets be part of the RFC).
With regard to text mail traffic, if the network uses a character set
that is a superset (ISO 8859 family) of the local implementation (ISO
646 family), then everything possible using the local implementation
is possible across the network. Already the ISO 8859/1 "ISO Latin-1"
character encoding is in widespread use in Western Europe, replacing
former use of several ISO 646 variants. The rate of this conversion
is still accelerating.
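The local 7bit <--> ISO 8859/n gateway conversion described above can be
sketched roughly as follows. This is only an illustration: the table is a
hypothetical, partial mapping for a Swedish-style ISO 646 variant, and the
function names are invented for the example. A real gateway would need the
complete table for whichever national variant the site actually uses.

```python
# Hypothetical partial table for a Swedish-style ISO 646 variant:
# national-use code positions mapped to the letters they display as.
# Illustrative assumption only, not a complete or authoritative table.
NATIONAL_TO_UNICODE = {
    0x5B: "Ä", 0x5C: "Ö", 0x5D: "Å",
    0x7B: "ä", 0x7C: "ö", 0x7D: "å",
}
UNICODE_TO_NATIONAL = {ch: code for code, ch in NATIONAL_TO_UNICODE.items()}


def local_to_latin1(data: bytes) -> bytes:
    """Outbound direction: local 7-bit text -> ISO 8859-1 octets."""
    if any(b > 0x7F for b in data):
        raise ValueError("local text must be 7-bit")
    text = "".join(NATIONAL_TO_UNICODE.get(b, chr(b)) for b in data)
    return text.encode("iso-8859-1")


def latin1_to_local(data: bytes) -> bytes:
    """Inbound direction: ISO 8859-1 octets -> local 7-bit text.

    Raises ValueError for characters the local 7-bit set cannot represent.
    """
    out = bytearray()
    for ch in data.decode("iso-8859-1"):
        if ch in UNICODE_TO_NATIONAL:
            out.append(UNICODE_TO_NATIONAL[ch])
        elif ord(ch) < 0x80 and ord(ch) not in NATIONAL_TO_UNICODE:
            out.append(ord(ch))
        else:
            raise ValueError(f"no 7-bit representation for {ch!r}")
    return bytes(out)
```

The outbound direction never fails, which is the superset argument in
practice; only the inbound direction can meet a character that the local
7-bit set lacks.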
Similarly, I am opposed to the RFC requiring support for text
mail messages encoded in any dialect of EBCDIC. EBCDIC is more
restrictive than the ISO 8859 family, so text mail traffic from
EBCDIC sites can easily be encoded for the network into either
ISO 8859/n or US ASCII, both of which need to be supported in the RFC.
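As a rough sketch of that re-encoding at an EBCDIC site's boundary, the
following uses Python's built-in "cp037" codec as a stand-in for the local
EBCDIC dialect. The function name and the choice of cp037 are assumptions
made for the example; other dialects would need their own codec.

```python
from typing import Tuple


def ebcdic_to_network(data: bytes, local_codec: str = "cp037") -> Tuple[str, bytes]:
    """Re-encode local EBCDIC text for transmission.

    Returns (charset label, octets). US ASCII is preferred when the text
    fits in it; otherwise ISO 8859-1 is used. The default codec is an
    assumption standing in for whatever dialect the site really uses.
    """
    text = data.decode(local_codec)
    try:
        return "US-ASCII", text.encode("ascii")
    except UnicodeEncodeError:
        return "ISO-8859-1", text.encode("iso-8859-1")


# Example: the cp037 octets for "Hello" come out as plain US ASCII.
label, octets = ebcdic_to_network(b"\xc8\x85\x93\x93\x96")
```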
It is clear that supporting non-Western languages requires ISO 10646
support. I think that it is important to support such languages.
2) I don't have strong feelings either way. I suspect that any scheme
that is devised will not be widely available for some time; on the
other hand, it might be useful over the long term.
3) I agree that, if a scheme is to be devised, the intersection
of the ISO invariant glyphs and the EBCDIC invariant glyphs is
a reasonable set to work with. Note the distinction between glyphs
and encodings here. How many such glyphs are there ?
Randall Atkinson
randall(_at_)Virginia(_dot_)EDU