ietf-822

Re: closure & character sets

1991-12-30 16:44:32
  I consider the inclusion of label values specifying each of the
ISO-8859-X character sets (for each value of X defined by ISO) to be
essential for interoperability of multilingual messages.  These
standards are widely implemented (DEC & HP have been shipping ISO-8859
terminals for years now and other vendors also are supporting many of
the ISO-8859 family).  If we don't formally specify ISO-8859,
interoperability will suffer needlessly.  This is much less
experimental than most of the other material in RFC-XXXX and should be
included.  Note that I would be very unhappy if we included formal
specifications for the ISO-646-N family of character sets because they
have been superseded by ISO-8859 and the inclusion of ISO-646-N labels
would encourage their use in lieu of ISO-8859-X and would reduce
interoperability.

I am in complete agreement with all this.

  Mark & Erik have made it quite clear that there is in fact a lot of
implementation experience with the 'iso-2022-jp' scheme.  It is clear
to me and others who are interested in CJK computing that some form of
CJK support is essential, so I support the notion of defining the
'iso-2022-jp' label and providing a reference to the JUNET document
that Erik cited in an earlier message and indicating that a future RFC
will try to provide the essential implementation details in English.
This is to say that the label should be defined, but that
implementation of 'iso-2022-jp' (i.e. being able to display the glyphs
or transliterate them into some alphabetic representation if
necessary) should NOT be made mandatory for conformance to RFC-XXXX.

I don't see how we could possibly make it mandatory to be able to display
the glyphs from iso-2022-jp. I also would resist the addition of any
such requirement. It is also not required that the glyphs in 8859-n be
displayable.
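To illustrate the distinction between recognising a label and being able to render its glyphs, here is a minimal sketch in Python. The function name `decode_labelled` and the fallback policy are assumptions made for illustration, not anything specified in RFC-XXXX; the sketch relies only on Python's standard codec registry, which happens to know the ISO-8859 family and `iso-2022-jp`.

```python
import codecs
from typing import Tuple


def decode_labelled(body: bytes, label: str) -> Tuple[str, bool]:
    """Decode a message body according to its character set label.

    Returns (text, recognised). An unrecognised label is not an error:
    the body is decoded as ASCII with undecodable bytes replaced, so the
    message is still readable in part. This policy is an assumption.
    """
    try:
        codec = codecs.lookup(label)
    except LookupError:
        # Unknown label: degrade gracefully instead of rejecting the message.
        return body.decode("ascii", errors="replace"), False
    # Known label: decode, replacing any malformed byte sequences.
    return body.decode(codec.name, errors="replace"), True


text, recognised = decode_labelled(b"caf\xe9", "iso-8859-1")
print(text, recognised)
```

Whether the resulting text can actually be displayed is a separate question that depends on the terminal or fonts at hand, which is the reason display should not be a conformance requirement.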

  I agree that it is premature to standardise ISO-10646 and its
representation.  It is however highly desirable to continue to include
"rationale" text in the RFC indicating that future support for a
universal character set such as ISO-10646 is highly desirable given
the multilingual and multinational nature of the current Internet.  It
is also wise to continue to include "rationale" text strongly
discouraging the adoption of other private character sets.  The
proliferation of many character sets in Internet mail would impede
interoperability.  Specification of the use of ISO-10646 in Internet
Email should be deferred to another RFC and should not be undertaken
until after ISO-10646 is finally approved by the ISO.

I agree with this, provided that the IAB will let us get away with such
wording. My current understanding is that they will not, but I have not
heard this definitively.

Given the choice between no MIME and MIME without indirect references to
10646, I'll take the latter. We can always reserve the name with IANA and
write a separate informational RFC to describe the planned direction we want
to take.

  While Keld and I have had our disagreements, I respect his basic
command of the facts and think that RFC-CHAR (as it is called) is
reasonably complete.  For example, it appears to fully support
Vietnamese (unlike ISO 1st DIS 10646).  There are problems
in the handling of CJ ideograms (e.g.: differing pronunciations of the
same character and the very high incidence of homonyms make phonetic
representation difficult or impossible; a huge lookup table would be
required to implement RFC-CHAR on a system using anything other than
ISO 2DIS 10646 if the CJK ideograms are supported), but those appear
to be inherent in any alphabetic encoding of such ideograms.

Agree.

  It isn't clear to me that RFC-CHAR should be made an Internet Standard,
but it clearly should be published as it is very useful within at least
European languages and perhaps for all alphabetic languages.

It is not clear to me either. Apart from everything else, RFC-CHAR references
10646 directly and it is definitely true that this is not acceptable in a
document that plans to go standard-track right away.

However, I have a more fundamental problem, which is that RFC-CHAR is not
really a standard! It is a fantastic tabulation of various character sets plus
a way to represent the characters in all of them using a common neutral
format that's representable in virtually any variant of ASCII or EBCDIC.

RFC-CHAR doesn't say a single word about how any of this data is supposed to
be used. As such, it isn't too bad for people like me, who have specific
goals in mind, but what does it mean to have it as a standard? What constitutes
conformance to it? And when you get right down to it, what does it mean
to use RFC-CHAR to convert from one character set to another? (Anyone who
thinks this is obvious from the provided tables ought to try implementing
a converter. The things you have to do are far from obvious.)
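As a small illustration of why conversion is not obvious, here is a sketch in Python that converts between two character sets through a common intermediate form. Python's Unicode strings stand in for the neutral format here; that is an assumption made for the example and is not RFC-CHAR's representation. The function name `convert` and its `on_unmappable` parameter are likewise hypothetical.

```python
def convert(data: bytes, src: str, dst: str, on_unmappable: str = "strict") -> bytes:
    """Convert bytes from character set src to dst via an intermediate form.

    on_unmappable is passed to the encoder as its error policy:
    "strict" raises an error, "replace" substitutes "?", and
    "ignore" silently drops the character. The tables alone do not
    say which of these is right; the caller has to decide.
    """
    neutral = data.decode(src)  # source bytes -> intermediate form
    return neutral.encode(dst, errors=on_unmappable)  # intermediate -> target


# e-acute exists in both ISO-8859-1 and ISO-8859-2, so it converts cleanly.
print(convert(b"\xe9", "iso-8859-1", "iso-8859-2"))

# n-tilde exists in ISO-8859-1 but not in ISO-8859-2, so a policy is needed.
print(convert(b"\xf1", "iso-8859-1", "iso-8859-2", on_unmappable="replace"))
```

Even this toy version forces a decision about characters that have no counterpart in the target set, and that decision is exactly the kind of thing the tables leave unstated.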

To my mind RFC-CHAR is more like the lists of various Internet things that
are published as RFCs periodically. How such a thing is handled as a standard
is not at all clear, and wiser heads than mine need to deal with this issue,
and soon. I for one need this functionality and don't want to lose its
capabilities, regardless of whether or not they are formalized.

SUMMARY:

  Omission of ISO-8859-N support would be a ship-stopper.

Strongly agree.

  Prohibition of ISO-2022-JP support would be a ship-stopper.

Agreed.

  RFC-CHAR should be published at least informationally.

Strongly agree.

  ISO-10646 should be mentioned in rationale text with 
    specification and non-experimental use deferred to an RFC
    written after it becomes final.

If it is possible to do this I agree. If it is not possible I strongly
recommend moving this text to a separate informational RFC ASAP.

  Use of all other character sets should be explicitly discouraged.

Agreed.

  The namespace other than beginning with "X-" should be reserved
    for future use by Internet Standards & RFCs.

I don't have strong feelings about this stuff in any case. Others might,
however.
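For concreteness, the proposed split of the label namespace could be sketched as follows in Python. The set `DEFINED_LABELS` and the function `classify_label` are hypothetical and purely illustrative; the actual list of defined labels is whatever the RFC ends up specifying.

```python
# Hypothetical set of labels a standard might define; illustrative only.
DEFINED_LABELS = {"us-ascii", "iso-2022-jp"} | {
    f"iso-8859-{n}" for n in range(1, 10)
}


def classify_label(label: str) -> str:
    """Classify a character set label as 'private', 'defined' or 'reserved'."""
    name = label.strip().lower()  # treat labels as case-insensitive here
    if name.startswith("x-"):
        return "private"  # private-use namespace
    if name in DEFINED_LABELS:
        return "defined"  # explicitly specified by the standard
    return "reserved"  # held back for future Internet Standards and RFCs


for label in ("ISO-8859-2", "X-Klingon", "iso-10646"):
    print(label, classify_label(label))
```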

Wow, it must be the season -- Ran and I seem to agree on every point ;-)

                                        Ned



