Dave writes, in part...
The status of 8859 is not clear, from the References section of
RFC XXXX.
If this is so, then the References section needs to be improved. The
"status of 8859" is as follows. "ISO 8859" describes a family of roughly
a dozen International Standards which use a common structuring model and
construction rules. The specific Standards are known as, e.g., ISO
8859-1, ISO 8859-2, etc.
8859-1 (Latin alphabet 1), 8859-2 (Latin alphabet 2), 8859-6 (Latin/
Arabic), and 8859-7 (Latin/Greek) became ISO International Standards in
1987. Other members of the family came later, with the most recent one
being published (that is, after the Standard became final) within the
last six months. There is, as my earlier note suggested, extensive
experience with 8859-1, including hardware implementations in very
large-volume products. To pretend otherwise is, IMHO, not a sign of
care and conservatism about standardizing the untried, but evidence
either of unwillingness to review existing experience or of an odd
variation of the "not invented here" attitude for which we so often
criticize ISO and CCITT.
Yes, there are other alternatives within the ISO arena, some of
which have never been mentioned in these discussions. And I've been one
of the strongest advocates of avoiding tying ourselves to untried and
incomplete proposals, and will continue to be. But, unless people are
willing to take the position that we don't need non-ASCII character sets
until 10646 is finally approved, it seems to me to be totally
inappropriate to ignore the 8859 experience and usage.
And I think similar arguments could be made for what we have come to
refer to as 2022-jp. Nothing experimental about that either. Our
problems with it are due only to the fact that, if there are official
definitions, they are probably written in Japanese and in Kanji. We
have not figured out how to publish RFCs in that language and character
set, or how to effectively reference documents that the majority of IETF
participants cannot read. That is a problem, but it doesn't seem to me
to be an RFC-XXXX problem, or a problem of a set of character set
conventions that are experimental or not well-defined.
Appendix F, in that regard, is an attempt at an interpretative
translation of the real definition. Let me suggest a different way to
handle it, not as a serious proposal at the moment but as something
people should think about as a means of clarifying the issue here
(especially in the context of beliefs and desires about a truly
international Internet Society and IETF): Assuming that copyright
regulations, etc., permit, an informational document should be submitted
immediately to the RFC editor for publication that contains the "real"
specification of what we describe as 2022-jp but which, if I understand
things correctly, is actually a Japanese (JIS) National Standard.
Presumably that document is in Kanji and presumably the RFC editor can
figure out some way to cope with that. As part of the coping process,
Appendix F should be removed from RFC-XXXX and attached to the proposed
informational RFC as an informal guide to the specification for those
who find reading technical Japanese excessively challenging.
Now I suggest that this type of approach would make things procedurally
cleaner (quite apart from causing Jon a lot of aggravation), but that it
really doesn't change anything: what we call 2022-jp is in use, has been
tested for interoperability on a variety of platforms, etc.
We really can separate the well-established and proven from the
speculations, and we can do so without much trouble. Saying "character
sets are a mess, let's stick to ASCII" is unnecessary and really not
much better than saying "communication is much easier if everyone uses
English".
john