Re: PLEASE READ -- Open issues list for RFC-XXXX

On Sun, 10 Nov 91 16:26:15 PST, Neil Katin wrote:

I really have to strongly disagree with Greg's position.  2022, 10646, etc.
really *have* to be treated as character sets and not content types.  If
they are content types, then you cannot have any "structured documents"
(or any other content type) that uses 2022 the way ASCII is used now.


Structured document types that use Japanese 2022 (as opposed to general 2022)
will, of necessity, have to specify their own mechanisms for Japanese 2022 or
be a separate type.

For example, JTeX is *not* to be confused with TeX.

Furthermore, there's a much bigger can of worms that can be opened if you
start getting into Japanese characters using Shift-JIS (an 8-bit code) which
is often the preferred form for files on Unix.  Tools such as nemacs
dynamically switch between JIS using escape codes and Shift-JIS.
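To make the difference between the two forms concrete, here is a small Python
sketch.  It is only an illustration and has nothing to do with how nemacs
works internally: the codec names `iso2022_jp` and `shift_jis` are the ones
in Python's standard library, and the sample string and helper names are my
own assumptions.

```python
# Illustrative sample: "abc " followed by three kanji (written as escapes
# so this source file itself stays pure ASCII).
text = "abc \u65e5\u672c\u8a9e"

# JIS form: 7-bit bytes, with escape codes to shift in and out of kanji.
jis = text.encode("iso2022_jp")

# Shift-JIS form: 8-bit bytes, with no escape codes at all.
sjis = text.encode("shift_jis")


def is_seven_bit(data: bytes) -> bool:
    """True if every byte has the high bit clear."""
    return all(b < 0x80 for b in data)


def has_escape(data: bytes) -> bool:
    """True if the data contains an ESC (0x1B) byte."""
    return b"\x1b" in data


if __name__ == "__main__":
    print("JIS:      ", jis, is_seven_bit(jis), has_escape(jis))
    print("Shift-JIS:", sjis, is_seven_bit(sjis), has_escape(sjis))
```

The same characters come out as two quite different byte streams, which is
why a tool has to know (or guess) which of the two it is looking at.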

The best thing for this group to do at this stage is to punt; other than
creating a content type for `plain text which may have Japanese ISO-2022', any
more specific definition should be left for people in Japan to define as
needed.  Should they decide that TEXT/ISO-2022-JP is not suitable for their
long-term needs -- and undoubtedly it will be -- they can change it.

However, it is an acceptable interim measure (at least those Japanese I
questioned seem to think so) to be able to say: "All you have to do is insert
        Content-Type: TEXT/ISO-2022-JP
in your message headers and you'll be conforming with the new Internet
standard."  I strongly believe that we should do no more than this.
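As a rough sketch of how little work that is for a sending program, something
like the following Python would do.  The function name and the sample message
are hypothetical; the only thing taken from the proposal is the header string
itself, and the sketch assumes the usual layout of headers, a blank line, then
the body.

```python
# The constant header line proposed above.
HEADER = "Content-Type: TEXT/ISO-2022-JP"


def add_content_type(message: str) -> str:
    """Append the constant header to the end of the header block.

    Assumes headers and body are separated by the first blank line.
    If there is no blank line, the whole message is treated as headers.
    """
    head, sep, body = message.partition("\n\n")
    return head + "\n" + HEADER + sep + body


if __name__ == "__main__":
    sample = "From: a@example.org\nSubject: test\n\nhello\n"
    print(add_content_type(sample))
```

Note that nothing here inspects the body at all; the header is the same
string on every message.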

How, for example, would you send something of type "Makefile" that uses
Japanese characters encoded using 2022?  The character set(s) and encodings
need to be treated as an orthogonal issue to content "type" if you want
the system to be useful in non-US countries.


A Makefile in Japan either does not have Japanese characters, or is encoded
using Shift-JIS.  This is 1000% (to quote George McGovern) a non-issue at this
stage in RFC-XXXX.

On the same theme, why treat well specified 2022 character sets any
differently than iso-8859-1?


Once again.  ISO-2022 as used in Japan is *not* a character set.  In one of
the mail programs I use, *ALL* of the messages which come from it are Japanese
ISO-2022, ==> EVEN IF THERE IS NOT A SINGLE BIT OF JAPANESE IN IT <==.  When
this mail program on the NeXT is converted to support Japanese it too will
send out all its messages as ISO-2022-JP.

An ISO-2022-JP indication does not mean it is in a character set that is
unreadable on your terminal.  It means that it *MIGHT* have shifts into a
character set which may be unreadable on your terminal.  Or, it may be 100%
ASCII.  It only means that the sender uses a mailer that is Japanese-capable.
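A receiving program that cares can find out for itself which case it has.  The
Python sketch below is one way to do it, under my own assumptions: it looks
only for the two escape sequences commonly used to shift into two-byte kanji
(ESC $ @ and ESC $ B), the function name is hypothetical, and Python's
`iso2022_jp` codec stands in for a Japanese-capable mailer.

```python
# Escape sequences that designate a two-byte kanji set.
KANJI_SHIFTS = (b"\x1b$@", b"\x1b$B")


def contains_kanji_shift(body: bytes) -> bool:
    """True if the body ever shifts into a two-byte kanji set."""
    return any(seq in body for seq in KANJI_SHIFTS)


# A message with no Japanese in it, sent by a Japanese-capable mailer:
# the encoded bytes are exactly the ASCII bytes.
plain = "Hello, world.".encode("iso2022_jp")

# A message that really does contain kanji (given as escapes here).
mixed = "Hello, \u65e5\u672c.".encode("iso2022_jp")

if __name__ == "__main__":
    print(plain, contains_kanji_shift(plain))
    print(mixed, contains_kanji_shift(mixed))
```

The all-ASCII message is byte-for-byte what an ASCII-only mailer would have
sent, so a terminal that cannot display kanji loses nothing by showing it.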

No, I do not want to check every bit of plain text I send to see whether there
is any Japanese embedded in it and, if there is none, change the content type
to TEXT/PLAIN.  For my code, TEXT/ISO-2022-JP (or whatever is adopted) *is*
TEXT/PLAIN.  I won't send Japanese text to people who can't read it, but I am
not going to bend over backwards to coddle xenophobic mail software either.

Remember, it is one thing to say `add a header line as a constant string'.  It
is quite another to require a full implementation of RFC-XXXX in every program
just to send a simple piece of textual mail.