Re: We are using ISO-2022-JP *NOW*!

On Thu, 1 Dec 1994, OKADA Takashi wrote:

    For many years, we have been using localized (Japanese) versions of
MUAs, and they assume plain text messages are in the ISO-2022-JP encoding.
If we assume plain text messages are in US-ASCII, we lose interoperability
with many existing MUAs.  We have many colleagues that know
NOTHING about the MIME standard, and we can't force all of them to
give up their MUAs.  They are posting many messages for us WITHOUT
Content-Type: or even WITHOUT MIME-Version:, NOW!


*sigh*  If a message does not have a MIME-Version line, it explicitly
does not conform to MIME and you can assume anything you want about it.
You can assume it is in iso-2022-jp, in iso-8859-1, in us-ascii, or anything
at all really.  It is a local matter.  But don't expect your messages to
"look right" when they go outside of your local area if they haven't been
labelled correctly.

If you do wish to use MIME conventions, then the correct way to label
iso-2022-jp messages is as:

        MIME-Version: 1.0
        Content-Type: text/plain; charset=iso-2022-jp

Even if you hate MIME with a passion and have no intention of writing a full
MIME MUA, it is still a good idea to add those two lines to outgoing messages.
Such messages will work fine with your old non-MIME software, and will
also work correctly with new MIME software.  If however that "charset"
label is left off, the MIME software MUST assume that the default is
US-ASCII.  The simple solution to this "problem" is: don't leave it off!!
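
To show how little work this is, here is a rough sketch in Python of a
filter that a submission agent could run over outgoing mail.  The function
name label_outgoing and the iso-2022-jp default are my own assumptions for
illustration; it only adds the two header lines when they are missing and
leaves the body alone.

```python
import email


def label_outgoing(raw: bytes, charset: str = "iso-2022-jp") -> bytes:
    """Add MIME-Version and Content-Type to a message that lacks them.

    Hypothetical helper: the name and the default charset are assumptions,
    not part of any existing MUA or of the MIME specification.
    """
    msg = email.message_from_bytes(raw)
    # Header lookups on Message objects are case-insensitive.
    if "MIME-Version" not in msg:
        msg["MIME-Version"] = "1.0"
    if "Content-Type" not in msg:
        msg["Content-Type"] = f"text/plain; charset={charset}"
    return msg.as_bytes()
```

A message that already carries both headers passes through without
gaining duplicates.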

In relation to charset, MIME is about labelling things so that when they
go outside of a local area, the software outside has some indication about
how to display it right.  The software may not have the right fonts, tools,
etc., but at least it knows what the text is and can inform the user.

    We found nothing useful with charset.  For example, we often
handle plain text messages without the help of an MUA.  Of course our
tools like less, grep, or Mule know nothing about the MIME specification.
How do you tell them about the MIME standard?


Such tools don't really care about the MIME standard because they aren't
MIME tools.  However, once your message gets injected into the network,
it would be nice if it was labelled correctly according to MIME.  No need
to change less, grep, Mule, etc.  Just change the message submission agent.
You will also need to change the MUA on the receiving end to convert back
into whatever less, grep, Mule, etc. want, but that's not too difficult, is it?
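
The receiving side might look something like the following Python sketch.
It is only an illustration under my own assumptions: decode_body and
local_default are made-up names, it handles single-part text messages only,
and it does not cope with charset names the local system doesn't know.
A message with no MIME-Version line is treated as a local matter and read
in the local default; a MIME message with no charset label is read as
US-ASCII.

```python
import email


def decode_body(raw: bytes, local_default: str = "iso-2022-jp") -> str:
    """Return the body text of a single-part message as a Python string.

    Hypothetical helper; the names and the local default are assumptions.
    """
    msg = email.message_from_bytes(raw)
    if msg["MIME-Version"] is None:
        # Not a MIME message: assume whatever the local convention is.
        charset = local_default
    else:
        # MIME message: an absent charset parameter means US-ASCII.
        charset = msg.get_content_charset("us-ascii")
    payload = msg.get_payload(decode=True)  # bytes for a single-part message
    return payload.decode(charset, errors="replace")
```

The decoded text can then be written out in whatever encoding the local
tools expect.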

    Or, if the charset is required, how do we choose a value for
charset?  If we find US-ASCII characters in the message, we will also
find ISO-8859-1, JIS X0208, or many other characters from many other
standard character sets.


You can choose a charset value in two ways: the dumb way or the smart way.
The dumb way is to label everything as ISO-2022-JP.  The smart way is to
analyse the text and if it contains only US-ASCII, you label it as
US-ASCII.  If it contains only ISO-8859-1, you label it as ISO-8859-1,
and if it contains JIS encodings, you label it as ISO-2022-JP.  Given
the speed of today's computers and the relative smallness of most e-mail
and news messages, this analysis phase is not a heavy burden.
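
As a rough sketch of the "smart way" in Python (choose_charset is a name I
made up, and this is one possible heuristic rather than a complete
detector): look for the ISO-2022-JP escape sequences first, then for any
8-bit bytes, and otherwise call it US-ASCII.  Treating every 8-bit message
as ISO-8859-1 is an assumption that only holds if your site never sends
other 8-bit encodings.

```python
# Escape sequences that switch ISO-2022-JP text out of plain ASCII:
# ESC $ @ (JIS X 0208-1978), ESC $ B (JIS X 0208-1983),
# ESC ( J (JIS X 0201 Roman).
JIS_ESCAPES = (b"\x1b$@", b"\x1b$B", b"\x1b(J")


def choose_charset(body: bytes) -> str:
    """Pick a charset label for an outgoing message body.

    Hypothetical helper; the 8-bit case is assumed to be ISO-8859-1.
    """
    if any(esc in body for esc in JIS_ESCAPES):
        return "iso-2022-jp"
    if any(byte >= 0x80 for byte in body):
        return "iso-8859-1"
    return "us-ascii"
```

This is a single pass over the message, which is why the analysis costs
next to nothing.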

Why is this so hard to understand?

Cheers,

Rhys.