Re: MIME for VM/CMS

To:  moore(_at_)cs(_dot_)utk(_dot_)edu, 
ietf-822(_at_)dimacs(_dot_)rutgers(_dot_)edu
Subject:       Re: MIME for VM/CMS
Date:          Wed, 05 May 93 18:17:02 CDT

Believe it or not, I think charset=us-ascii is the right thing to do,
even if it seems wrong.  This is no different in principle than what
UNIX systems do...the UA hands the MTA a file in "local" format
(end-of-line = LF) and it gets translated into SMTP format (end-of-line
= CRLF) before it goes on the wire.
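
As a minimal sketch of that local-to-wire translation (the function name is made up for illustration, and real MTAs do this as part of a larger submission step), the end-of-line conversion might look like:

```python
def to_smtp_line_endings(local_text: bytes) -> bytes:
    """Convert local LF line endings to the CRLF form used on the wire.

    Hypothetical helper for illustration only. Existing CRLF pairs are
    collapsed to LF first so they are not turned into CR CR LF.
    """
    normalized = local_text.replace(b"\r\n", b"\n")
    return normalized.replace(b"\n", b"\r\n")
```

The analogy to the EBCDIC case is that the file on disk stays in local form and only the copy handed to the network is translated.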


        In this sense,  I completely agree with you.   But it requires
that at least one MTA get involved.   If it's the first  sendmail,
fine,  that's local to the sender anyway.

        My concern is that intermediate MTAs (gateways) could get into
the game.   That I DON'T want.   They'll break.   They'll require more
maintenance.   They'll eat more (perhaps not much, but some) CPU time.
They'll likely whine more often where they don't now.


We agree completely on this.  Especially that the EBCDIC<->ASCII gateways
will break things if they try to interpret MIME.  My input into MIME was based
on the assumption that the EBCDIC<->ASCII gateways wouldn't change....since
there are too many of them to update them all at once to "do the right thing".

A side note:  the quoted-printable encoding (along with base64) is very
carefully defined in terms of characters, not octet values, so that it can
work with EBCDIC as well as ASCII.  For instance, 'A' *always* means 41 
hex, even if the local charset is EBCDIC.


        But what if 0x41 isn't represented as  =41,  but as  A?
Then it's not 0x41 on my system,  it's 0xC1.   :-(


That's okay, you have to translate it to 0x41 (in the general case).  It's no
worse than base64, where 'A' means 0x00.
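
To make the "characters, not octets" point concrete, here is a small sketch.  It assumes Python's cp037 codec as a stand-in for the local EBCDIC code page (the actual code page on a given VM/CMS system may differ), and the helper names are hypothetical.

```python
import string

# The base64 alphabet in index order: A-Z, a-z, 0-9, '+', '/'.
B64_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def qp_literal_value(ch: str) -> int:
    """Octet that a literal (unquoted) quoted-printable character stands for.

    It is the character's US-ASCII code, whatever the local storage is.
    """
    return ch.encode("ascii")[0]

def base64_digit_value(ch: str) -> int:
    """6-bit value that a base64 character stands for."""
    return B64_ALPHABET.index(ch)

# How the same character is stored locally on an EBCDIC host
# (cp037 is an assumption, used here only as an example code page).
local_octet = "A".encode("cp037")[0]

print(hex(local_octet))              # local storage of 'A' under cp037
print(hex(qp_literal_value("A")))    # what 'A' means in quoted-printable
print(base64_digit_value("A"))       # what 'A' means in base64
```

In both encodings the meaning is attached to the character 'A', not to whatever octet the local system uses to store it.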

One result of this is that if you correctly apply the quoted-printable
content-transfer-encoding to a text object labelled charset=EBCDIC, the
resulting file will be full of gibberish...since nearly every value will 
have to be hex-encoded.

If, however, it's labelled charset=US-ASCII, it will look right if you 
just display the file on your screen (even on an EBCDIC system) and a MIME
mail reader will also do the right thing.  In the latter case it will
translate 'A' into 41 hex, which will then be translated from ASCII to
EBCDIC before displaying...an 'A'.
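
A rough illustration of the difference, using Python's standard quopri module and assuming cp037 as the EBCDIC code page (an assumption made for this example only).  The octets passed to the encoder are the octets of the content in its labelled charset:

```python
import quopri

text = "Hello"

# Labelled charset=EBCDIC: the content octets are EBCDIC values, which are
# mostly outside the printable ASCII range and so get hex-encoded.
as_ebcdic = quopri.encodestring(text.encode("cp037"))

# Labelled charset=US-ASCII: the content octets are printable ASCII values,
# so quoted-printable leaves them as literal characters.
as_ascii = quopri.encodestring(text.encode("ascii"))

print(as_ebcdic)
print(as_ascii)
```

The first result is a run of =XX sequences, and the second is the readable word itself.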


        Keith,  you're still not thinking about  "plain text"  on the
EBCDIC host.   I can handle QP now,  but I have to translate 0xC1 to
0x41  IF IT WASN'T QUOTED,  =41.   But if it was quoted,  I let it be.
The result is that I go through more translations than I should.



For text/plain; charset=us-ascii, you can do the following:

+ for 'bare' characters, just display them.
+ for quoted (=XX) characters, decode the hex value and look up the
  result in an ASCII-to-EBCDIC translate table, and display that.
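
The two rules above can be sketched roughly as follows.  This is only an illustration under several assumptions: the body has already been translated to EBCDIC by a gateway, cp037 stands in for the local code page, lines are separated by the cp037 encoding of "\n" (a real CMS file would be record-oriented), and the function name is made up.

```python
def qp_to_display_ebcdic(body: bytes) -> bytes:
    """Decode a quoted-printable, charset=us-ascii body held in EBCDIC.

    Bare characters are already in local form and are copied unchanged.
    Only =XX sequences go through the ASCII-to-EBCDIC lookup.
    Malformed =XX sequences raise ValueError in this sketch.
    """
    eq = "=".encode("cp037")[0]
    nl = "\n".encode("cp037")[0]
    out = bytearray()
    i = 0
    while i < len(body):
        octet = body[i]
        if octet != eq:
            # Bare character: display as-is, no translation needed.
            out.append(octet)
            i += 1
        elif i + 1 < len(body) and body[i + 1] == nl:
            # Soft line break ("=" at end of line): drop both octets.
            i += 2
        else:
            # The hex digits are themselves stored as EBCDIC characters.
            hex_digits = body[i + 1:i + 3].decode("cp037")
            ascii_octet = int(hex_digits, 16)
            # ASCII-to-EBCDIC lookup for the decoded value.
            out += bytes([ascii_octet]).decode("ascii").encode("cp037")
            i += 3
    return bytes(out)
```

For example, a stored body of "caf=65 A=41" would display as "cafe AA", with only the two quoted characters passing through the translate table.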

For most other kinds of body parts you have to do the translation for every
displayed character.

For base64-encoded, text/plain; charset=us-ascii, body parts (which you must
support to be MIME compliant) you also have to do the translation for every
displayed character.
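
For comparison, a sketch of the base64 case under the same assumptions (body stored locally in EBCDIC, cp037 as the stand-in code page, hypothetical function name).  Here there is no shortcut, since every decoded octet is an ASCII value:

```python
import base64

def base64_ascii_to_display_ebcdic(body: bytes) -> bytes:
    """Decode a base64, charset=us-ascii body held in EBCDIC for display."""
    # Recover the base64 characters from their local EBCDIC storage.
    b64_chars = body.decode("cp037")
    # Decode to the original content octets, which are ASCII values.
    ascii_octets = base64.b64decode(b64_chars)
    # Every decoded octet must be translated from ASCII to EBCDIC.
    return ascii_octets.decode("ascii").encode("cp037")
```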

But there's nothing wrong with optimizing the text/plain, quoted-printable
case.

But this is a mainframe,  right?   So it's got the horsepower to
spare. (NOT!)


Well, MIME is designed for ASCII, so there is naturally some penalty on an
EBCDIC machine.  And I understand that big mainframes want to conserve CPU
cycles.  But I don't think the cost is prohibitive, especially if you
optimize the common cases.

Keith