1993-05-05 16:33:08
Believe it or not, I think charset=us-ascii is the right thing to do,
even if it seems wrong.  This is no different in principle than what
UNIX systems do...the UA hands the MTA a file in "local" format
(end-of-line = LF) and it gets translated into SMTP format (end-of-line
= CRLF) before it goes on the wire.
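
The end-of-line translation described above can be sketched roughly as follows. This is only an illustration, not what any particular MTA does; the function name to_smtp_eol is made up for the example, and it assumes the local format uses a bare LF.

```python
def to_smtp_eol(local_text: bytes) -> bytes:
    """Convert local (LF) line endings to SMTP wire format (CRLF).

    Hypothetical helper for illustration only.
    """
    # Normalize any CRLF already present so it is not doubled to CR CR LF.
    normalized = local_text.replace(b"\r\n", b"\n")
    # Every remaining LF becomes CRLF for the wire.
    return normalized.replace(b"\n", b"\r\n")


wire = to_smtp_eol(b"line one\nline two\n")
```

The point of the analogy is that the stored file and the wire form differ, and the translation happens once, at the boundary.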

        In this sense,  I completely agree with you.   But it requires
that at least one MTA get involved.   If it's the first  sendmail,
fine,  that's local to the sender anyway.

        My concern is that intermediate MTAs (gateways) could get into
the game.   That I DON'T want.   They'll break.   They'll require more
maintenance.   They'll eat more (perhaps not much, but some) CPU time.
They'll likely whine more often where they don't now.

Leaving charset off entirely is an interesting compromise, though...
you aren't mislabeling the information, but a MIME mail reader should
interpret a missing charset parameter as US-ASCII which is the right
thing to do.

        Yes,  I think so.   I've CC'ed the ietf-822 list on this.
Dunno yet if this thread has been woven into the fabric.   (I'm going
over the archives and there's a LOT of work been done;  lotso messages)

        The idea is to leave  "plain text"  as  even plainer than ASCII
so that non-ASCII hosted MUAs won't have to "lie" about CHARSET=US-ASCII.

A side note:  the quoted-printable encoding (along with base64) is very
carefully defined in terms of characters, not octet values, so that it can
work with EBCDIC as well as ASCII.  For instance, 'A' *always* means 41 hex,
even if the local charset is EBCDIC.

        But what if 0x41 isn't represented as  =41,  but as  A?
Then it's not 0x41 on my system,  it's 0xC1.   :-(    Perhaps there's
more in RFC1341 about this.   I'll look.

One result of this is that if you correctly apply the quoted-printable
content-transfer-encoding to a text object labelled charset=EBCDIC, the
resulting file will be full of gibberish...since nearly every value will have
to be hex-encoded.
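
A small sketch of why that happens, using Python's standard quopri module. The choice of cp037 as "the" EBCDIC code page is an assumption made for the example; real hosts use several EBCDIC variants.

```python
import quopri

text = "Hello"

# The same characters as octets in each charset.
ascii_octets = text.encode("ascii")    # 48 65 6C 6C 6F
ebcdic_octets = text.encode("cp037")   # C8 85 93 93 96 (cp037 is an assumed EBCDIC variant)

# Quoted-printable passes printable ASCII octets through literally
# and hex-escapes everything else.
qp_ascii = quopri.encodestring(ascii_octets)
qp_ebcdic = quopri.encodestring(ebcdic_octets)

print(qp_ascii)   # b'Hello'
print(qp_ebcdic)  # b'=C8=85=93=93=96'
```

Every EBCDIC letter falls outside the printable ASCII range, so each one is escaped.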

If, however, it's labelled charset=US-ASCII, it will look right if you just
display the file on your screen (even on an EBCDIC system) and a MIME mail
reader will also do the right thing.  In the latter case it will translate
'A' into 41 hex, which will then be translated from ASCII to EBCDIC before 'A' is displayed.

        Keith,  you're still not thinking about  "plain text"  on the
EBCDIC host.   I can handle QP now,  but I have to translate 0xC1 to
0x41  IF IT WASN'T QUOTED,  =41.   But if it was quoted,  I let it be.
The result is that I go through more translations than I should.
But this is a mainframe,  right?   So it's got the horsepower to
spare.   (NOT!)     ;-)
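
A rough sketch of the two paths described above, as seen from an EBCDIC host. The names here are hypothetical, cp037 is assumed as the local EBCDIC code page, and the body is assumed to have already been translated to EBCDIC on its way in.

```python
EBCDIC_CODEC = "cp037"  # assumption: the host's EBCDIC variant


def decode_qp_on_ebcdic_host(local_body: bytes) -> bytes:
    """Decode a quoted-printable body stored as EBCDIC into ASCII octets.

    Hypothetical helper: literal characters are translated (0xC1 -> 0x41),
    while =XX escapes already name the octet and are taken as-is.
    """
    chars = local_body.decode(EBCDIC_CODEC)  # characters as stored locally
    out = bytearray()
    i = 0
    while i < len(chars):
        c = chars[i]
        if c == "=" and chars[i + 1:i + 2] == "\n":
            i += 2  # soft line break: drop it
        elif c == "=":
            hexpair = chars[i + 1:i + 3]
            if len(hexpair) != 2:
                raise ValueError("truncated =XX escape")
            out.append(int(hexpair, 16))  # quoted: the octet value is given
            i += 3
        else:
            out += c.encode("ascii")  # unquoted: translate the character
            i += 1
    return bytes(out)


local = "A=41".encode(EBCDIC_CODEC)              # C1 7E F4 F1 on the host
decoded = decode_qp_on_ebcdic_host(local)        # b'AA' as ASCII octets
shown = decoded.decode("ascii").encode(EBCDIC_CODEC)  # back to EBCDIC for display
```

The unquoted 'A' is translated once to reach ASCII and again for display, which is the extra work being complained about.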


        Otherwise,  everything's fine.   The spec is strong.
It's very processable.   And GIFs are lotso fun.

Rick Troth <troth(_at_)rice(_dot_)edu>,  Rice University,  Information Systems
