Re: Content types and encodings

1991-08-28 16:25:36
John and Neil -

     I think you're both right.

     Here is how I see the vector of encodings in the various worlds:

Encoding        7-bit World     8-bit World     8->7 translation
 8BIT           impossible      OK              QUOTED-PRINTABLE or BASE64
 7BIT           OK              OK              none needed
 BINARY         impossible      OK              BASE64
 BASE64         OK              OK              none needed
 QUOTED-P...    OK              OK              none needed

     For the purposes of this discussion, I will postulate that encodings on
types MESSAGE and MULTIPART are banned, hence there is no nested encoding.
Any gateway which presumes to be able to pass an 8-bit message on to the 7-bit
world without destroying content has to have at least some awareness of what
to do with it.

     In three of the five cases no work is needed.  In a fourth case the
translation is obvious and unambiguous.  The question is what to do in the
last case, the case of 8-bit text.
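
     As a rough illustration of the table above, a gateway's choices could be
written down as a lookup like the one below.  This is only a sketch: the names
TRANSLATION_8_TO_7 and gateway_choices are hypothetical and do not come from
any standard or library.

```python
# Hypothetical lookup mirroring the table above. An empty tuple means
# "none needed"; more than one entry means the gateway has a real choice.
TRANSLATION_8_TO_7 = {
    "8BIT": ("QUOTED-PRINTABLE", "BASE64"),  # the open question
    "7BIT": (),
    "BINARY": ("BASE64",),                   # obvious and unambiguous
    "BASE64": (),
    "QUOTED-PRINTABLE": (),
}


def gateway_choices(encoding):
    """Return the candidate re-encodings for entering the 7-bit world."""
    return TRANSLATION_8_TO_7[encoding.upper()]


# Only 8BIT leaves the gateway with a decision to make.
ambiguous = [name for name, options in TRANSLATION_8_TO_7.items()
             if len(options) > 1]
```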

     It is there that the type of the data, and especially its character set,
becomes very important.  QUOTED-PRINTABLE is very US and Euro-centric in its
thinking.  QUOTED-PRINTABLE is not only useless, it is worse than useless for
Shift-JIS (Japan), Shift-GB (Red China), and BIG5 (Taiwan).  JIS and GB are
14-bit character sets; BIG5 is 15 bits.

     I think that our friends in Japan would prefer that an 8/7 translation
involve translating Shift-JIS into JIS/ISO-2022 (or whatever the ESC shift
code standard is called).  I believe that the folks in Peking would have
similar preferences, although they really aren't much of a concern to us right
now.
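
     To make the Shift-JIS case concrete, here is a minimal sketch of such an
8/7 translation using Python's built-in "shift_jis" and "iso2022_jp" codecs.
The function name shift_jis_to_7bit is hypothetical, and the sketch assumes
the gateway already knows the body is Shift-JIS text.

```python
def shift_jis_to_7bit(raw):
    """Re-encode Shift-JIS bytes as ISO-2022-JP, which uses ESC shift
    sequences and only 7-bit byte values."""
    text = raw.decode("shift_jis")
    return text.encode("iso2022_jp")


# Three kanji, written with escapes so this source stays 7-bit clean.
original = "\u65e5\u672c\u8a9e"
sample = original.encode("shift_jis")    # contains bytes >= 0x80
converted = shift_jis_to_7bit(sample)    # every byte is < 0x80
```

     Unlike QUOTED-PRINTABLE or BASE64, the result is still directly readable
by software that understands the ESC shift codes.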

     However, just about the best thing you can do for the people in Taiwan is
to convert to BASE64.  If you don't need ASCII capability, you can use BASE64
to encode two 15-bit characters in five 7-bit characters, instead of the normal
three 8-bit characters in four 7-bit characters, for a slight bandwidth
improvement.  I think the people in Taiwan ought to be consulted on this
question.
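
     The arithmetic behind this: two 15-bit characters are 30 bits, which is
exactly five 6-bit BASE64 digits.  A minimal sketch follows; pack_pair and
unpack_pair are hypothetical names, and this packing is an illustration of the
idea, not something an ordinary BASE64 decoder would accept.

```python
import string

# The usual 64-character BASE64 alphabet.
B64_ALPHABET = (string.ascii_uppercase + string.ascii_lowercase
                + string.digits + "+/")


def pack_pair(c1, c2):
    """Pack two 15-bit code values into five BASE64 alphabet characters."""
    if not (0 <= c1 < 1 << 15 and 0 <= c2 < 1 << 15):
        raise ValueError("both values must fit in 15 bits")
    bits = (c1 << 15) | c2  # 30 bits in total
    return "".join(B64_ALPHABET[(bits >> shift) & 0x3F]
                   for shift in (24, 18, 12, 6, 0))


def unpack_pair(text):
    """Inverse of pack_pair: five characters back to two 15-bit values."""
    if len(text) != 5:
        raise ValueError("expected exactly five characters")
    bits = 0
    for ch in text:
        bits = (bits << 6) | B64_ALPHABET.index(ch)
    return bits >> 15, bits & 0x7FFF
```

     Stored as two bytes each, the same pair is 32 bits, which ordinary BASE64
carries in five and a third characters on average, so the saving is real but
small.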

     The bottom line is that a dumb conversion to BASE64 for BINARY and 8BIT
is possible and safe, but that you need more awareness of what is there (which
implies type knowledge) to decide if you can do something smarter with
QUOTED-PRINTABLE or some other encoding.

-- Mark --