John and Neil -
I think you're both right.
Here is how I see the vector of encodings in the various worlds:
Encoding      7-bit World   8-bit World   8->7 translation
8BIT          impossible    OK            QUOTED-PRINTABLE or BASE64
7BIT          OK            OK            none needed
BINARY        impossible    OK            BASE64
BASE64        OK            OK            none needed
QUOTED-P...   OK            OK            none needed
For the purposes of this discussion, I will postulate that encodings on
types MESSAGE and MULTIPART are banned, hence there is no nested encoding.
Any gateway which presumes to be able to pass on an 8-bit message to the 7-bit
world without destroying content has to have some awareness of what to do.
In three of the five cases no work is needed. In a fourth case the
translation is obvious and unambiguous. The question is what to do in the
last case, the case of 8-bit text.
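
To make the five cases concrete, here is a minimal sketch of the chart as a
lookup table. The names TRANSLATION_8_TO_7 and choices_for_7bit_world are
hypothetical, invented for illustration; no existing gateway is implied.

```python
# Hypothetical table mirroring the chart: for each encoding, the
# re-encodings a gateway may pick when passing a message to the 7-bit world.
# None means the body can be passed through untouched.
TRANSLATION_8_TO_7 = {
    "7BIT": None,
    "BASE64": None,
    "QUOTED-PRINTABLE": None,
    "BINARY": ("BASE64",),
    "8BIT": ("QUOTED-PRINTABLE", "BASE64"),
}


def choices_for_7bit_world(encoding):
    """Return the candidate re-encodings, or () if no work is needed."""
    return TRANSLATION_8_TO_7[encoding.upper()] or ()
```

Only the 8BIT entry offers more than one candidate, which is the case that
needs a decision.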
It is there that the type of the data, and especially its character set,
becomes very important. QUOTED-PRINTABLE is very US and Euro-centric in its
thinking. QUOTED-PRINTABLE is not only useless, it is worse than useless for
Shift-JIS (Japan), Shift-GB (Red China), and BIG5 (Taiwan). JIS and GB are
14-bit character sets; BIG5 is 15 bits.
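
One part of the problem can be measured directly. The sketch below compares
the size of the same Shift-JIS text under both encodings, using Python's
standard quopri and base64 modules. The function name encoded_sizes and the
sample string are my own choices for illustration.

```python
import base64
import quopri


def encoded_sizes(text, charset):
    """Return the byte counts of text in charset: raw, QP, and BASE64."""
    raw = text.encode(charset)
    return {
        "raw": len(raw),
        "quoted-printable": len(quopri.encodestring(raw)),
        "base64": len(base64.b64encode(raw)),
    }


# Sample text chosen for illustration: every character takes two bytes
# in Shift-JIS, and every lead byte has the high bit set.
sizes = encoded_sizes("日本語のテキスト", "shift_jis")
```

Every lead byte becomes a three-character =XX escape, so the QUOTED-PRINTABLE
form is both larger than the BASE64 form and no more readable.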
I think that our friends in Japan would prefer that an 8/7 translation
involve translating Shift-JIS into JIS/ISO-2022 (or whatever the ESC shift
code standard is called). I believe that the folks in Peking would have
similar preferences, although they really aren't much of a concern to us right
now.
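
As a rough sketch of what such an 8/7 translation might look like, the
following uses Python's built-in shift_jis and iso2022_jp codecs. The function
name is hypothetical, and I am assuming ISO-2022-JP is the escape-sequence
form meant here.

```python
def shift_jis_to_iso2022jp(data):
    """Re-encode Shift-JIS bytes as 7-bit ISO-2022-JP bytes.

    Raises UnicodeError if the input is not valid Shift-JIS or contains
    a character the target encoding cannot represent.
    """
    return data.decode("shift_jis").encode("iso2022_jp")


eight_bit = "日本語".encode("shift_jis")   # high-bit bytes present
seven_bit = shift_jis_to_iso2022jp(eight_bit)
```

The result uses only 7-bit octets, with ESC sequences marking the shifts, so
it needs no further transfer encoding to cross into the 7-bit world.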
However, just about the best thing you can do for the people in Taiwan is
to convert to BASE64. If you don't need ASCII capability, you can use BASE64
to encode two 15-bit characters in five 7-bit characters instead of the normal
three 8-bit characters in four 7-bit characters, for a slight bandwidth
improvement. I think the people in Taiwan ought to be consulted on this
question.
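
The arithmetic is that each BASE64 output character carries 6 bits, so two
15-bit values (30 bits) fit exactly in five characters, whereas the same two
characters sent as four 8-bit octets (32 bits) need more than five. A minimal
sketch of that packing follows. The functions pack_pair and unpack_pair are
hypothetical, and this variant is not standard BASE64; a receiver would have
to know it was in use.

```python
B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def pack_pair(c1, c2):
    """Pack two 15-bit code values into five BASE64-alphabet characters."""
    if not (0 <= c1 < 1 << 15 and 0 <= c2 < 1 << 15):
        raise ValueError("code values must fit in 15 bits")
    bits = (c1 << 15) | c2  # 30 bits in total
    # Emit five 6-bit groups, most significant first.
    return "".join(
        B64_ALPHABET[(bits >> shift) & 0x3F] for shift in (24, 18, 12, 6, 0)
    )


def unpack_pair(chars):
    """Inverse of pack_pair: recover the two 15-bit code values."""
    if len(chars) != 5:
        raise ValueError("expected exactly five characters")
    bits = 0
    for ch in chars:
        bits = (bits << 6) | B64_ALPHABET.index(ch)
    return bits >> 15, bits & 0x7FFF
```

For example, unpack_pair(pack_pair(0x1234, 0x7FFF)) gives back the original
pair of values.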
The bottom line is that a dumb conversion to BASE64 for BINARY and 8BIT is
possible and safe, but that you need more awareness of what is there (which
implies type knowledge) to decide if you can do something smarter with
QUOTED-PRINTABLE or some other encoding.
-- Mark --