Re: On encodings : random thoughts....

Keld J|rn Simonsen <keld(_at_)dkuug(_dot_)dk> writes


I will suggest (persistently :-) that the quoted-readable
notation is more suitable to this job. The requirements on an
intelligent UA are not as high as for quoted-printable,
where you need to know the character set of the sender and how to
convert to the reader's character set. This potentially means that
you need to be able to convert from a myriad of character sets.
With quoted-readable you only need to know how to transform from
a well-defined notation (specified in the RFC) into the reader's
character set.


Keld's proposal is for an "encoding" between mailascii and a collection
of glyphs. The same glyphs might be represented by different octets or 
sequences of octets in different character set Content-types. So his 
quoted-readable encoding is actually a different algorithmic encoding
for each Content-type it is used with.
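
As a rough illustration of that last point, here is a minimal sketch
in Python. The "&xx" escape syntax, the three-entry mnemonic table and
the function name are all invented for the example and are not taken
from Keld's proposal; Python's Unicode strings merely stand in for the
abstract collection of glyphs. ISO 8859-1 and IBM code page 865 are
used only as two character sets that place the same Danish letters at
different octets.

```python
# Hypothetical mnemonic table: glyph -> two-character ASCII name.
# The real notation would be fixed by the RFC; this one is made up.
MNEMONICS = {
    "\u00f8": "o/",  # o with stroke
    "\u00e5": "aa",  # a with ring above
    "\u00e6": "ae",  # ae ligature
}


def quoted_readable_encode(octets: bytes, charset: str) -> str:
    """Turn octets in `charset` into ASCII text with glyph escapes.

    The octet-to-glyph step depends on `charset`, so the same encoding
    name covers a different octet-level algorithm per character set.
    """
    out = []
    for glyph in octets.decode(charset):
        if glyph == "&":
            out.append("&&")                    # literal ampersand
        elif ord(glyph) < 128:
            out.append(glyph)                   # plain ASCII passes through
        else:
            out.append("&" + MNEMONICS[glyph])  # KeyError if not in table
    return "".join(out)


# The same name, stored as different octets in two character sets.
in_latin1 = b"J\xf8rn"   # ISO 8859-1: o-with-stroke is octet 0xF8
in_cp865 = b"J\x9brn"    # code page 865: o-with-stroke is octet 0x9B

print(quoted_readable_encode(in_latin1, "latin-1"))  # J&o/rn
print(quoted_readable_encode(in_cp865, "cp865"))     # J&o/rn
```

Both inputs yield the same ASCII text although the octets differ,
which is the sense in which the octet-level algorithm varies with the
Content-type while the glyph sequence stays fixed.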

The nice thing about this is that: (a) if the recipient has nothing but
a dumb ascii terminal the message will still be quite readable; and (b)
if the recipient UA/mail-reader understands the glyphs but not the 
particular Content-type, it can convert the message to a locally
understood Content-type and know that the sequence of glyphs is
preserved.
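
Point (b) can be sketched from the receiving side as follows, again
in Python and again under invented assumptions: the "&xx" escape
syntax, the mnemonic table and the function name are hypothetical, and
Python's Unicode strings stand in for the glyph collection. The
receiver needs only the one table plus its own local character set; it
never has to know which character set the sender used.

```python
# Hypothetical mnemonic table: two-character ASCII name -> glyph.
MNEMONICS = {
    "o/": "\u00f8",  # o with stroke
    "aa": "\u00e5",  # a with ring above
    "ae": "\u00e6",  # ae ligature
}


def quoted_readable_decode(text: str, local_charset: str) -> bytes:
    """Turn escaped ASCII text into octets of the reader's charset."""
    glyphs = []
    i = 0
    while i < len(text):
        if text[i] != "&":
            glyphs.append(text[i])      # plain ASCII passes through
            i += 1
        elif text[i + 1:i + 2] == "&":
            glyphs.append("&")          # "&&" is a literal ampersand
            i += 2
        else:
            # "&" followed by a two-character glyph name;
            # an unknown name raises KeyError in this sketch.
            glyphs.append(MNEMONICS[text[i + 1:i + 3]])
            i += 3
    return "".join(glyphs).encode(local_charset)


wire = "J&o/rn"  # still legible as-is on a dumb ASCII terminal

print(quoted_readable_decode(wire, "latin-1"))  # b'J\xf8rn'
print(quoted_readable_decode(wire, "cp865"))    # b'J\x9brn'
```

The two readers end up with different octets but the same sequence of
glyphs, and a reader with no decoder at all still sees "J&o/rn".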

I objected that users might be sending a message in which the glyphs
are actually unimportant and the sequence of octets is being used to
encode some binary information in a private way. Keld responded that
this is meant for use in the context of RFC-XXXX, and in that context
users should never do that. I might add that we are talking about
transformations that are taking place in the UA/mail-reader and thus
firmly under the user's control, so this should not be a problem.

The idea of a Content-encoding whose algorithmic meaning varies with
the Content-type takes a bit of getting used to, and is not compatible
with the ideal of making Content-type and Content-encoding orthogonal.
However, it is the Europeans who have the problem of multiple character
sets with the same glyph appearing in different ways in different sets.
They have a solution which can be accommodated into the RFC-XXXX framework
with a slight squeeze, and I believe it deserves careful consideration.

Bob Smart