Re: interoperability

Keith Moore writes:

Keld, it's not clear that mnemonic is any friendlier to the installed base
(when taken as a whole) than either quoted-printable or ISO-2022-INT-*. 
Mnemonic works for plain text in certain languages (but maybe not so well for
technical text or programming language source code).  Ohta-san's approach
works okay for people who are displaying messages on terminals that support
ISO-2022 escape sequences.  Either of them will work if there is appropriate
decoding/display software on the recipient end.  Neither of them works well
for everyone, and neither of the schemes will work (without additional
encoding) for message headers.


I believe it is an engineering exercise. That is, we should look at
how well the different approaches work for the different cases.
The criterion is: what works best for the users? That is, CPU cycles
and programmer time are less important, at least at the order of
complexity we are talking about here.

I am arguing for using mnemonic, forgive me for taking that
point of view in the following...

In the case QP vs MNEM, I would say that for each QP encoding,
I can find a better MNEM encoding, in the sense that a reader would
more easily understand it. QED.
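
To make the QP vs MNEM comparison concrete, here is a minimal Python sketch.
It assumes an "&" intro character and a tiny hand-written table of
two-letter codes for the Danish letters. The names MNEMONICS, INTRO,
to_mnemonic and to_quoted_printable are made up for this illustration; the
full table and the exact escaping rules should be taken from the mnemonic
spec itself, not from this sketch.

```python
import quopri

# Assumption: a hand-picked subset of a mnemonic table, for illustration only.
MNEMONICS = {"æ": "ae", "ø": "o/", "å": "aa", "Æ": "AE", "Ø": "O/", "Å": "AA"}
INTRO = "&"  # assumed intro character that announces a two-letter code


def to_mnemonic(text):
    """Rewrite non-ASCII letters as INTRO + two-letter code; keep ASCII as-is."""
    out = []
    for ch in text:
        if ch == INTRO:
            out.append(INTRO + INTRO)  # assumed escape for a literal intro character
        elif ch in MNEMONICS:
            out.append(INTRO + MNEMONICS[ch])
        elif ord(ch) < 128:
            out.append(ch)
        else:
            raise ValueError("no code in this small table for %r" % ch)
    return "".join(out)


def to_quoted_printable(text):
    """Encode the same text as ISO 8859-1 bytes in quoted-printable."""
    return quopri.encodestring(text.encode("latin-1")).decode("ascii")


sample = "Rødgrød med fløde"
print(to_mnemonic(sample))          # R&o/dgr&o/d med fl&o/de
print(to_quoted_printable(sample))  # R=F8dgr=F8d med fl=F8de
```

A reader without any decoding software can still guess at "&o/" as an o with
a stroke, while "=F8" only means something next to a code table.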

In the case MNEM vs INT-2022, well for the equipment honoring
2022, I would believe INT-2022... works well, but then the
mnemonic encoding is not in conflict with ISO 2022 (I am not sure of
the technical aspects of Ohta-san's spec). The INT-2022 has special
requirements (that the hardware/software understands 2022), which
mnemonic does not have. mnemonic is more basic, and in principle
it does not expect more than ASCII, while richer character sets
can be employed where available.

I do not see the problems with scientific texts nor program texts
with respect to MNEM. Examples?

I believe MNEM would work for everyone, in the area it is intended for,
text/plain and maybe some enhanced text environments. I do not
see that it is less versatile here than QP.

Message headers are another story, which need additional work
to get right.

Quoted-printable looked like a reasonable compromise when it was proposed,
but didn't look as good when it started cropping up everywhere.  I'm not sure
that either mnemonic or iso-2022-int-* would fare any better when subjected
to the same scrutiny.  Do your users like mnemonic even if they were already
accustomed to seeing "the real thing" on their screens?  Didn't they
complain when their real characters went away?


Users want "the real thing". We did not make "the real thing"
go away, if we could still provide it. Our customer base is a mixed
world of UUCP and SMTP sites, some with PC LANs behind them, and MS mail
etc. out there. For the sites that we were MX-ing for, we recorded
their preferred charset and converted back-and-forth - other full
SMTP sites installed our software (IDA sendmail 5.67b).
There are other Internet email providers in Denmark, and we did not
provide this service to the others' customers, except when we got severe
complaints from our own customers. We have been using 7-bit Danish
ASCII mail for 11 years, so there was inherently some confusion
between Danish and ASCII 7-bit codes. So I do not think users felt
their "real things" going away. Rather, they experienced the change to
the 8-bit world and its additional capabilities, which might even be working
"right". Compared to the old world, say 5 years ago, when Danish email
was all 7-bit, and no confusion could occur, you can say that we
are now in a more confusing situation with regard to our national
letters. Anyway, with mnemonic software, the situation is IMHO not as bad
as in Sweden, where mnemonic is not so widespread, although they have very
similar requirements for their language, and a similar email history.

The main complaints from customers have been 7/8 bit problems, stemming
from confusion over 7-bit code positions being used both for national
letters and for ASCII characters.
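
The 7-bit confusion can be sketched in a few lines of Python. The mapping
below is an assumption: it models only the six code positions where the
Danish 7-bit national variant is commonly described as carrying the letters
Æ Ø Å æ ø å instead of the ASCII brackets, backslash, braces and bar. The
names DANISH_7BIT and read_as_danish are made up for this illustration.

```python
# Assumption: only these six positions differ; other positions are not modeled.
DANISH_7BIT = str.maketrans("[\\]{|}", "ÆØÅæøå")


def read_as_danish(text):
    """Show 7-bit text the way a terminal set up for Danish letters would."""
    return text.translate(DANISH_7BIT)


danish_word = "bl}b{rgr|d"      # the bytes a Danish 7-bit site would send
c_statement = "a[i] = b | c;"   # the same code positions used as plain ASCII

print(read_as_danish(danish_word))  # blåbærgrød
print(read_as_danish(c_statement))  # aÆiÅ = b ø c;
```

Nothing in the bytes themselves says which reading was intended, which is
why the receiving side has to know or guess the sender's charset.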


Or did your users have the luxury of having special software installed
(either at their MTA or UA) to handle mnemonic and translate it to their
local format?  If so, it's not a fair comparison with quoted-printable.


There was some software installed centrally at the MX-ing machines,
and some other places. We did our best to support our customers wrt
charset support. Still a considerable number of customers have
been exposed to mnemonic, and also non-customers....


I *like* mnemonic.  I would like to see it widely implemented.  If users
really like it enough better than bare MIME, it will become a de facto
standard, and everyone's mail user agent will support it.  (Is there a
freely-available mnemonic<->8859/* translator that could be plugged into
.mailcap files?  That would do more towards getting mnemonic to be accepted
than anything else...)


There are freely available translators; I have not tried to put
one into a .mailcap. I know I should work on it....
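
As a rough idea of what such a translator does, here is a minimal Python
sketch of the decoding direction. It is not any of the existing translators.
It assumes an "&" intro character, a doubled "&&" for a literal ampersand,
and a small hand-written table; TABLE, INTRO and from_mnemonic are names
made up for this illustration. Where the target charset lacks a letter, the
mnemonic is left in place, so plain ASCII displays still get readable text.

```python
INTRO = "&"  # assumed intro character
# Assumption: a hand-picked subset of a mnemonic table, for illustration only.
TABLE = {"ae": "æ", "o/": "ø", "aa": "å", "AE": "Æ", "O/": "Ø", "AA": "Å"}


def from_mnemonic(text, target="latin-1"):
    """Replace INTRO + two-letter code with the real letter if target has it."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != INTRO:
            out.append(ch)
            i += 1
            continue
        if text[i + 1:i + 2] == INTRO:   # assumed escape for a literal intro
            out.append(INTRO)
            i += 2
            continue
        code = text[i + 1:i + 3]
        real = TABLE.get(code)
        if real is None:                 # unknown code: pass the intro through
            out.append(ch)
            i += 1
            continue
        try:
            real.encode(target)          # can the target charset show it?
            out.append(real)
        except UnicodeEncodeError:
            out.append(INTRO + code)     # no: keep the readable mnemonic
        i += 3
    return "".join(out)


print(from_mnemonic("R&o/dgr&o/d med fl&o/de"))                  # Rødgrød med fløde
print(from_mnemonic("R&o/dgr&o/d med fl&o/de", target="ascii"))  # R&o/dgr&o/d med fl&o/de
```

Wrapped as a stdin-to-stdout filter, something like this could be named in a
.mailcap entry; the entry itself is left out here since I have not tried it.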

(By the same token, I think the ISO-2022-INT-* scheme is also clever, and
makes a good solution for a different set of people...but it isn't
sufficiently general to replace either quoted-printable encoding in message
bodies or 1522 encoded-words in headers.)


I am sure it works great in environments with 2022 capabilities.

But the nice thing about MIME is that it is extensible enough to allow
new character sets and new content-types to be defined...

You can think of bare MIME as a "worst case" solution.  It is somewhat ugly
and it is somewhat painful to migrate to it, but it is also very general and
it provides a smooth mechanism to migrate to future extensions.  I fully
expect that in a few years, we won't be using the same content-types that we
typically use today...by then, "text/plain" may be relegated to shipping
around program source files, we will use something completely different for
human-readable text that provides a combination of multilingual character set
support, appearance information like what (richtext, enriched, simpletext) try to
provide now, and maybe semantic information also.


Yes, things are evolving, but there will be a need for interoperability
for a long time down to RFC822 US-ASCII capability. And when we have
got rid of that, then there will be an interoperability problem
down to ISO-8859-1 and ISO-8859-2 and ISO..., and even UCS-2 (Unicode)
may be considered unreasonably limited and ugly at some time in
the future. Mnemonic was designed to make a smooth migration with
respect to character representation through this whole range of
character sets, in an environment like the Internet.

keld

My recommendation is thus that we, when doing enhanced character set
support for news, then use the mnemonics scheme as our basic exchange
plain text format. This is currently also possible in MIME, and we should
use it there too.


I would like to see mail and news be as similar as possible.  If you 
can get *users* to embrace mnemonic, it should fit in just fine with
both mail and news -- if both are based on MIME.

But you may find that it takes more time to get widespread acceptance
of mnemonic, than it does to deploy real MIME mail readers!

Keith