Re: 7/8-bit conversion vs. bouncing

 (i) Every smtp client that supports 8-bit transport MUST be able to 
support 7-bit transport and SHOULD be able to do 8->7 translation.  Not 
being able to do so is not nonconforming, but is really stupid.


you're right, i have a problem with this.  every 8-bit transport MUST be
able to do 8->7 translation.


I also think they should be able to do 8-bit to 8-bit conversions, for
the case where the two ends employ different 8-bit character sets.
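
A minimal sketch of what an 8-bit to 8-bit conversion involves, assuming
Python's built-in codecs and using its internal string type as the common
intermediate form. The function name convert_8bit and the choice of
ISO 8859-1 and code page 850 are only illustrative; they are not taken
from any draft.

```python
def convert_8bit(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode 8-bit text from character set `src` into `dst`.

    Assumption: both character sets are known to Python's codec registry.
    A character with no equivalent in `dst` raises UnicodeEncodeError
    instead of being silently dropped.
    """
    text = data.decode(src)   # 8-bit bytes -> common intermediate form
    return text.encode(dst)   # intermediate form -> target 8-bit set


# The same text is stored as different byte values in the two sets.
latin1_bytes = "café".encode("latin-1")
cp850_bytes = convert_8bit(latin1_bytes, "latin-1", "cp850")
round_trip = convert_8bit(cp850_bytes, "cp850", "latin-1")
```

The conversion is only loss-free when every character in the message
exists in both sets, which is why this sketch fails loudly rather than
substituting.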


the ideal i'm striving for is that we add the same functionality everywhere.
8-bit transports are being designed here in a time when we know all about the
7-bit transports, so taking the 7-bitters into account seems pretty cheap.


Agree.

i'm also striving for an incrementally valid solution which will have zero
cost in network bandwidth when the whole network is someday converted to
8-bit transports.  granted we have a permanent design complexity such that
even when the whole network has been converted, new 8-bit transport
implementations will have to be able to deal with nonexistent 7-bit transports
in order to be "compliant".  that's not cheap but i'm proposing it anyway.


Well, that is included in the design of the mnemonic (aka
"quoted-readable" in the draft) specification, for which I have
written a draft RFC. It is already implemented and available, and an
implementation has been tested with about 1 million production
messages going through it. It was not cheap, but it is done.

 The only thing that is wrong with doing it [vixie's] way is that we have 
moved beyond simple character mail and into very complicated stuff in 
which it may be very hard to guarantee "straightforward 1-to-1 
mappings".


"very hard" is not relevant.  you can represent any data structure in a
7-bit-wide data stream.  look at C-language source code for examples.
look at uuencode and atob for more examples.  it can be done and done
reliably.  (such a design would include an application-level checksum
for all the reasons clark et al have outlined in their end-to-end paper.)


And the stuff is designed, specified, implemented and distributed.
Some may call it "very hard"; I think the idea was simple.
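
The point quoted above, that any data can be carried in a 7-bit-wide
stream together with an application-level checksum, can be sketched as
follows. This assumes Python's binascii uuencode routines and a CRC-32
from zlib; the "crc32" trailer line and the names encode_7bit and
decode_7bit are invented for this example and belong to no existing
format.

```python
import binascii
import zlib


def encode_7bit(payload: bytes) -> str:
    """Armor arbitrary bytes as uuencoded ASCII lines plus a CRC trailer."""
    # b2a_uu accepts at most 45 bytes per call and returns one text line.
    lines = [
        binascii.b2a_uu(payload[i:i + 45]).decode("ascii")
        for i in range(0, len(payload), 45)
    ]
    # Hypothetical trailer: end-to-end checksum over the original bytes.
    return "".join(lines) + f"crc32 {zlib.crc32(payload):08x}\n"


def decode_7bit(armored: str) -> bytes:
    """Reverse encode_7bit and verify the checksum at the receiving end."""
    *body, trailer = armored.splitlines(keepends=True)
    payload = b"".join(binascii.a2b_uu(line) for line in body)
    expected = int(trailer.split()[1], 16)
    if zlib.crc32(payload) != expected:
        raise ValueError("checksum mismatch: message damaged in transit")
    return payload


original = bytes(range(256))          # every possible 8-bit value
armored = encode_7bit(original)
restored = decode_7bit(armored)
```

The checksum is computed over the original bytes rather than the
armored text, so it detects damage introduced anywhere between the two
ends, including in the encoding and decoding steps themselves.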

i would be willing to punt close-readability, though we could optimize
for the trivial (and common) case of 8-bit single-part text that just 
needs the 8th bit for accents and other non-ascii symbols.  anything
else could just be bitblasted into atob or uuencode or whatever structure-
dependent format people are currently considering.  (i will design this
part if no one else has done it.)


The mnemonic specification provides close-readability.
My current work covers almost all of the ECMA-registered character sets
and a lot of vendor-defined character sets: about 120 character sets
and 1310 different characters in all. It can convert between all these
character sets without any information loss.
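
A rough sketch of how a close-readable, loss-free 8->7 conversion can
work in principle. Everything specific here is an assumption made for
illustration: the five-entry table, the "&" introducer, and the "&&"
escape are not quoted from the draft, which should be consulted for the
real mnemonics and syntax.

```python
# Hypothetical two-character mnemonics for a handful of accented letters.
MNEMONICS = {"é": "e'", "ä": "a:", "å": "aa", "ø": "o/", "æ": "ae"}
REVERSE = {mnemonic: char for char, mnemonic in MNEMONICS.items()}
INTRO = "&"  # assumed introducer that marks the start of a mnemonic


def to_7bit(text: str) -> str:
    """Replace each non-ASCII character with INTRO plus its mnemonic."""
    out = []
    for ch in text:
        if ch == INTRO:
            out.append(INTRO + INTRO)          # escape a literal introducer
        elif ord(ch) < 128:
            out.append(ch)                     # plain ASCII passes through
        else:
            out.append(INTRO + MNEMONICS[ch])  # KeyError if not in the table
    return "".join(out)


def from_7bit(encoded: str) -> str:
    """Invert to_7bit exactly, so the round trip loses no information."""
    out = []
    i = 0
    while i < len(encoded):
        if encoded[i] != INTRO:
            out.append(encoded[i])
            i += 1
        elif encoded[i + 1] == INTRO:
            out.append(INTRO)                  # "&&" stands for one "&"
            i += 2
        else:
            out.append(REVERSE[encoded[i + 1:i + 3]])
            i += 3
    return "".join(out)


message = "blåbær & fløde"
wire = to_7bit(message)
back = from_7bit(wire)
```

The encoded form stays legible to a reader on a 7-bit terminal, while
the escape for the introducer keeps the mapping strictly one-to-one.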

[...], the odds of ending up with a no-loss 1-1 mapping are much
higher than if that converter has to read, parse, and understand a
complex body part structure.


either we design it correctly or we don't.  either the design is implemented
correctly or it isn't.  i don't understand "hard" as a design criterion here.
anything that is easier than X.mumblefrotz is not "hard" and is still a win.


And there are public implementations available right now of such
schemes.

NORDUnet NETF and EUnet have defined the mnemonic conversion as their
preferred conversion specification. I would ask the IETF to do the same.

Keld