ietf-822

Re: Content-Transfer-Encoding and yEnc

2002-04-10 12:24:00

ned <ned+ietf-822(_at_)mrochek(_dot_)com> writes:

> I suggest that these be a gzip-8bit (the difference in overhead between
> this and binary is insignificant, and the benefits of having line
> oriented data are huge)

I assume that by "8bit" here you mean something very similar to yEnc,
namely an encoding that only escapes those characters known to be unsafe
in a nearly binary-clean environment (generally NUL, CR, LF, and an escape
character) and adds CRLF pairs every so often.  If that's what you meant,
I recommend not using the term "8bit", as the MIME definition of the 8bit
CTE is something rather different.
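
As a rough illustration of the kind of encoding described above, here is a
minimal sketch in Python.  It is not yEnc itself (yEnc also adds an offset
to every byte, which is omitted here), and the escape character '=', the
+64 shift applied to escaped bytes, and the 76-column line length are all
assumptions made for the example.

```python
ESC = 0x3D                          # '=' as the escape character (assumption)
UNSAFE = {0x00, 0x0A, 0x0D, ESC}    # NUL, LF, CR, and the escape itself
LINE_LEN = 76                       # hypothetical maximum encoded line length


def encode(data: bytes) -> bytes:
    """Escape only the unsafe bytes and break the output into CRLF lines."""
    out = bytearray()
    col = 0
    for b in data:
        if b in UNSAFE:
            # Escaped form: ESC followed by the byte shifted by 64 (mod 256).
            chunk = bytes([ESC, (b + 64) % 256])
        else:
            chunk = bytes([b])
        # Never split an escape pair across a line break.
        if col + len(chunk) > LINE_LEN:
            out += b"\r\n"
            col = 0
        out += chunk
        col += len(chunk)
    return bytes(out)


def decode(text: bytes) -> bytes:
    """Reverse encode(): drop line breaks and undo the escapes."""
    out = bytearray()
    escaped = False
    for b in text:
        if escaped:
            out.append((b - 64) % 256)
            escaped = False
        elif b == ESC:
            escaped = True
        elif b in (0x0D, 0x0A):
            continue            # unescaped CR/LF only ever marks a line break
        else:
            out.append(b)
    return bytes(out)
```

For arbitrary binary input only about 4 of the 256 byte values get doubled,
so the expansion is a few percent once the line breaks are counted.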

> and gzip-base85 (the EBCDIC concerns that drove the use of a 64
> character alphabet versus an 85 character alphabet seem to be one of the
> few things that are truly no longer a concern for email).

Personally, I really don't see base85 as worth the work.  The amount of
gain over base64 is relatively small, and I doubt there are many
environments that will both implement a brand new encoding and that can't
use something similar to yEnc.

But I wouldn't argue strongly against it.
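
To put a number on "relatively small": base64 turns every 3 bytes into 4
characters and base85 turns every 4 bytes into 5, before any line breaks
are added.  The sketch below checks this with Python's standard library.
Note that base64.b85encode uses one particular 85-character alphabet; which
alphabet a gzip-base85 CTE would use is not specified in this thread, so
treat it only as a stand-in for measuring size.

```python
import base64

# 1200 bytes divides evenly into both 3-byte and 4-byte groups.
raw = bytes(i % 256 for i in range(1200))

b64 = base64.b64encode(raw)
b85 = base64.b85encode(raw)   # stand-in alphabet, see the note above


def overhead(encoded: bytes, original: bytes) -> float:
    """Fractional size increase of the encoded form over the original."""
    return len(encoded) / len(original) - 1.0


print(f"base64: {len(b64)} bytes, {overhead(b64, raw):.1%} overhead")
print(f"base85: {len(b85)} bytes, {overhead(b85, raw):.1%} overhead")
# base64: 1600 bytes, 33.3% overhead
# base85: 1500 bytes, 25.0% overhead
```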

>> If something like yEnc is introduced as another CTE, we'd have to
>> introduce it twice, once for uncompressed content and once for
>> compressed content... but I'm not sure that's really a problem.

> Why bother? I believe it is easy to generate "uncompressed gzip" if you
> really want to. And I'd rather solve the problem yEnc seeks to solve as
> part of all this...

In other words, your proposal would be to introduce a new yEnc-like
encoding *only* allowing compressed data, requiring all data encoded with
that CTE to be compressed or at least appear to be compressed?

I don't really like that idea, but I'm not sure I can put my finger on
why.  I guess it just seems unclean to "compress" data that can't actually
be compressed (such as a JPEG image or an MP3 file) just to fulfill the
requirements of the CTE.
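
For reference, "uncompressed gzip" is indeed easy to produce: at
compression level 0, zlib emits stored (uncompressed) deflate blocks inside
an ordinary gzip wrapper.  A minimal sketch with Python's gzip module,
using random bytes as a stand-in for already-compressed data such as a JPEG
or MP3:

```python
import gzip
import os

# Random bytes behave like already-compressed data: essentially incompressible.
data = os.urandom(1 << 20)   # 1 MiB

# Level 0 wraps the data in stored deflate blocks without compressing it.
stored = gzip.compress(data, compresslevel=0)

# Any ordinary gzip decoder can read the result.
assert gzip.decompress(stored) == data

# The cost is the gzip header and trailer plus a few bytes per stored block.
print(f"original: {len(data)} bytes, stored gzip: {len(stored)} bytes")
```

So the wrapper costs very little space; the objection above is about
cleanliness, not about size.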

>> I do think that we probably at least need to consider the need for two
>> types of compression.  gzip is a great default choice, sitting in a
>> fairly sweet spot between time and space tradeoffs, but bzip2 produces
>> significantly better compression if you're willing to take ~10x as long
>> to compress and I expect that people will start asking for it fairly
>> quickly.

> Humpf. I see the rationale, but I'm really uncomfortable with more than
> one compression.

Another factor worth considering here is sleeper patents, although gzip is
probably safe from that given how long it's been in use.  But compression
is a patent-rich field, so allowing for the possibility of more than one
algorithm may help in making the protocol a bit more law-proof.

> Well, there's a bit of a mess in the case of gzip-8bit and email and
> downgrading, but yes, gzip-base85 or gzip-base64 can both be managed
> IMO.

Aren't these issues essentially identical to the issues posed by RFC 3030
and able to be handled in the same way?  I presume that the yEnc-like
encoding that would be added could be handled by any server advertising
8BITMIME, and mail systems that want to worry about non-8bit-clean transit
paths could recode just like they would for a CTE of 8bit (although in
this case recoding to base64 is the obvious approach).

-- 
Russ Allbery (rra(_at_)stanford(_dot_)edu)             <http://www.eyrie.org/~eagle/>