To: Dave Crocker <dcrocker@mordor.stanford.edu>
Subject: Re: Massive Content-Type definition ideas & Gopher
Date: Wed, 9 Jun 93 10:02:29 PDT
> How well do the wider character sets compress with LZU87? (is that the
> right algorithm identifier?) They might well compress at a greater ratio
> since their strings of repeated bits should be much longer than in ASCII.
It appears that LZ77 would work well with wide charsets.
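As a modern illustration of the point (a sketch, not part of the original exchange): Python's zlib module implements deflate, an LZ77 derivative, so we can compare how the same text compresses in ASCII versus a 16-bit-wide encoding, where the interleaved zero bytes give the compressor extra redundancy:

```python
import zlib

# Sketch: zlib's deflate is LZ77-based. Compare compression ratios for
# the same text in ASCII and in a 16-bit-wide encoding, whose interleaved
# zero bytes provide long runs of repeated bits for the compressor.
text = "The quick brown fox jumps over the lazy dog. " * 50
ascii_bytes = text.encode("ascii")
wide_bytes = text.encode("utf-16-le")  # twice the bytes, half of them zero

ascii_ratio = len(zlib.compress(ascii_bytes)) / len(ascii_bytes)
wide_ratio = len(zlib.compress(wide_bytes)) / len(wide_bytes)

print(f"ASCII:  compressed to {ascii_ratio:.1%} of original size")
print(f"UTF-16: compressed to {wide_ratio:.1%} of original size")
```

Both encodings compress well on redundant text; the interesting figure is the ratio relative to each original, which is what the question above is asking about.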
> A compromise which came to me this morning is:
> - Go ahead and define COMPRESSED-<encoding> encodings.
> - Define these to be a particular algorithm and either directly document
>   the algorithm or put a reference to the paper. Is the gzip algorithm
>   (LZU87?) the right one to use?
The gzip compression algorithm is a refinement of LZ77, and is the same
'deflate' algorithm used by PKZIP, Info-ZIP and probably other *zip
programs. (The file formats, however, are a bit different.)
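The shared-algorithm-but-different-container point can be demonstrated directly with a modern library (a sketch, not part of the original exchange): Python's gzip and zlib modules use the same deflate compressor, and only the wrappers around the compressed stream differ:

```python
import gzip
import zlib

data = b"MIME bodies could be deflated before base64 encoding. " * 20

# gzip container: magic bytes 0x1f 0x8b, then compression method 8 (deflate)
gz = gzip.compress(data)
assert gz[:2] == b"\x1f\x8b" and gz[2] == 8

# zlib container: same deflate payload, but a 2-byte header and an
# Adler-32 checksum instead of the gzip header/trailer
zl = zlib.compress(data)

# raw deflate stream with no container at all (negative wbits)
co = zlib.compressobj(wbits=-15)
raw = co.compress(data) + co.flush()

# All three wrap the same algorithm's output and round-trip identically.
assert gzip.decompress(gz) == data
assert zlib.decompress(zl) == data
assert zlib.decompress(raw, -15) == data
```

This is why "define the algorithm, not the file format" matters for an encoding definition: the deflate stream is the common core, and each tool adds its own framing.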
> - Define an *optional* parameter, `compression-algorithm=', to be used
>   if/when other compression algorithms are available. If not given then the
>   algorithm defaults to the one defined above. The compression-algorithm
>   names may well become Yet Another List which IANA governs.
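If such a parameter were adopted, a header might look like this (hypothetical syntax sketched purely for illustration; neither the encoding name nor the parameter is defined anywhere):

```
Content-Transfer-Encoding: compressed-base64; compression-algorithm=gzip
```

With the parameter omitted, the default algorithm defined in the encoding's specification would apply.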
I don't think we need another parameter. The number of compression
algorithms used in MIME is best kept small; even if we eventually have two
or three, we may as well define additional content-transfer-encodings to
handle them.