Juergen Helbing wrote:
And because it seems necessary to re-specify yEnc entirely before it
is used as an official MIME encoding this might be the right time to
learn from the past and to do it better now.
Here's some procrastination (a gzip-8bit CTE) from my procrastination (a
new Usefor draft) from my real work. As always, comments are
appreciated.
How amusing. I just submitted an Internet Draft describing almost exactly the
same thing. I've attached a copy of what I wrote below.
So who wants to do the merge?
Ned
Network Working Group N. Freed
Internet-Draft Sun Microsystems
Expires: August 25, 2003 February 24, 2003
Deflate-8bit and Deflate-base64: Compression
Content-Transfer-Encodings for MIME
draft-freed-mime-newenc-00.txt
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on August 25, 2003.
Copyright Notice
Copyright (C) The Internet Society (2003). All Rights Reserved.
Abstract
This document defines two additional MIME content-transfer-encodings,
deflate-8bit and deflate-base64. These CTEs add facilities to MIME
for loss-less, adaptive, general-purpose compression. The first of
these, deflate-8bit, produces 8bit output,
while the second, deflate-base64, produces the same sort of output as
the base64 content-transfer-encoding defined in RFC 2045.
Freed Expires August 25, 2003 [Page 1]
Internet-Draft Compression CTEs February 2003
1. Introduction
The MIME specification RFC 2045 [2] defines several
Content-Transfer-Encodings:
1. 7bit, used to label textual 7bit data,
2. 8bit, used to label textual 8bit data,
3. binary, used to label binary data,
4. quoted-printable, normally used to transform 8bit textual data to
7bit form, and
5. base64, normally used to transform binary data to 7bit form.
All of these encodings produce output that is greater than or equal
to the input data in length. In particular, quoted-printable can
expand its input to roughly three times its original size (about 200%
overhead) and base64 incurs a fixed overhead of about 33%. This
amount of overhead can be significant in some applications.
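
The overhead figures above can be checked with a few lines of Python. This is only an illustrative sketch: the helper name overhead() and the sample data are made up for the example, and the input is chosen to be the worst case for quoted-printable, where every octet becomes a three-character =XX sequence.

```python
import base64
import quopri


def overhead(encoded: bytes, raw: bytes) -> float:
    """Return the size increase of `encoded` over `raw` as a fraction of `raw`."""
    return (len(encoded) - len(raw)) / len(raw)


# 384 octets, all with the high bit set, so quoted-printable must
# encode every single one of them as "=XX".
raw = bytes(range(128, 256)) * 3

b64 = base64.b64encode(raw)    # base64 without line breaks
qp = quopri.encodestring(raw)  # quoted-printable, soft line breaks included

b64_overhead = overhead(b64, raw)
qp_overhead = overhead(qp, raw)

print(f"base64 overhead:           {b64_overhead:.0%}")
print(f"quoted-printable overhead: {qp_overhead:.0%}")
```

Line breaks push the real figures slightly higher in both cases.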
This document defines two new CTEs that incorporate the popular
deflate compression algorithm described in RFC 1951 [1]. The first
of these, deflate-8bit, also incorporates a lightweight encoding
based on the popular yEnc [6] encoding scheme. The resulting
material is often smaller than the input even when the output range
is restricted to the base64 alphabet.
2. Conventions Used In This Document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [3].
3. Deflate Compression
The deflate compression format described in RFC 1951 [1], as used by
the PKZIP and gzip compressors and as embodied in the freely and
widely distributed zlib [Gailly95] library source code, has the
following features:
o an apparently unencumbered encoding and compression algorithm,
with an open and publicly-available specification,
o a low-overhead escape mechanism for incompressible data,
o heavy use for many years in networks, on modem and other
point-to-point links, to transfer files for personal computers and
workstations, and
o easily achieved 2:1 compression on the Calgary corpus [5] using
less than 64KBytes of memory on both sender and receiver.
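
As a rough illustration of the format, the following Python sketch produces and consumes a raw RFC 1951 stream with the zlib module. The function names deflate() and inflate() are hypothetical; the negative wbits value is zlib's way of selecting raw deflate output without the zlib or gzip wrapper. It says nothing about the memory figure quoted above.

```python
import zlib


def deflate(data: bytes, level: int = 9) -> bytes:
    """Compress `data` into a raw RFC 1951 deflate stream."""
    # wbits=-15 means: 32K window, no zlib header or checksum trailer.
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


def inflate(data: bytes) -> bytes:
    """Decompress a raw RFC 1951 deflate stream."""
    decompressor = zlib.decompressobj(-15)
    return decompressor.decompress(data) + decompressor.flush()


text = b"the quick brown fox jumps over the lazy dog\n" * 200
packed = deflate(text)
print(len(text), "->", len(packed), "octets")
```

Highly repetitive input like this compresses far better than 2:1; typical text will do less well.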
4. The Deflate-8bit Content-Transfer-Encoding
The deflate-8bit encoding process consists of applying the deflate
algorithm defined in RFC 1951 [1] to a MIME object in canonical form.
Since all MIME objects are potentially independent of each other the
compressor's history MUST be cleared prior to performing the
compression operation.
The output of the deflate algorithm is binary data. This binary data is
then encoded as follows:
1. 42 is added to each octet modulo 256.
2. If the resulting octet has the decimal value 61 (equals sign), 13
(CR), 10 (LF), or 0 (NULL) it must be escaped. This is done by
prefixing the octet with an octet of value 64 and adding 64 to
the resulting octet modulo 256.
3. A CRLF sequence MUST be inserted into the output after every 256
octets of output. A trailing CRLF MUST also appear at the end of
the data. Implementations MUST be tolerant of CRLFs being
inserted between the escape prefix and the octet it modifies.
4. Additional octets MAY be escaped using the previously described
procedure. In particular, implementations SHOULD escape any
octets with the values 32 (space) or 9 (tab) that would otherwise
appear at the end of a line.
8bit data is produced by this encoding process. As such, this CTE
can only be used in conjunction with transports capable of handling
8bit data.
Decoding consists simply of reversing the encoding process, that is,
first reversing the encoding described above and then applying the
inflate algorithm.
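
The steps above might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a reference implementation: the function and constant names are hypothetical, the escape prefix value 64 is taken literally from step 2, and the sketch uses the latitude of step 4 to escape the value 64 itself (so a literal 64 can never be mistaken for a prefix) and to escape every space and tab rather than only those at the end of a line.

```python
import zlib

OFFSET = 42        # step 1
ESCAPE = 64        # escape prefix value, as given in step 2
LINE_OCTETS = 256  # step 3
# Values required by step 2 (0, 10, 13, 61), plus extras permitted by
# step 4. Assumption: 64, 9 and 32 are always escaped in this sketch.
ESCAPED = frozenset({0, 10, 13, 61, ESCAPE, 9, 32})


def encode_deflate_8bit(body: bytes) -> bytes:
    # A fresh compressor has an empty history; wbits=-15 gives raw RFC 1951.
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    compressed = compressor.compress(body) + compressor.flush()

    encoded = bytearray()
    for octet in compressed:
        value = (octet + OFFSET) % 256
        if value in ESCAPED:
            encoded.append(ESCAPE)
            value = (value + 64) % 256
        encoded.append(value)

    # CRLF after every 256 octets of output and at the end of the data.
    # A break may fall between a prefix and the octet it modifies.
    lines = (encoded[i:i + LINE_OCTETS]
             for i in range(0, len(encoded), LINE_OCTETS))
    return b"".join(bytes(line) + b"\r\n" for line in lines)


def decode_deflate_8bit(encoded: bytes) -> bytes:
    # CR and LF never occur in the encoded data itself, so dropping them
    # first also handles a CRLF between a prefix and its octet.
    stream = iter(encoded.replace(b"\r", b"").replace(b"\n", b""))
    compressed = bytearray()
    for value in stream:
        if value == ESCAPE:
            following = next(stream, None)
            if following is None:
                raise ValueError("dangling escape prefix")
            value = (following - 64) % 256
        compressed.append((value - OFFSET) % 256)

    decompressor = zlib.decompressobj(-15)
    return decompressor.decompress(bytes(compressed)) + decompressor.flush()
```

Escaping the extra values costs a little space but keeps the decoder a single pass with no end-of-line special cases.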
5. The Deflate-Base64 Content-Transfer-Encoding
The deflate-base64 encoding process consists of applying the deflate
algorithm defined in RFC 1951 [1] to a MIME object in canonical form.
The result of the deflate compression operation is then further
encoded using the base64 scheme defined in RFC 2045 [2]. Since all
MIME objects are potentially independent of each other the
compressor's history MUST be cleared prior to performing the
compression operation.
The output of this process has the same range as the base64 CTE, and
can be used with any transport.
Decoding consists simply of reversing the encoding process, that is,
first reversing the base64 encoding and then applying the inflate
algorithm.
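
A minimal Python sketch of this encoding, using only the standard zlib and base64 modules, might look like the following. The function names are hypothetical, and the conversion of line endings to CRLF is an assumption about how the result would be placed in a message body.

```python
import base64
import zlib


def encode_deflate_base64(body: bytes) -> bytes:
    # A fresh compressor has an empty history; wbits=-15 gives raw RFC 1951.
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    compressed = compressor.compress(body) + compressor.flush()
    # encodebytes() emits 76-character lines ending in LF; use CRLF instead.
    return base64.encodebytes(compressed).replace(b"\n", b"\r\n")


def decode_deflate_base64(encoded: bytes) -> bytes:
    # b64decode() ignores characters outside the base64 alphabet by
    # default, which takes care of the line breaks.
    compressed = base64.b64decode(encoded)
    decompressor = zlib.decompressobj(-15)
    return decompressor.decompress(compressed) + decompressor.flush()


body = (b"Content-Type: text/plain\r\n\r\n"
        + b"All work and no play makes Jack a dull boy.\r\n" * 100)
encoded = encode_deflate_base64(body)
print(len(body), "->", len(encoded), "octets")
```

For compressible input such as this, the encoded form is smaller than the original even after the base64 expansion.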
6. Appropriate Use
As deflate-8bit produces 8bit material as output, it MUST NOT be used
with transports that do not support 8bit, such as traditional SMTP.
Happily, most SMTP transports currently support the 8bitMIME SMTP
extension and hence can accommodate the use of deflate-8bit.
Both deflate-8bit and deflate-base64 SHOULD only be used when the
originator has some indication that the recipient can decode them.
Note that this document does not specify a means by which such
support can be indicated.
7. Security Considerations
The deflate algorithm is complex and hence prone to implementation
errors. In particular, certain inflate implementations are known not
to perform sufficient checking of their input stream and hence may
be vulnerable to certain forms of attack. Aside from this, the new
content-transfer-encodings specified in this document are believed
not to raise any security considerations not already present in MIME
itself.
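
One defensive measure a decoder might take is to cap the amount of data it is willing to inflate, so that a small hostile input cannot expand without bound. The following Python sketch shows the idea; the function name bounded_inflate() and the limit parameter are hypothetical and are not part of this specification.

```python
import zlib


def bounded_inflate(data: bytes, limit: int) -> bytes:
    """Inflate a raw RFC 1951 stream, refusing to produce more than `limit` octets."""
    decompressor = zlib.decompressobj(-15)
    # Ask for at most limit + 1 octets: getting that many proves the
    # stream is larger than the caller is prepared to accept.
    output = decompressor.decompress(data, limit + 1)
    if len(output) > limit:
        raise ValueError("decompressed data exceeds limit")
    if not decompressor.eof:
        raise ValueError("truncated deflate stream")
    return output
```

Malformed input is reported by zlib itself as zlib.error, so a caller would normally catch both exception types.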
Normative References
[1] Deutsch, P., "DEFLATE Compressed Data Format Specification
version 1.3", RFC 1951, May 1996.
[2] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part One: Format of Internet Message Bodies",
RFC 2045, November 1996.
[3] Bradner, S., "Key words for use in RFCs to Indicate Requirement
Levels", BCP 14, RFC 2119, March 1997.
Informative References
[4] Freed, N., Klensin, J. and J. Postel, "Multipurpose Internet
Mail Extensions (MIME) Part Four: Registration Procedures", BCP
13, RFC 2048, November 1996.
[5] Bell, T. and I. Witten, "Text Compression", Prentice-Hall
Englewood Cliffs NJ, 1990.
[6] Helbing, J., "yEncode - A quick and dirty encoding for
binaries", http://www.yenc.org/yenc-draft.1.3.txt version 1.3,
2002.
Author's Address
Ned Freed
Sun Microsystems
1050 Lakes Drive
West Covina, CA 91790
USA
Phone: +1 626 850 4350
EMail: ned(_dot_)freed(_at_)mrochek(_dot_)com
Intellectual Property Statement
The IETF takes no position regarding the validity or scope of any
intellectual property or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; neither does it represent that it
has made any effort to identify any such rights. Information on the
IETF's procedures with respect to rights in standards-track and
standards-related documentation can be found in BCP-11. Copies of
claims of rights made available for publication and any assurances of
licenses to be made available, or the result of an attempt made to
obtain a general license or permission for the use of such
proprietary rights by implementors or users of this specification can
be obtained from the IETF Secretariat.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights which may cover technology that may be required to practice
this standard. Please address the information to the IETF Executive
Director.
Full Copyright Statement
Copyright (C) The Internet Society (2003). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assignees.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Acknowledgement
Funding for the RFC Editor function is currently provided by the
Internet Society.