ietf-822

Re: gzip-8bit

2003-02-24 05:08:51

Juergen Helbing wrote:

And because it seems necessary to re-specify yEnc entirely before it
is used as an official MIME encoding this might be the right time to
learn from the past and to do it better now.

Here's some procrastination (a gzip-8bit CTE) from my procrastination (a
new Usefor draft) from my real work.  As always, comments are
appreciated.

How amusing. I just submitted an Internet Draft describing almost exactly the
same thing. I've attached a copy of what I wrote below.

So who wants to do the merge?

                                Ned



Network Working Group                                           N. Freed
Internet-Draft                                          Sun Microsystems
Expires: August 25, 2003                               February 24, 2003


              Deflate-8bit and Deflate-base64: Compression
                  Content-Transfer-Encodings for MIME
                     draft-freed-mime-newenc-00.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 25, 2003.

Copyright Notice

   Copyright (C) The Internet Society (2003).  All Rights Reserved.

Abstract

   This document defines two additional MIME content-transfer-encodings,
   deflate-8bit and deflate-base64.  These CTEs add to MIME facilities
   for loss-less, adaptive, general-purpose compression.  The first of
   these, deflate-8bit, produces 8bit output, while the second,
   deflate-base64, produces the same sort of output as the base64
   content-transfer-encoding defined in RFC 2045.

Freed                   Expires August 25, 2003                 [Page 1]

Internet-Draft              Compression CTEs               February 2003


1. Introduction

   The MIME specification RFC 2045 [2] defines several
   Content-Transfer-Encodings:

   1.  7bit, used to label textual 7bit data,

   2.  8bit, used to label textual 8bit data,

   3.  binary, used to label binary data,

   4.  quoted-printable, normally used to transform 8bit textual data to
       7bit form, and

   5.  base64, normally used to transform binary data to 7bit form.

   All of these encodings produce output that is greater than or equal
   to the input data in length.  In particular, quoted-printable can
   expand its input to roughly 300% of the original size, and base64
   incurs a fixed overhead of about 33%.  This amount of overhead can be
   significant in some applications.
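
   The expansion of the existing encodings can be checked empirically.
   The following is a minimal sketch using Python's standard quopri and
   base64 modules; the input (3000 octets of value 255) is an
   illustrative assumption chosen as a worst case for quoted-printable,
   where every octet must be written as a three-character =XX sequence.

```python
import base64
import quopri

# Assumed sample input: octets that quoted-printable can never emit literally.
data = bytes([0xFF]) * 3000

# Each octet becomes "=FF"; soft line breaks add a little more on top.
qp = quopri.encodestring(data)

# base64 emits 4 characters for every 3 input octets (no line breaks here).
b64 = base64.b64encode(data)

qp_ratio = len(qp) / len(data)
b64_ratio = len(b64) / len(data)
print(f"quoted-printable: {qp_ratio:.2f}x  base64: {b64_ratio:.2f}x")
```

   Line breaks required by MIME add slightly to the base64 figure as
   well; b64encode() is used here only to isolate the 4:3 ratio.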

   This document defines two new CTEs that incorporate the popular
   deflate compression algorithm described in RFC 1951 [1].  The first
   of these, deflate-8bit, also incorporates a lightweight encoding
   based on the popular yEnc [6] encoding scheme.  The second,
   deflate-base64, applies base64 to the compressed data instead.  The
   resulting material is often smaller than the input even when the
   output range is restricted to the base64 alphabet.


2. Conventions Used In This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [3].


3. Deflate Compression

   The deflate compression format described in RFC 1951 [1], as used by
   the PKZIP and gzip compressors and as embodied in the freely and
   widely distributed zlib [Gailly95] library source code, has the
   following features:

   o  an apparently unencumbered encoding and compression algorithm,
      with an open and publicly available specification,

   o  a low-overhead escape mechanism for incompressible data,

   o  heavy use for many years in networks, on modem and other
      point-to-point links, to transfer files for personal computers
      and workstations, and

   o  2:1 compression easily achieved on the Calgary corpus [5] using
      less than 64KBytes of memory on both sender and receiver.


4. The Deflate-8bit Content-Transfer-Encoding

   The deflate-8bit encoding process consists of applying the deflate
   algorithm defined in RFC 1951 [1] to a MIME object in canonical form.
   Since all MIME objects are potentially independent of each other,
   the compressor's history MUST be cleared prior to performing the
   compression operation.

   The output of the deflate algorithm is binary data.  This binary
   data is then encoded as follows:

   1.  42 is added to each octet modulo 256.

   2.  If the resulting octet has the decimal value 61 (equals sign), 13
       (CR), 10 (LF), or 0 (NUL) it must be escaped.  This is done by
       prefixing the octet with an octet of value 64 and adding 64 to
       the resulting octet modulo 256.

   3.  A CRLF sequence MUST be inserted into the output after every 256
       octets of output.  A trailing CRLF MUST also appear at the end of
       the data.  Implementations MUST be tolerant of CRLFs being
       inserted between the escape prefix and the octet it modifies.

   4.  Additional octets MAY be escaped using the previously described
       procedure.  In particular, implementations SHOULD escape any
       octets with the values 32 (space) or 9 (tab) that would otherwise
       appear at the end of a line.

   8bit data is produced by this encoding process.  As such, this CTE
   can only be used in conjunction with transports capable of handling
   8bit data.

   Decoding consists simply of reversing the encoding process, that is,
   first reversing the encoding described above and then applying the
   inflate algorithm.
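
   The encoding and decoding steps above can be sketched in Python with
   the standard zlib module.  This is an illustration under stated
   assumptions, not a reference implementation.  The function names are
   hypothetical.  The sketch takes the prefix value 64 literally from
   step 2 and, relying on the latitude of step 4, also escapes 64 itself
   (so a decoder can treat every 64 as a prefix; yEnc avoids the issue
   by using 61 as its escape octet) and always escapes space and tab (so
   no line can end in whitespace).

```python
import zlib

ESCAPE_PREFIX = 64               # prefix octet as given in step 2
MANDATORY = {61, 13, 10, 0}      # "=", CR, LF, NUL (step 2)
# Assumptions of this sketch, both permitted by step 4: escape 64 itself
# and always escape space (32) and tab (9).
ESCAPED = MANDATORY | {ESCAPE_PREFIX, 32, 9}
LINE_LENGTH = 256                # octets per output line (step 3)


def raw_deflate(data: bytes) -> bytes:
    """Compress with a fresh compressor, so no history is carried over."""
    comp = zlib.compressobj(9, zlib.DEFLATED, -15)  # wbits=-15: raw RFC 1951
    return comp.compress(data) + comp.flush()


def armor(binary: bytes) -> bytes:
    """Apply steps 1 to 4 to the compressor output."""
    out = bytearray()
    for octet in binary:
        value = (octet + 42) % 256
        if value in ESCAPED:
            out += bytes([ESCAPE_PREFIX, (value + 64) % 256])
        else:
            out.append(value)
    # A break may fall between a prefix and its octet; step 3 allows that.
    lines = [out[i:i + LINE_LENGTH] for i in range(0, len(out), LINE_LENGTH)]
    return b"".join(bytes(line) + b"\r\n" for line in lines)


def dearmor(encoded: bytes) -> bytes:
    """Reverse armor(); CR and LF never carry data, so drop them first."""
    stream = iter(encoded.replace(b"\r", b"").replace(b"\n", b""))
    out = bytearray()
    for value in stream:
        if value == ESCAPE_PREFIX:
            value = next(stream, None)
            if value is None:
                raise ValueError("truncated escape sequence")
            value = (value - 64) % 256
        out.append((value - 42) % 256)
    return bytes(out)


def encode_deflate_8bit(body: bytes) -> bytes:
    return armor(raw_deflate(body))


def decode_deflate_8bit(encoded: bytes) -> bytes:
    return zlib.decompress(dearmor(encoded), -15)
```

   Removing every CR and LF before unescaping is what makes the decoder
   tolerant of a line break between the escape prefix and the octet it
   modifies.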


5. The Deflate-Base64 Content-Transfer-Encoding

   The deflate-base64 encoding process consists of applying the deflate
   algorithm defined in RFC 1951 [1] to a MIME object in canonical form.
   The result of the deflate compression operation is then further
   encoded using the base64 scheme defined in RFC 2045 [2].  Since all
   MIME objects are potentially independent of each other, the
   compressor's history MUST be cleared prior to performing the
   compression operation.

   The output of this process has the same range as the base64 CTE, and
   can be used with any transport.

   Decoding consists simply of reversing the encoding process, that is,
   first reversing the base64 encoding and then applying the inflate
   algorithm.
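
   The deflate-base64 process can be sketched in Python with the
   standard zlib and base64 modules.  The function names are
   hypothetical, and the choice of compression level 9 is an assumption
   of the sketch; a new compressor object is created for every object so
   that no history is carried over.

```python
import base64
import zlib


def encode_deflate_base64(body: bytes) -> bytes:
    """Deflate a MIME object (canonical form), then base64 the result."""
    comp = zlib.compressobj(9, zlib.DEFLATED, -15)  # wbits=-15: raw RFC 1951
    compressed = comp.compress(body) + comp.flush()
    # encodebytes() wraps at 76 characters using "\n"; use CRLF on the wire.
    return base64.encodebytes(compressed).replace(b"\n", b"\r\n")


def decode_deflate_base64(encoded: bytes) -> bytes:
    """Reverse the base64 encoding, then inflate."""
    # b64decode() discards characters outside the base64 alphabet by
    # default, which covers the CR and LF line breaks.
    return zlib.decompress(base64.b64decode(encoded), -15)
```

   For highly redundant input the encoded form can still be much smaller
   than the original, despite the base64 expansion of the compressed
   data.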


6. Appropriate Use

   As deflate-8bit produces 8bit material as output, it MUST NOT be used
   with transports that do not support 8bit, such as traditional SMTP.
   Happily, most SMTP transports currently support the 8BITMIME SMTP
   extension and hence can accommodate the use of deflate-8bit.

   Both deflate-8bit and deflate-base64 SHOULD only be used when the
   originator has some indication that the recipient can decode them.
   Note that this document does not specify a means by which such
   support can be indicated.


7. Security Considerations

   The deflate algorithm is complex and hence prone to implementation
   errors.  In particular, certain inflate implementations are known to
   not perform sufficient checking of their input stream and hence may
   be vulnerable to certain forms of attack.  Aside from this, the new
   content-transfer-encodings specified in this document are believed
   not to raise any security considerations not already present in MIME
   itself.
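
   One defensive measure a decoder might take, beyond relying on the
   inflate library's own input checking, is to bound the amount of
   output it will accept.  The following Python sketch is an
   illustration only: the function name and the 10 MB default limit are
   hypothetical, and this document does not mandate any such limit.

```python
import zlib

MAX_OUTPUT = 10 * 1024 * 1024    # hypothetical limit, chosen arbitrarily


def bounded_inflate(data: bytes, limit: int = MAX_OUTPUT) -> bytes:
    """Inflate a raw RFC 1951 stream, refusing oversized or partial input."""
    inflater = zlib.decompressobj(-15)           # wbits=-15: raw deflate
    # Ask for at most one octet more than the limit, so that an
    # over-long result is detected without being fully expanded.
    out = inflater.decompress(data, limit + 1)
    if len(out) > limit:
        raise ValueError("decompressed data exceeds limit")
    if not inflater.eof:
        raise ValueError("truncated or incomplete deflate stream")
    if inflater.unused_data:
        raise ValueError("trailing data after deflate stream")
    return out
```

   Structurally invalid streams are still reported by zlib itself, as a
   zlib.error exception.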


Normative References

   [1]  Deutsch, P., "DEFLATE Compressed Data Format Specification
        version 1.3", RFC 1951, May 1996.

   [2]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
        Extensions (MIME) Part One: Format of Internet Message Bodies",
        RFC 2045, November 1996.

   [3]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.


Informative References

   [4]  Freed, N., Klensin, J. and J. Postel, "Multipurpose Internet
        Mail Extensions (MIME) Part Four: Registration Procedures", BCP
        13, RFC 2048, November 1996.

   [5]  Bell, T. and I. Witten, "Text Compression", Prentice-Hall
        Englewood Cliffs NJ, 1990.

   [6]  Helbing, J., "yEncode - A quick and dirty encoding for
        binaries", http://www.yenc.org/yenc-draft.1.3.txt version 1.3,
        2002.


Author's Address

   Ned Freed
   Sun Microsystems
   1050 Lakes Drive
   West Covina, CA  91790
   USA

   Phone: +1 626 850 4350
   EMail: ned(_dot_)freed(_at_)mrochek(_dot_)com


Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   intellectual property or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; neither does it represent that it
   has made any effort to identify any such rights.  Information on the
   IETF's procedures with respect to rights in standards-track and
   standards-related documentation can be found in BCP-11.  Copies of
   claims of rights made available for publication and any assurances of
   licenses to be made available, or the result of an attempt made to
   obtain a general license or permission for the use of such
   proprietary rights by implementors or users of this specification can
   be obtained from the IETF Secretariat.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights which may cover technology that may be required to practice
   this standard.  Please address the information to the IETF Executive
   Director.


Full Copyright Statement

   Copyright (C) The Internet Society (2003).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assignees.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.
