ietf-822

Re: gzip/deflate compression/encoding

2005-06-30 20:50:49

Charles Lindsey wrote:
 
> Are the usual audio, image, etc. formats truly 8bit clean
> (i.e. are they guaranteed not to contain NUL or naked CR or
> LF)?

Of course not; this thread is about introducing yEnc as a CTE.
In pseudo-REXX (untested, ignoring trailing SP or HT issues):

   CRLF = x2c( 0D0A )
   BAD  = CRLF || x2c( 0 ) || '='
   OUT  = ''
   LIM  = 500 /* REXX idiosyncrasy on my side ;-) */
   
   do while INPUT \== '' /* strict comparison */
      parse var INPUT TOP 2 INPUT
      TOP = d2c(( c2d( TOP ) + 64 ) // 256 )
      
      if sign( pos( TOP, BAD )) then OUT = OUT || '='
      OUT = OUT || TOP
       
      if LIM <= length( OUT ) then do
         call charout /* stdout */, OUT || CRLF
         OUT = ''
      end
   end

   /* flush the last, partial line */
   if OUT \== '' then call charout /* stdout */, OUT || CRLF
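
For anyone who wants to try it without a REXX interpreter, here is
a rough Python rendering of the same loop. It is only a sketch: the
function name yenc_like_encode is made up for illustration, the
escaped octet is emitted unchanged after '=' exactly as in the
pseudo-code (real yEnc shifts the escaped octet by another 64, so
the critical octet never appears raw), and trailing SP / HT is
still ignored. It also flushes the final partial line.

```python
CRLF = b"\r\n"
BAD = CRLF + b"\x00" + b"="   # CR, LF, NUL and the escape character itself
LIM = 500                     # line length limit taken from the pseudo-code


def yenc_like_encode(data: bytes, lim: int = LIM) -> bytes:
    """Port of the pseudo-REXX loop; the name is hypothetical."""
    lines = []
    out = bytearray()
    for byte in data:
        top = (byte + 64) % 256      # same offset as the pseudo-code
        if top in BAD:               # critical octet: prefix it with '='
            out += b"="
        out.append(top)
        if lim <= len(out):          # line is full (lim or lim + 1 octets)
            lines.append(bytes(out) + CRLF)
            out.clear()
    if out:                          # flush the last, partial line
        lines.append(bytes(out) + CRLF)
    return b"".join(lines)
```

With this sketch, 250 octets of d2c(192) come out as one line of
500 octets plus CRLF, while 500 octets of 'A' also fill exactly one
line of 500 octets plus CRLF.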

> If not, then you are back to the 37+% expansion of base64.

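The worst case is slightly more than 50% overhead (every octet is
escaped, so the output is a bit more than twice the input) if the
entire input is d2c(192), d2c(202), d2c(205), or d2c(253).  The
same goes for runs of d2c(201) or d2c(224) with respect to the
trailing SP / HT issue.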

Ignoring the latter and using Bruce's formula, evenly distributed
input gives 260/256 * 502/500, or less than 102%.  Sometimes it's
503/501, but we don't need more precision while ignoring the
trailing HT / SP stuff.
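
The arithmetic can be checked with a few lines of Python. This is
only a sketch under the assumptions stated in this message: four
critical octets (NUL, CR, LF, '='), lines of 500 octets plus CRLF,
evenly distributed input, and no handling of trailing SP / HT. The
variable names are mine.

```python
from fractions import Fraction

ESCAPED = 4      # NUL, CR, LF and '=' each need a second octet
LIM = 500        # line length from the pseudo-code
CRLF_LEN = 2

# Evenly distributed input: 256 input octets become 260 on average.
escape_factor = Fraction(256 + ESCAPED, 256)
# Each line of LIM octets carries a CRLF on top.
line_factor = Fraction(LIM + CRLF_LEN, LIM)

average = escape_factor * line_factor   # output size / input size
worst = 2 * line_factor                 # every input octet escaped
worst_overhead = 1 - 1 / worst          # share of the output that is overhead

print(f"average output: {float(average):.2%} of input")
print(f"worst output:   {float(worst):.2%} of input")
print(f"worst overhead: {float(worst_overhead):.2%} of output")
```

The average stays below 102% of the input, and in the worst case
slightly more than half of the output is overhead.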

                        Bye, Frank