ietf-822

encodings, base-64, etc.

1991-10-29 12:10:17
Folks interested in encodings of this ilk might want to look at the
bencode/bdecode package we ship with C News.  It's basically uuencode
done right...

Is there a published spec for this?  Sounds like it would have been nice
to learn about this sooner...

No real specs on the format itself.  A quick look at the code reveals
that it's simpler than I thought.  Ignoring trivia like header conventions:

- The main alphabet is "A-Za-z0-9+-", 64 characters total.

- Data is encoded by assembling three 8-bit bytes into 24 bits, filling
        the high end first, and then putting this out as four characters
        conveying six bits each, starting from the high end.

- Each output line is broken after its 77th character, to keep line length down.

- At end of data, any partial line is ended and two terminating lines
        are generated.  The first consists of a slash "/", a one-digit
        number indicating how many bytes are left after the last 24-bit
        clump, and the hexadecimal values of those bytes.  The second
        consists of a decimal total byte count, a space, and a hex 16-bit
        CRC which the comments claim to be CRC-16.
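
For concreteness, here is a rough Python sketch of the encoding side as described above.  It is not the C News code.  It assumes the alphabet is indexed in the order written (A=0 ... Z=25, a=26 ... 9=61, +=62, -=63) and that the leftover bytes are written as lowercase two-digit hex with no separators; neither detail is confirmed here.  The function names are made up for illustration, and the byte-count/CRC line is left out because the exact CRC parameters are not given.

```python
import string

# Assumption: values 0..63 map onto the alphabet in the order it is written.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+-"
LINE_WIDTH = 77  # break position given in the description


def encode_clumps(data):
    """Encode all whole 3-byte clumps; return (characters, leftover bytes)."""
    full = len(data) - len(data) % 3
    out = []
    for i in range(0, full, 3):
        # Fill the 24-bit word from the high end: first byte lands in bits 23..16.
        word = (data[i] << 16) | (data[i + 1] << 8) | data[i + 2]
        # Emit four 6-bit groups, highest group first.
        for shift in (18, 12, 6, 0):
            out.append(ALPHABET[(word >> shift) & 0x3F])
    return "".join(out), data[full:]


def break_lines(chars, width=LINE_WIDTH):
    """Split the character stream into lines of at most `width` characters."""
    return [chars[i:i + width] for i in range(0, len(chars), width)]


def leftover_line(leftover):
    """Build the first terminating line: slash, count digit, hex of leftover bytes."""
    # Assumption: lowercase hex, two digits per byte, no separators.
    return "/" + str(len(leftover)) + leftover.hex()
```

With these assumptions, encode_clumps(b"Many") gives the characters "TWFu" plus one leftover byte, and leftover_line turns that byte into "/179".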

Receiving basically just reverses the process, ignoring any character
(including newlines) that is not in the main alphabet until the slash
is seen.
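
A matching sketch of the receiving side, again in Python and again only an illustration: it assumes the alphabet is indexed in the order written (A=0 through -=63), and decode_body is a hypothetical name.  It stops at the slash and hands back the rest of the text, so the terminating lines can be dealt with separately.

```python
import string

# Assumption: values 0..63 map onto the alphabet in the order it is written.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+-"
VALUE = {ch: i for i, ch in enumerate(ALPHABET)}


def decode_body(text):
    """Decode up to the first slash; return (bytes, remaining text from the slash)."""
    out = bytearray()
    word = 0
    count = 0
    for pos, ch in enumerate(text):
        if ch == "/":
            # The slash starts the terminating lines; stop decoding here.
            return bytes(out), text[pos:]
        if ch not in VALUE:
            continue  # newlines and any other stray characters are ignored
        word = (word << 6) | VALUE[ch]
        count += 1
        if count == 4:
            # Four 6-bit groups make one 24-bit clump, i.e. three bytes.
            out += word.to_bytes(3, "big")
            word = 0
            count = 0
    return bytes(out), ""
```

Skipping everything outside the alphabet is what lets the line breaks (and any padding a mail gateway adds) pass through harmlessly.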

This stuff has been used successfully in production news transfers (muchos
megabytes) over Bitnet mail.

                                         Henry Spencer at U of Toronto Zoology
                                          
henry(_at_)zoo(_dot_)toronto(_dot_)edu   utzoo!henry