Folks interested in encodings of this ilk might want to look at the
bencode/bdecode package we ship with C News. It's basically uuencode
done right...
Is there a published spec for this? Sounds like it would have been nice
to learn about this sooner...
No real specs on the format itself. A quick look at the code reveals
that it's simpler than I thought. Ignoring trivia like header conventions:
- The main alphabet is "A-Za-z0-9+-", 64 characters total.
- Data is encoded by assembling three 8-bit bytes into 24 bits, filling
the high end first, and then putting this out as four characters
conveying six bits each, starting from the high end.
- Each line is broken after its 77th character, to keep line length down.
- At end of data, any partial line is ended and two terminating lines
are generated. The first consists of a slash "/", a one-digit
number indicating how many bytes are left after the last 24-bit
clump, and the hexadecimal values of those bytes. The second
consists of a decimal total byte count, a space, and a hex 16-bit
CRC which the comments claim to be CRC-16.
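
For anyone who wants to experiment, the encoding steps listed above can be sketched in Python roughly as follows. This is an illustration, not the C News code. It assumes that the alphabet maps to the values 0 to 63 in the order written (A-Z, a-z, 0-9, "+", "-"), and that each leftover byte is written as two lowercase hex digits. It omits the second terminating line (byte count and CRC), because the exact CRC variant is not pinned down here. The names ALPHABET, LINE_LENGTH, and encode_body are made up for the example.

```python
# Assumed ordering: index in this string is the 6-bit value it stands for.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+-"
)
LINE_LENGTH = 77  # lines are broken after this many characters


def encode_body(data: bytes) -> str:
    """Encode data as body lines plus the "/" trailer line (no CRC line)."""
    full = len(data) - len(data) % 3  # bytes that form complete 24-bit clumps
    chars = []
    for i in range(0, full, 3):
        # Assemble three bytes into 24 bits, filling the high end first.
        clump = (data[i] << 16) | (data[i + 1] << 8) | data[i + 2]
        # Emit four characters of six bits each, starting from the high end.
        for shift in (18, 12, 6, 0):
            chars.append(ALPHABET[(clump >> shift) & 0x3F])
    text = "".join(chars)
    lines = [text[i:i + LINE_LENGTH] for i in range(0, len(text), LINE_LENGTH)]
    # Trailer: slash, one-digit leftover count, hex of the leftover bytes
    # (two lowercase hex digits per byte is an assumption).
    leftover = data[full:]
    lines.append("/" + str(len(leftover)) + leftover.hex())
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(encode_body(b"ABC\x01"), end="")
```

Note that 77 is not a multiple of 4, so a line break can fall in the middle of a four-character group. That is harmless, since the receiver skips newlines anyway.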
Receiving basically just reverses the process, ignoring any character
(including newlines) that is not in the main alphabet until the slash
is seen.
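
The receiving side can be sketched in Python along the same lines. Again this is only an illustration under assumptions, not the shipped code: the alphabet is taken to map to 0 to 63 in the order written, leftover bytes in the "/" line are taken to be two hex digits each, and the byte-count/CRC line is not checked. The names ALPHABET, VALUE_OF, and decode_body are hypothetical.

```python
# Assumed ordering: index in this string is the 6-bit value it stands for.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+-"
)
VALUE_OF = {ch: i for i, ch in enumerate(ALPHABET)}


def decode_body(text: str) -> bytes:
    """Decode body text up to and including the "/" trailer line."""
    out = bytearray()
    clump = 0   # bits accumulated so far
    nchars = 0  # alphabet characters in the current clump
    pos = 0
    while pos < len(text) and text[pos] != "/":
        value = VALUE_OF.get(text[pos])
        if value is not None:  # skip newlines and any other stray character
            clump = (clump << 6) | value
            nchars += 1
            if nchars == 4:  # four characters make three bytes
                out += clump.to_bytes(3, "big")
                clump = 0
                nchars = 0
        pos += 1
    if pos == len(text):
        raise ValueError("no '/' trailer line found")
    # Trailer: one-digit leftover count, then that many bytes in hex
    # (two hex digits per byte is an assumption).
    trailer = text[pos + 1:].split("\n", 1)[0]
    count = int(trailer[0])
    out += bytes.fromhex(trailer[1:1 + 2 * count])
    return bytes(out)


if __name__ == "__main__":
    print(decode_body("QU\nJD\n/101\n"))
```

A real receiver would also want to verify the byte count and CRC on the final line before trusting the result.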
This stuff has been used successfully in production news transfers (muchos
megabytes) over Bitnet mail.
Henry Spencer at U of Toronto Zoology
henry(_at_)zoo(_dot_)toronto(_dot_)edu utzoo!henry