Now, the main reason I'm answering you is the CRC issue: We all agreed
that a CRC for base64 would be a good thing, but we were unable to agree
on one. It is, however, the kind of thing that can be phased in later
without much pain, which is why we decided to defer it. (Imagine a
"Content-CRC:" header, for example.)
I don't think that would be a very good way to do it; I'd prefer to see it
as part of base64. That way you could "sprinkle" the checksums into the
encoded material as densely or as sparsely as you want.
Instead of doing this why not use message/partial to break the message up
into pieces that can have independent checksums?
I think the alternatives for quickie algorithms would boil down to "sum"
and "brik". Both are free, widely used and implemented, though brik is
implemented inside a rather large program. Of course, you could go for
some standard CCITT CRC algorithm, but it should be computed efficiently,
with XOR tables and register shifts.
First of all, the message integrity check (I am not going to call these things
checksums because that's not what they are, and I've had this pointed out to me
in no uncertain terms) issue in MIME is far from dead. We have simply elected
to not address it in the base document. This was resolved at the Santa Fe
meeting, where we concluded that if a simple and noncontroversial way of doing
it could be proposed we'd adopt it, otherwise it would have to wait for a
follow-on RFC. Several schemes were proposed subsequently, but none of them
went unchallenged, so the matter was dropped for the base MIME document.
But the issue is still alive and well, and there's no real problem with
addressing it in a separate document. (This does more or less mandate that the
information be placed in a header, since we don't want to make a basic change
to the encoding in a follow-on document, but I think we can live with that.)
That having been said, there are loads of other possibilities for a MIC
algorithm. Consider MD2, MD4, and MD5 for starters. Implementation descriptions
as well as nicely commented C code are available for all of these in either RFCs
or RFCs-to-be. There are various CRC algorithms that are even easier to
implement and are described in detail in various other RFCs, and C code is
available for these as well (I don't think code for them has appeared in an
RFC, but I could be wrong about that.)
I've never heard of either "sum" or "brik" before. I'll take you at your word
that they are widely implemented and used, but that's simply not good enough
for use in a standards-track RFC. A detailed description of the algorithm must
be available in a form that can be referenced. Another standard RFC is fine, as
is a readily available book that has been formally published. We could also
elect to describe the algorithm and its implementation ourselves. But reference
to an existing implementation is unacceptable. Period. I don't make the rules
about this, so there's no use complaining to me. If you like you can try
pounding on the IESG and IAB about their requirements in this area. But until
you can either demonstrate that one of these algorithms meets the reference
criteria I've laid out, or you can convince me that the IESG/IAB have changed
their position, I don't propose to consider these algorithms further.
However, there are numerous quality-assurance tests that need to be performed
on any prospective MIC algorithm. This is not, properly speaking, a task within
the purview of the 822 Working Group. The security working group has already
undertaken this task, and as far as I know has resolved that MD5 is the
algorithm of choice. I think it would be very likely that the IESG would simply
reject the choice of a different algorithm unless it was presented with
substantial evidence of its clear superiority over MD5. Again, this is not
my position, it is simply my reading of the position of others.
Finally, I would not be too surprised if the garbling you've seen is the
result of using uuencode instead of base64. The whole point of base64
is that it is much less prone to being garbled than uuencode! This
doesn't mean that a CRC is not a good idea, merely that the problem
you've had may be further evidence of the desirability of base64.
I doubt that this is the case. If it were, I would see the same problems
repeatedly at the same sites, as their feed would be garbling the data.
However, I have had several cases where people have mailed me saying
"I never thought I'd do this, but... part x of y of nasty-fornication.jpg
arrived at our site with a wrong checksum. It's never happened before !"
and once I've sent them the damaged part they have had other parts of
the image decode perfectly, and I've never heard from them since... even though
I've said "don't hesitate to get in touch again if you have these problems".
I don't think this response in any way shows that Nathaniel's original position
is incorrect. There may well be a bug that manifests itself intermittently in
the handling of uuencode that simply would not cause any problems with base64.
As a matter of fact, I know of just such a bug. I occasionally see concatenated
lines in postings, that is, the line break between them has been lost. (No, I
have no idea why this happens, it just does, and I've never noticed any pattern
to it.) Most uudecode implementations will not handle this properly (although
it is possible to make uudecode work correctly in this case -- the simple fact
that some will and some will not makes this even harder to track down). But
this will never cause any problems with a conformant base64 decoder. This
is inherent in the design of the encoding, of course.
But this is all irrelevant. The fact remains that uuencode is not suitable
for the entire e-mail world. It may even be perfectly suitable for news,
but that's a different world (and working group).