ietf-822

Re: Last Call: 'The APPLICATION/MBOX Media-Type' to Proposed Standard

2004-08-29 14:03:15

Whether parsing mboxcl2 takes "much" more code than parsing mboxrd is a subjective measurement. Having written both types of parsers, I don't feel the additional code is significant at all.

The last time I wrote an mbox parser, I wrote it to expect both formats. If I found a content-length header in a message, I'd skip that many bytes. I would then look for the next From_ header. (If there were no content-length header, I just looked for the next From_ header.) This algorithm seemed to work extremely well.
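The approach described above might be sketched roughly as follows. This is an illustration only, not the code from the parser mentioned in the message: the function name `split_mbox`, the regular expression, and the fallback for an implausible Content-Length value are all assumptions made for the example.

```python
import re

# Hypothetical helper: matches a Content-Length header line, case-insensitively.
_CL = re.compile(rb"^content-length:[ \t]*(\d+)[ \t]*\r?$", re.I | re.M)

def split_mbox(data: bytes) -> list[bytes]:
    """Split raw mbox bytes into messages, each including its From_ line."""
    messages = []
    start = 0
    while start < len(data):
        # The header block ends at the first blank line.
        hdr_end = data.find(b"\n\n", start)
        if hdr_end == -1:
            hdr_end = len(data)
        search_from = hdr_end
        m = _CL.search(data, start, hdr_end)
        if m:
            # Skip the declared number of body bytes, but only if that
            # stays inside the file (assumption: otherwise ignore it).
            skip_to = hdr_end + 2 + int(m.group(1))
            if skip_to <= len(data):
                search_from = skip_to - 1
        # Either way, the next message starts at the next From_ line.
        nxt = data.find(b"\nFrom ", search_from)
        end = len(data) if nxt == -1 else nxt + 1
        messages.append(data[start:end])
        start = end
    return messages
```

With a correct Content-Length, an unquoted "From " line inside the body is skipped over rather than treated as a separator; without the header, the function simply splits at each From_ line.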

        Tony Hansen
        tony(_at_)att(_dot_)com

D. J. Bernstein wrote:

Charles Lindsey writes:

you find it has been corrupted by all the '>' stuffing


Certainly the old mboxo format should be nuked---but there are other
mbox variants that don't corrupt messages.

The mboxrd format inserts ">" in front of any line beginning "From ",
">From ", ">>From ", etc., and then inserts a "From " line at the top.
This is trivially reversible. Readers recognize "From " as the start of
a new message; readers strip ">" from ">From ", ">>From ", etc.
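A minimal sketch of the quoting rule just described, to show why it is reversible. The function names `mboxrd_quote` and `mboxrd_unquote` are made up for this illustration, and the sketch assumes the message is held as bytes with "\n" line endings.

```python
import re

# A line needs quoting if it is zero or more ">" followed by "From ".
_QUOTE = re.compile(rb"^(>*From )", re.M)
# A line is unquoted by removing exactly one leading ">".
_UNQUOTE = re.compile(rb"^>(>*From )", re.M)

def mboxrd_quote(body: bytes) -> bytes:
    """Writer side: prefix one ">" to every From_-like line."""
    return _QUOTE.sub(rb">\1", body)

def mboxrd_unquote(body: bytes) -> bytes:
    """Reader side: strip one ">" from every quoted From_-like line."""
    return _UNQUOTE.sub(rb"\1", body)
```

Because every matching line gains exactly one ">" on writing and loses exactly one on reading, unquoting a quoted message gives back the original bytes, and no line of the quoted message begins with "From ".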

The competition is mboxcl2, which has Content-Length and doesn't have >
quoting. There are two big disadvantages to mboxcl2. First, parsing
mboxcl2 takes much more code than parsing mboxrd. Second, if you write
mboxcl2 and the client expects mboxrd, you're risking having one message
split into two, which is much worse than the maximum damage from writing
mboxrd.

---D. J. Bernstein, Associate Professor, Department of Mathematics,
Statistics, and Computer Science, University of Illinois at Chicago


