Re: OT: Re: Less is more
2004-04-30 13:59:02
On 30-apr-04, at 21:13, Keith Moore wrote:
not really, because it's not representative of the kinds of errors
that programmers make when writing 822 date parsers.

Frode's field was syntactically valid (as far as I could see).

If someone of your experience can't say for sure then something isn't
right...

If that field occurred in an actual message generated by an actual MUA
I'd claim it was a programmer error even if it was syntactically valid
:) Anyone who put that in a shipping product ought to be sacked.

And what about the protocol designer? "Fool me once, shame on you. Fool
me twice, shame on me. Fool me 4294967295 times, shame on the IETF."

If a program can't parse Frode's field, it can't parse the RFC822 date
field syntax as specified.

True, but it could quite possibly parse 99.999% of the dates that occur
in actual use, including dates that aren't valid - at which point the
inability to parse dates is insignificant in comparison to failures
that are due to other problems. If you're concerned about reliability you
care about how well it works in actual use, not whether it handles
really obscure corner cases. (security concerns are an exception -
since crackers specifically look for corner cases.)

This sounds like a decent pragmatic approach to protocols which are so
complex or ambiguously defined that writing code that handles all
possible permutations is infeasible. However, when building a new
protocol it makes sense to create it such that the number of possible
permutations is small enough that it's possible to fully implement and
test them all.
An interesting question is whether it's better to implement a binary
date format as a timestamp or as a concatenation of
year/month/day/hour/minute/second fields. The advantage of a timestamp
is that it makes date comparisons easy and there is no ambiguity, as
every possible value is a valid date/time. Testing is also easy because
there are only three exceptional cases: underflow, overflow and
wraparound. But the problem with a timestamp is that it doesn't allow
for leap seconds, so the easy math advantage is pretty much fictional,
and a raw number is hard to read when debugging. (BTW I once implemented
a date format as a floating point value counting the days since Y2K,
which gives good precision right now but still allows dates far in the
past and future. And ignoring the leap second problem is easier.)
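
As a rough illustration of the two representations, here is a minimal Python sketch. It assumes the floating point format counts days from midnight UTC on 1 January 2000 and ignores leap seconds; the names Y2K, to_days, from_days and fields_valid are made up for this example and are not from any real implementation.

```python
from datetime import datetime, timedelta, timezone

# Assumed epoch for this sketch: midnight UTC on 1 January 2000.
Y2K = datetime(2000, 1, 1, tzinfo=timezone.utc)

def to_days(dt):
    """Encode an aware datetime as fractional days since Y2K (no leap seconds)."""
    return (dt - Y2K) / timedelta(days=1)

def from_days(days):
    """Decode fractional days since Y2K back into an aware datetime."""
    return Y2K + timedelta(days=days)

def fields_valid(year, month, day, hour, minute, second):
    """A field-based format must reject combinations such as 30 February."""
    try:
        datetime(year, month, day, hour, minute, second)
        return True
    except ValueError:
        return False

earlier = datetime(2004, 4, 30, 12, 0, 0, tzinfo=timezone.utc)
later = datetime(2004, 4, 30, 18, 0, 0, tzinfo=timezone.utc)

# With a single number, ordering is a plain numeric comparison.
assert to_days(earlier) < to_days(later)

# With separate fields, some bit patterns are simply not dates.
assert not fields_valid(2004, 2, 30, 0, 0, 0)
```

Every float decodes to some instant, so only the range limits need special handling, whereas the field format needs a validity check on every input.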
Another issue that is somewhat related: overhead. The message I'm
replying to is 3906 bytes long on my system, and 2768 of that is header.
Of the header, 299 bytes are date/time information (8 dates) and 398 are
host name/address info for 7 "received" lines. Now if we encode this in
binary, a "received" line can be timestamp, IP address, IP address = 12
bytes; add another 8 for overhead and that's 7 x 20 = 140 bytes rather
than 665, which saves 525 bytes, or about 19% of the header. We can save
even more by removing redundant information such as localhost received
lines and software advertising.
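
To make the 12-byte figure concrete, here is a hedged Python sketch of one possible binary "received" record. The layout (a 32-bit timestamp followed by two IPv4 addresses, big-endian) is an assumption for illustration only, as are the names RECEIVED, pack_received and unpack_received; the addresses are documentation-range placeholders and the timestamp is an arbitrary value.

```python
import ipaddress
import struct

# Assumed layout: 32-bit timestamp, sender IPv4, receiver IPv4, network order.
RECEIVED = struct.Struct("!I4s4s")   # 4 + 4 + 4 = 12 bytes
OVERHEAD = 8                         # per-record allowance used in the text

def pack_received(timestamp, from_ip, by_ip):
    """Pack one hop into the fixed 12-byte record."""
    return RECEIVED.pack(
        timestamp,
        ipaddress.IPv4Address(from_ip).packed,
        ipaddress.IPv4Address(by_ip).packed,
    )

def unpack_received(data):
    """Unpack a 12-byte record into (timestamp, from_ip, by_ip)."""
    timestamp, from_raw, by_raw = RECEIVED.unpack(data)
    return (
        timestamp,
        str(ipaddress.IPv4Address(from_raw)),
        str(ipaddress.IPv4Address(by_raw)),
    )

record = pack_received(1083333542, "192.0.2.1", "198.51.100.7")
hops = 7
binary_total = hops * (len(record) + OVERHEAD)   # 7 * (12 + 8) bytes
```

This sketch only covers IPv4; carrying IPv6 addresses would need 16 bytes per address or a type tag, which is part of the 8 bytes of overhead one might budget for.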
On top of that, a binary format with explicit length values is much
faster to parse (especially on disk, where you can seek over large
uninteresting parts) and less prone to buffer overflow problems. On the
other hand, the binary format must be simple enough that it can be
implemented and debugged easily, to avoid SNMP-like troubles.
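
The seek-over-uninteresting-parts idea can be sketched as follows. This is a hypothetical framing (16-bit field type plus 32-bit body length, big-endian), not a proposal for an actual wire format; HEADER, write_field and find_field are names invented for the example.

```python
import io
import struct

# Assumed framing: 16-bit field type, 32-bit body length, network order.
HEADER = struct.Struct("!HI")

def write_field(out, field_type, body):
    """Append one length-prefixed field to a binary stream."""
    out.write(HEADER.pack(field_type, len(body)))
    out.write(body)

def find_field(stream, wanted_type, max_len=1 << 20):
    """Return the body of the first field of wanted_type, or None if absent."""
    while True:
        head = stream.read(HEADER.size)
        if not head:
            return None                       # clean end of stream
        if len(head) < HEADER.size:
            raise ValueError("truncated field header")
        field_type, length = HEADER.unpack(head)
        if field_type != wanted_type:
            stream.seek(length, io.SEEK_CUR)  # skip the body without reading it
            continue
        if length > max_len:
            raise ValueError("declared length exceeds limit")
        body = stream.read(length)
        if len(body) != length:
            raise ValueError("truncated field body")
        return body

buf = io.BytesIO()
write_field(buf, 1, b"x" * 5000)   # large field we do not care about
write_field(buf, 2, b"hello")      # the field we want
buf.seek(0)
wanted = find_field(buf, 2)
```

The declared length is checked against a limit and against what was actually read before the body is used, which is the point about explicit lengths making overruns easier to avoid.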