On Wed, 28 Apr 2004 04:40, Keith Moore wrote:
The RFC 822 date format is specified to a reasonable degree of
precision.
The precision of its specification is not the main issue. The main
issue is that it's a compromise between machine-readable and
human-readable.
I agree with that statement, but it's a direct result of the broader
design choice to make email messages be both machine-readable and
human-readable. Changing the date format without revisiting the larger
choice is probably wasted effort.
The grammar and corresponding parser for an RFC822 date is moderately
complex, compared to, say, two integers which represent the time and
the timezone, although the latter will not be at all meaningful to the
average man in the street. Most programming languages are trivially
capable of converting integers to and from strings, whereas an RFC822
date requires a modest amount of work to generate and/or parse. The
more work is involved, the greater the scope for error in
implementation (and ambiguity in specification).
Not so. The amount of work in decoding an integer (say a UNIX-style
time_t) to a date is approximately equal to the amount of work in
parsing an RFC822-style date, and it's at least as easy to botch
up the decoder as to botch the RFC 822 date parser. The amount of
work in encoding an integer date from yyyy/mm/dd/hh/mm/ss form
is more than the amount of work in encoding that same quantity
in RFC 822 format - and it's harder to get that encoder right.
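To make the comparison concrete, here is a minimal Python sketch of both
directions for both formats. The function names are made up for this
illustration, and the example date is arbitrary. The standard library hides
the calendar arithmetic behind one call in each case, so this shows only
that the two representations carry the same information, not how much work
an implementer without such a library would face.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def from_epoch(seconds: int) -> datetime:
    """Decode a UNIX-style time_t into a calendar date in UTC."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def from_rfc822(text: str) -> datetime:
    """Parse an RFC 822 style date string into a calendar date."""
    return parsedate_to_datetime(text)


def to_epoch(year, month, day, hour, minute, second) -> int:
    """Encode yyyy/mm/dd/hh/mm/ss (taken as UTC) as an integer."""
    moment = datetime(year, month, day, hour, minute, second,
                      tzinfo=timezone.utc)
    return int(moment.timestamp())


def to_rfc822(year, month, day, hour, minute, second) -> str:
    """Encode yyyy/mm/dd/hh/mm/ss (taken as UTC) in RFC 822 style."""
    moment = datetime(year, month, day, hour, minute, second,
                      tzinfo=timezone.utc)
    return format_datetime(moment)


# Both decoders arrive at the same instant.
same = from_epoch(1083127200) == from_rfc822("Wed, 28 Apr 2004 04:40:00 +0000")
```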
The advantage of integer dates is that they're easier to manipulate
once you have them in that form - to add offsets to them, compare
them to other dates, subtract one date from another to get the
time difference. Integer dates are also slightly more compact.
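A small Python sketch of those three operations, assuming the integers are
seconds since the UNIX epoch (the values and variable names here are
invented for the example):

```python
# Hypothetical timestamps, in seconds since 1970-01-01 00:00:00 UTC.
sent = 1083127200
received = sent + 90               # add an offset: 90 seconds later

one_week = 7 * 24 * 60 * 60
expires = sent + one_week          # add a larger offset

arrived_in_order = sent < received     # compare two dates
transit_seconds = received - sent      # subtract to get the difference
```

Each operation is plain integer arithmetic, with no calendar logic involved.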
Still, I'd agree that if we went to an entirely binary format for mail
(not trying to make it human readable in its on-the-wire form)
then we would probably do well to encode dates as integers.
But I don't think we'd see fewer implementation errors as a result
of that choice.
(one thing that wasn't botched was the use of timezones - it really
does help to know the sender's local time when a message was sent)
Funny -- I've always considered this particular piece of data to be
mostly a nuisance from an implementer's perspective. Do you show dates
in the recipient's time zone, or the time zones of the various
senders?
The sender's time zone.
Not to mention that back-converting a timezone to UTC for
localisation purposes increases the complexity of your time-related
function requirements.
Timezones are a pain. That's inherent in the fact that the earth
rotates and that (most) humans prefer to be active during the day
and sleep at night. I don't think we can fix that by defining
a new email protocol.
My suggestion for a good date format is *one number* which represents
the date as an offset of time units from a given epoch (although I'm
being deliberately non-specific about the time units and the epoch,
and whether the number is integer or floating point). A time zone is a
separate entity in most cases, descriptive of a *place* more than a
time. (It's an offset to convert universal time into time-of-day for
that place, to be precise.)
For the case of a future event, I'd almost agree with you, except
that a time zone is not a place, but a set of rules. You can have
more than one time zone in use at a particular place.
For the case of an event in the present or past, a simple offset
seems sufficient - you really only care about the UTC offset in effect
when the event occurred.
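As a rough Python sketch of that idea, a past event can be stored as one
number plus the UTC offset in effect at the time. The function name, the
choice of minutes for the offset, and the sample values are assumptions
made for this example only.

```python
from datetime import datetime, timedelta, timezone


def sender_local_time(epoch_seconds: int, utc_offset_minutes: int) -> datetime:
    """Render a past instant in the sender's local time.

    epoch_seconds is the instant itself; utc_offset_minutes is the fixed
    offset that was in effect for the sender when the event occurred.
    """
    fixed_zone = timezone(timedelta(minutes=utc_offset_minutes))
    return datetime.fromtimestamp(epoch_seconds, tz=fixed_zone)


# The same instant, as seen by senders at UTC+10:00 and UTC-04:00.
east = sender_local_time(1083127200, 600)
west = sender_local_time(1083127200, -240)
```

The two results compare equal as instants and differ only in the wall-clock
time they display. No zone rules are consulted, which is why this works for
the past but not for a future event.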
--
Regime change 2004 - better late than never.