mail-ng

Dates: the can of worms

2004-04-30 21:28:16

On Sat, 1 May 2004 06:58, Iljitsch van Beijnum wrote:
An interesting question is whether it's better to implement a binary
date format as a timestamp or a concatenation of
year/month/day/hour/second fields.

I can see that you've thought about this, but it seems you still haven't quite 
noticed how full of worms the "dates" can is. The last time I studied up on 
dates, I was prompted to write a short essay about it, which you may find 
amusing (and even informative).

  http://www.nutters.org/log/dating-game

(BTW I once implemented a date format as a floating point value
counting the days since Y2K, which gives good precision right now but
still allows dates far in the past and future. And ignoring the leap
second problem is easier.)

That's actually not a bad approach, and ignoring the leap seconds problem 
isn't as evil as some might think. We actually have (at least) two 
"universal" times: UTC and UT1. UT1 is a purely solar time, and always has 
86400 "seconds" per day. The fundamental unit of UT1 is the "day", where 
"day" is a rotation of the Earth relative to the sun, and all the smaller 
components are constant fractions of that day.

In UTC, the fundamental unit is the "atomic second" (which is what the SI 
nomenclature considers a "second" to be). Given the current rules of UTC, a 
"day" is 86400 seconds, with an optional one-second adjustment (plus or 
minus) on leap second days, which are presently at the end of March, June, 
September, and December, with the June and December days given higher 
priority. I say "presently", because the whole thing is defined by a 
committee which may well decide to change its collective mind in the future 
(as it has done in the past). Leap seconds are inserted into (or removed from)
UTC in order to keep it within 0.9s of UT1. It is not reasonable to project UTC
dates into the future (beyond the next leap second day), because leap seconds
are inserted or removed based on the vagaries of the Earth's rotation at the
time.

There's also TAI, which is UTC without the leap seconds. That is, it is based 
around the atomic second with exactly 86400 atomic seconds per day, but it 
drifts with respect to the solar day because it lacks leap seconds.
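
As a rough illustration of how UTC and TAI relate, here is a minimal Python
sketch. The table lists the days that ended with a positive leap second, from
1972 through 1998, and was typed in by hand for this example; a real program
should load it from an authoritative source. The names tai_minus_utc and
utc_day_length are hypothetical.

```python
from datetime import date

# UTC days that ended with an inserted leap second (23:59:60), 1972-1998.
# Assumption: hand-copied table, for illustration only.
LEAP_SECOND_DAYS = [
    date(1972, 6, 30), date(1972, 12, 31), date(1973, 12, 31),
    date(1974, 12, 31), date(1975, 12, 31), date(1976, 12, 31),
    date(1977, 12, 31), date(1978, 12, 31), date(1979, 12, 31),
    date(1981, 6, 30), date(1982, 6, 30), date(1983, 6, 30),
    date(1985, 6, 30), date(1987, 12, 31), date(1989, 12, 31),
    date(1990, 12, 31), date(1992, 6, 30), date(1993, 6, 30),
    date(1994, 6, 30), date(1995, 12, 31), date(1997, 6, 30),
    date(1998, 12, 31),
]

INITIAL_OFFSET = 10  # TAI - UTC in seconds on 1972-01-01


def tai_minus_utc(day):
    """Return TAI - UTC in seconds at the start of the given UTC day."""
    if day < date(1972, 1, 1):
        raise ValueError("this sketch only covers 1972 onward")
    # Count the leap seconds that were inserted before this day began.
    return INITIAL_OFFSET + sum(1 for d in LEAP_SECOND_DAYS if d < day)


def utc_day_length(day):
    """Return the number of SI seconds in the given UTC day."""
    return 86401 if day in LEAP_SECOND_DAYS else 86400


print(tai_minus_utc(date(2004, 4, 30)))    # 32
print(utc_day_length(date(1998, 12, 31)))  # 86401
```

For any day after the last table entry, the answer is only as good as the
table, which is exactly the projection problem described above.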

The above descriptions are a rather condensed version of the information given 
at the following page.

  http://tycho.usno.navy.mil/leapsec.html

Given the purpose and nature of date-stamps in email, I would advocate using 
UT1 as the time base. UTC is a PITA because of arbitrary leap seconds, and we 
don't have any use for the precision of atomic seconds in this application. 
TAI isn't useful for civilian timekeeping. In UT1 there are always 86400 
seconds in a day (simple! predictable!), and it provides a date-stamp which 
is within 0.9s of UTC, which strikes me as more than good enough for the job. 
(Please note that the "second" component of an RFC2822 time-of-day production 
is already optional.)
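
With a fixed 86400 seconds per day, converting between a fraction of a day
and a time of day is plain arithmetic. A minimal Python sketch follows; the
function names are hypothetical, and the fraction is assumed to lie in the
range 0 <= fraction < 1.

```python
SECONDS_PER_DAY = 86400  # constant on this time base: no leap seconds


def day_fraction_to_hms(fraction):
    """Split a fraction of a day into (hours, minutes, seconds)."""
    if not 0 <= fraction < 1:
        raise ValueError("fraction must be in [0, 1)")
    total = fraction * SECONDS_PER_DAY
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return int(hours), int(minutes), seconds


def hms_to_day_fraction(hours, minutes, seconds=0.0):
    """Combine a time of day into a fraction of a day."""
    return (hours * 3600 + minutes * 60 + seconds) / SECONDS_PER_DAY


print(day_fraction_to_hms(0.25))  # (6, 0, 0.0)
print(hms_to_day_fraction(6, 0))  # 0.25
```

A fraction of 0.1666 works out to roughly 03:59:54.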

Rather than use the number of days since the year 2000, however, I suggest 
using the Modified Julian Day, simply because it's a standard epoch that 
someone else already invented. A date can be expressed in the form
"MJD+53125.1666", and this would have meaning to historians who already use
MJD. 
This format is capable of covering the entire history of the human race, it 
isn't tied to a particular calendar, and it can be parsed with scanf(). The 
format generalises to other epoch-offsets, such as Unix time_t, which could 
be expressed "U+1083384342". (The two dates expressed aren't exactly the 
same, and I don't advocate actually using the Unix time_t epoch. I merely 
demonstrate that this is a general epoch-offset format, not specific to MJD.)
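
To make the epoch-offset idea concrete, here is a minimal Python sketch that
parses such a stamp and normalises it to MJD. It uses a regular expression
rather than scanf(). The EPOCHS table, the accepted syntax, and the function
names are assumptions made for this example, not a proposed grammar.

```python
import re

# Epoch name -> (MJD of the epoch's zero point, units per day).
# "MJD" and "U" are the two names used in the text; the table is hypothetical.
EPOCHS = {
    "MJD": (0.0, 1.0),        # days since 1858-11-17
    "U": (40587.0, 86400.0),  # seconds since 1970-01-01, which is MJD 40587
}

# An epoch name, a mandatory sign, then a decimal number.
STAMP = re.compile(r"^([A-Za-z]+)([+-]\d+(?:\.\d+)?)$")


def parse_stamp(text):
    """Return (epoch_name, offset) for a stamp such as 'MJD+53125.1666'."""
    match = STAMP.match(text.strip())
    if match is None:
        raise ValueError("not an epoch-offset stamp: %r" % text)
    name, offset = match.group(1), float(match.group(2))
    if name not in EPOCHS:
        raise ValueError("unknown epoch: %r" % name)
    return name, offset


def to_mjd(text):
    """Convert any supported stamp to a Modified Julian Day number."""
    name, offset = parse_stamp(text)
    zero, units_per_day = EPOCHS[name]
    return zero + offset / units_per_day


print(parse_stamp("MJD+53125.1666"))  # ('MJD', 53125.1666)
print(to_mjd("U+1083384342"))         # approximately 53126.170625
```

Supporting another epoch would only need one more row in the table.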

I have sample Perl code for conversion between MJD and Gregorian dates if 
anyone wants to see it. It's nothing spectacular.
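
For readers who want something to experiment with straight away, here is a
minimal Python sketch of the same conversion. It is not the Perl code
mentioned above. It leans on the standard datetime module, so it assumes the
proleptic Gregorian calendar and only covers years 1 through 9999; the
function names are hypothetical.

```python
import math
from datetime import date

# MJD 0 is 1858-11-17. date.toordinal() counts days with 0001-01-01 as day 1.
MJD_EPOCH_ORDINAL = date(1858, 11, 17).toordinal()


def gregorian_to_mjd(year, month, day):
    """Return the whole-number MJD of a Gregorian calendar date."""
    return date(year, month, day).toordinal() - MJD_EPOCH_ORDINAL


def mjd_to_gregorian(mjd):
    """Return (year, month, day) for an MJD, ignoring any fraction of a day."""
    d = date.fromordinal(math.floor(mjd) + MJD_EPOCH_ORDINAL)
    return d.year, d.month, d.day


print(gregorian_to_mjd(2004, 4, 30))  # 53125
print(mjd_to_gregorian(53125.1666))   # (2004, 4, 30)
```

Dates outside the datetime range would need the day-count arithmetic written
out by hand.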

