procmail

Re: Parsing dates

2004-07-28 01:50:29
On Sun, Jul 25, 2004 at 04:39:54AM -0600, Google Kreme wrote:

No, the Date: field is wholly untrustworthy.

Well, you can adapt the script I posted to scan the Date: header instead.
It will be hard, because you can't predict the Date: header format with
certainty. Try running grep -e "^Date:" on your mail and doing a
quick visual scan to see how the formats stack up.
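
If the grep output is too long to eyeball, one option is to tally the
formats by shape. The following is only a rough sketch in Python, not
the script mentioned above; the function names date_shape and
tally_shapes are made up for illustration. Like the grep, it also picks
up body lines that happen to start with "Date:".

```python
import re
from collections import Counter

def date_shape(line):
    """Reduce a Date: header to a rough shape: digits -> 9, letters -> A."""
    value = line.split(":", 1)[1].strip()
    value = re.sub(r"[0-9]", "9", value)
    value = re.sub(r"[A-Za-z]", "A", value)
    # Collapse runs of whitespace so padding differences don't split a shape.
    return re.sub(r"\s+", " ", value)

def tally_shapes(lines):
    """Count how often each shape occurs among lines starting with 'Date:'."""
    return Counter(date_shape(l) for l in lines if l.startswith("Date:"))
```

Feeding it the lines of a mailbox file and printing
tally_shapes(...).most_common() gives a short list of shapes, most
frequent first.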

Here's a sample from my own mailboxes of some differing formats:

List Replies:Date: Tue, 02 Mar 2004 11:24:15 +1300
List Replies:Date: Wed, 03 Mar 2004 19:16:00 -0600
List Replies:Date: Wed, 3 Mar 2004 20:20:15 -0500 (EST)
List Replies:Date: Fri, 19 Mar 2004 05:50 -0800
List Replies:Date: Thu,  1 Apr 2004 23:11:29 +1000  
    (NB: there's a pad space before the 1)
List Replies:Date: 6 Apr 2004 10:32:04 -0000
List Replies:Date: 22 Mar 2004 10:08:02 UT
List Replies:Date: Wed, 7 Apr 2004 09:18:03 -0400 (GMT-04:00)
List Replies:Date:         Thu, 22 Jul 2004 16:13:27 +0400
   (NB: Yes, all those spaces)

Notice that in each case above the year directly precedes the time
(HH:MM:SS, or just HH:MM in the fourth sample).  That's why I anchored
on that in my approach.  Even with two-digit years, that is highly
likely to hold.  And if not, we just end up with a few messages whose
date is unparseable, and we file them accordingly.
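
To make the anchoring idea concrete, here is a minimal sketch in
Python. It is not the script referred to above; the names DATE_RE and
parse_date_header are hypothetical, and the two-digit-year pivot
(00-49 means 20xx, 50-99 means 19xx) is an assumption borrowed from
RFC 2822. It ignores the weekday and the timezone and simply returns
None when the "day month year time" pattern is not found, so the
caller can file the message as unparseable.

```python
import re

MONTHS = {name: num for num, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Anchor: day, month name, year (4 or 2 digits), then HH:MM with optional :SS.
DATE_RE = re.compile(
    r"(\d{1,2})\s+([A-Za-z]{3})\s+(\d{4}|\d{2})\s+"
    r"(\d{1,2}):(\d{2})(?::(\d{2}))?")

def parse_date_header(line):
    """Return (year, month, day, hour, minute, second) or None."""
    m = DATE_RE.search(line)
    if m is None:
        return None
    day, mon, year, hh, mm, ss = m.groups()
    month = MONTHS.get(mon.capitalize())
    if month is None:
        return None
    year = int(year)
    if year < 100:
        # Assumed pivot for two-digit years: 00-49 -> 20xx, 50-99 -> 19xx.
        year += 2000 if year < 50 else 1900
    # A missing seconds field (HH:MM only) is treated as :00.
    return (year, month, int(day), int(hh), int(mm), int(ss or 0))
```

For example, parse_date_header("Date: Fri, 19 Mar 2004 05:50 -0800")
yields (2004, 3, 19, 5, 50, 0), and a header with no recognizable
year-plus-time yields None.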

-- 
dman

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail