procmail

Re: extracting the date

2003-01-18 09:09:39
LuKreme <kremels(_at_)kreme(_dot_)com> wrote:

> Now that I have a better understanding of what I'm doing in procmail I
> decided to go back through one of my old archives and see if I could
> extract the messages into date folders
>
> :0
> * ^Date:......\/...........
> {
>     MYMATCH=`echo $MATCH |sed -e 's/ /-/g'`
>     MYFOLDER=Chat-l.$MYMATCH
>     LOG=$MYFOLDER$NL
>
>    :0
>     $MYMATCH
> }
>
> Yes, I know, this is not the right way. [. . .]

Bit of an understatement, yeah.  :-)  You're running echo and sed
in a pipe when just running date would be cheaper.  Anyway, your
algorithm is not robust.  I ran it on the 102 messages sitting in
my *good* folder (where one would expect the Date: header to
have some resemblance to RFC-conformity, heh); then ran a distribution
(via an alias in my shell) on the result:

 4:42pm [~/Mail] 590[0]> harness goodmail | grep MYF | distrib
  23 procmail: Assigning "MYFOLDER=Chat-l.15-Jan-2003"
  20 procmail: Assigning "MYFOLDER=Chat-l.16-Jan-2003"
  20 procmail: Assigning "MYFOLDER=Chat-l.17-Jan-2003"
   4 procmail: Assigning "MYFOLDER=Chat-l.18-Jan-2003"
   1 procmail: Assigning "MYFOLDER=Chat-l.n-2003-03:4"
   1 procmail: Assigning "MYFOLDER=Chat-l.n-2003-18:2"
   1 procmail: Assigning "MYFOLDER=Chat-l.n-2003-19:2"

So we have three bogus outliers in the "good" mail.  What will happen
if I run this on spam?  I don't even want to know.  :-)
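
The outliers look like what a fixed-offset match produces when the
Date: header has no day-of-week, which RFC 822 allows.  As a rough
sketch (not part of the original recipe), the shell function below
imitates the condition with sed: skip six characters after "Date:",
keep the next eleven, then swap spaces for dashes.  The function name
and both sample headers are made up for illustration.

```shell
#!/bin/sh
# Imitate "* ^Date:......\/...........": skip 6 characters after "Date:",
# keep the next 11, then turn spaces into dashes as the quoted recipe did.
fixed_offset_match() {
    printf '%s\n' "$1" |
        sed -n 's/^Date:.\{6\}\(.\{11\}\).*/\1/p' |
        sed 's/ /-/g'
}

# Header with a day-of-week: the offsets happen to line up.
fixed_offset_match 'Date: Sat, 18 Jan 2003 09:09:39 -0700'   # 18-Jan-2003

# Header without a day-of-week (hypothetical sample): everything shifts.
fixed_offset_match 'Date: 18 Jan 2003 03:42:11 -0700'        # n-2003-03:4
```

Any other variation in the header (extra whitespace, a missing comma)
would shift the window in the same way.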


But we don't need echo, sed, or even date, anyway.
First off, you'd almost certainly want to use the From_
header to grab this info rather than the Date: one.

Here's some stuff from my genvars INCLUDERC that will get
you what you need (and more).

 4:53pm [~/.procmail/vars] 610[0]> grep Fri genvars; tail -45 genvars
 AWEEKDAY      = (Mon|Tue|Wed|Thu|Fri|Sat|Sun)

 MMtable = "{ Jan:01 Feb:02 Mar:03 Apr:04 May:05 Jun:06
              Jul:07 Aug:08 Sep:09 Oct:10 Nov:11 Dec:12 }"


 :0  # 030110 () save tail-end of FROM_ in order to parse DATE
  * $  ^^From .* \/$AWEEKDAY .+
  { FROM_ = $MATCH
  
     :0  # 021211 () find year in FROM_
      * FROM_ ?? ()\/....^^
      { THISYEAR = $MATCH }

     :0  # 021211 () find YY
      * THISYEAR ?? ()\/..^^
      { YY = $MATCH }

     :0  # 021211 () find MONTH
      * FROM_ ?? () \/...
      { MONTH = $MATCH }

     :0  # 021211 () find MM
      * $  MMtable ?? ()\<$MONTH:\/[01][0-9]
      { MM = $MATCH }

     :0  # 021212 () find DD
      * $ FROM_ ?? ()\<$MONTH +\/[1-3]?[0-9]
      *   MATCH ?? ^^..
      { DD = $MATCH }

     :0 E  # 021211 () replace space as necessary
      { DD = 0$MATCH }

     DATE = $YY$MM$DD
  }


 :0  # 021230 () find next year (four digits; let's not worry about Y3K prob!)
  * $  $THISYEAR^0
  *            1^0
  { NEXTYEAR = $= }

 :0  # 021230 () find next year (two digits)
  * NEXTYEAR ?? ()\/..^^
  { NEXTYY = $MATCH }
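
To trace what the recipes above compute, here is a rough shell
walk-through of the same steps on made-up From_ lines.  It is an
illustration only: the point of the recipes is that procmail needs no
external programs, and this sketch splits the line on whitespace
instead of anchoring on the weekday as the recipes do.  The names
from_to_date and next_year and the sample address are hypothetical.

```shell
#!/bin/sh
# Assumed From_ shape:  From sender  Www Mmm dd hh:mm:ss yyyy
from_to_date() {
    set -f                 # no globbing while word-splitting
    set -- $1              # $4 = month name, $5 = day of month
    set +f
    month=$4
    day=$5
    for year in "$@"; do :; done   # last word; the recipe also reads the year from the end
    yy=${year#??}                  # drop the century, like the YY recipe

    # Month name to number, like the MMtable lookup
    case $month in
        Jan) mm=01 ;; Feb) mm=02 ;; Mar) mm=03 ;; Apr) mm=04 ;;
        May) mm=05 ;; Jun) mm=06 ;; Jul) mm=07 ;; Aug) mm=08 ;;
        Sep) mm=09 ;; Oct) mm=10 ;; Nov) mm=11 ;; Dec) mm=12 ;;
        *)   return 1 ;;
    esac

    # Zero-pad a one-digit day, like the ":0 E" recipe
    case $day in
        ?) dd=0$day ;;
        *) dd=$day ;;
    esac

    printf '%s%s%s\n' "$yy" "$mm" "$dd"
}

# The two scoring conditions add THISYEAR and 1; $= is the total.
next_year() {
    echo $(( $1 + 1 ))
}

from_to_date 'From someone@example.com  Sat Jan 18 09:09:39 2003'   # 030118
from_to_date 'From someone@example.com  Wed Jan  8 16:42:10 2003'   # 030108
next_year 2003                                                      # 2004
```

DATE comes out as YYMMDD, so a folder name such as Chat-l.$DATE
would sort chronologically.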

-- 
dman


_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
