On Sat, 11 Oct 2003, Lukreme wrote:
> I have a bunch of HTML files that I want to compile into a mbox file,
> providing a set of static headers, but extracting the Name and Date
> from the <title> tag of the file.
> All the Title tags are in the form of <title>A name/dd-mmm-yy</title>
> and I would like to feed these files to formail and have it create a
> set of headers something like this:

I don't think you need formail for this. A simple shell loop should
be sufficient (use of fgrep assumes each title is on a line by itself):
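# fgrep prints "file.html:<title>A name/dd-mmm-yy</title>"; splitting on
# '<', '>' and '/' yields "file.html:", "title", "A name", "dd-mmm-yy".
fgrep '<title>' *.html /dev/null |
while IFS='<>/' read -r filename tag subject date rest
do
    filename=${filename%:*}     # drop the trailing colon added by fgrep
    dd=${date%%-*}              # day
    mmm=${date#*-}
    mmm=${mmm%-*}               # month abbreviation
    yy=${date##*-}              # two-digit year
    if [ "$yy" -lt 70 ]; then yyyy=20$yy; else yyyy=19$yy; fi
    # The blank line before EOF separates the headers from the body.
    cat - "$filename" <<EOF
From staticaddress@domain.com $mmm $dd 00:00:01 $yyyy
To: someaddress@domain.com
Subject: $subject
From: staticaddress@domain.com
Date: $dd $mmm $yyyy 00:00:01 +0000
Content-Type: text/html
Mime-Version: 1.0
Status: RO

EOF
    echo                        # blank line ends the message in the mbox
done > yourmboxfile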
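To see what the read in that loop actually receives, here is a small sketch that feeds it one sample line. The helper name split_title and the sample file name are made up for illustration; the splitting itself is the same IFS='<>/' trick used in the loop.

```shell
# split_title LINE: hypothetical helper that shows the fields read sees
# when one line of fgrep output is split on '<', '>' and '/'.
split_title() {
    printf '%s\n' "$1" | {
        IFS='<>/' read -r filename tag subject date rest
        printf 'file=%s subject=%s date=%s\n' \
            "${filename%:*}" "$subject" "$date"
    }
}

split_title 'index.html:<title>A name/11-Oct-03</title>'
# prints: file=index.html subject=A name date=11-Oct-03
```

Because IFS contains no whitespace here, the space inside "A name" survives in $subject.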
Note that to really be correct you need to compute the day of the week
somehow, and insert that into the From_ and Date: lines:
From staticaddress@domain.com $DOW $mmm $dd 00:00:01 $yyyy
...
Date: $DOW, $dd $mmm $yyyy 00:00:01 +0000
...
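One possible way to get $DOW without any external tools is plain shell arithmetic. The following is only a sketch: the function name dow is made up, it uses Sakamoto's day-of-week formula rather than anything procmail or formail provides, and it assumes the month arrives as a capitalized English abbreviation (Jan, Feb, ...) and the year as four digits.

```shell
# dow DD MMM YYYY: hypothetical helper, prints the three-letter weekday.
dow() {
    d=${1#0}            # strip a leading zero so 08 is not taken as octal
    y=$3
    # m is the month number, t the per-month offset from Sakamoto's table
    case $2 in
        Jan) m=1  t=0 ;; Feb) m=2  t=3 ;; Mar) m=3  t=2 ;; Apr) m=4  t=5 ;;
        May) m=5  t=0 ;; Jun) m=6  t=3 ;; Jul) m=7  t=5 ;; Aug) m=8  t=1 ;;
        Sep) m=9  t=4 ;; Oct) m=10 t=6 ;; Nov) m=11 t=2 ;; Dec) m=12 t=4 ;;
        *) return 1 ;;  # unknown month abbreviation
    esac
    # January and February count as part of the previous year
    [ "$m" -lt 3 ] && y=$((y - 1))
    # 0 = Sunday ... 6 = Saturday; shift that many names off the list
    set -- Sun Mon Tue Wed Thu Fri Sat
    shift $(( (y + y/4 - y/100 + y/400 + t + d) % 7 ))
    echo "$1"
}

dow 11 Oct 2003
# prints: Sat
```

Inside the loop this would be called as DOW=$(dow "$dd" "$mmm" "$yyyy") before the cat. On a system with GNU date, date -d can do the same job, but that option is not portable.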
If you don't have a POSIX shell that can handle dd=${date%%-*} and so on,
you can play games with IFS like so:
ifs="$IFS"
IFS=-
set $date
IFS="$ifs"
dd=$1
mmm=$2
yy=$3
(And similarly with IFS=: to trim the trailing colon off $filename.)
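For the filename case, the same IFS game might look like the sketch below. The helper name strip_colon is hypothetical. Unlike ${filename%:*}, it cuts at the first colon rather than the last, so it assumes the file names themselves contain no colon.

```shell
# strip_colon NAME: hypothetical helper, prints NAME up to its first colon.
strip_colon() {
    ifs="$IFS"
    IFS=:
    set -f              # keep * and ? in the name from being globbed
    set x $1            # "x" guards against a name that starts with "-"
    set +f
    IFS="$ifs"
    echo "$2"           # $1 is the "x" guard, $2 the part before the colon
}

strip_colon 'page.html:'
# prints: page.html
```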
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail