Background:
1. Outlook 2000 mail client
2. IMAP mail delivery (dovecot)
3. Outlook mail rule enabled that saves all outgoing messages
into a Sent-mail folder on the IMAP server (in mbox format)
4. A script that runs once a day, which trims mailboxes that
exceed $maxmsgs back to $minmsgs, and archives the excess
into archive/<mailbox>
5. Formail version: v3.22 2001/09/10.
The archiving step looks like this, in csh syntax:
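    # count the messages in mailbox $f (one From_ line per message)
    set nmsgs = `formail -X 'From ' -s < $f | wc -l`
    if ($nmsgs > $maxmsgs) then
        set arc = archive/${f}
        if (-e $arc) then
            echo "" >> $arc
        endif
        # number of oldest messages to move into the archive
        @ cnt = $nmsgs - $minmsgs
        # append the first $cnt messages to the archive
        formail -$cnt -s < $f >> $arc
        # keep everything after the first $cnt messages
        formail +$cnt -s < $f > ${f}.tmp
        touch -r ${f} ${f}.tmp
        mv -f ${f}.tmp ${f}
        chmod 600 ${f} $arc
    endif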
Problem: both the archived mailbox and the truncated mailbox end up with
extraneous '>' characters in front of valid From_ lines. This happens
because Outlook, in its wisdom, doesn't add an empty line after the end
of the messages that it writes into Sent-mail.
Thus, the Sent-mail file might have some lines that look like this
(note that From_ has been changed to Zrom so that it won't be escaped and
thus compound the confusion; read Zrom as From_):
Zrom gary(_at_)example(_dot_)com Sun May 28 13:44:07 2006
To: "Fred" <fred(_at_)example(_dot_)com>
Subject: bbq
Date: Sun, 28 May 2006 13:44:07 -0700
Message-ID: <002601c68297$76e305a0$6401a8c0(_at_)EXAMPLE>
Importance: Normal
X-OlkEid: 360420A2F0AD4B7D0B6C3B4C8957146374797E9A
X-UID: 23065
Status:
X-Keywords:
Content-Length: 42

You bring the beer, we'll bring the brots?
Zrom gary(_at_)example(_dot_)com Sun May 28 13:44:07 2006
To: "George" <george(_at_)example(_dot_)com>
Subject: bbq
Date: Sun, 28 May 2006 13:44:07 -0700
Message-ID: <002601c68297$76e305a0$6401a8c0(_at_)EXAMPLE>
MIME-Version: 1.0
X-UID: 23065
X-OlkEid: 362420A2CC21FFA10546584288B5DBBCCBE739C1
Status:
X-Keywords:
Content-Length: 15

Tomorrow at 1pm
<End of File>
As it turns out, "formail -s" places a '>' in front of the second From_,
apparently because there is no intervening empty line to terminate the
first message body. However, the Content-Length of 42 on the first message
appears to be correct, in that it counts the 42 characters of the message
body (not including the final newline).
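The count is easy to double-check. Here is a minimal Python sketch, with the two body lines copied from the sample above; body_length is just an illustrative helper name:

```python
# Body text copied from the two sample messages, without the trailing newline.
first_body = b"You bring the beer, we'll bring the brots?"
second_body = b"Tomorrow at 1pm"


def body_length(body: bytes, with_newline: bool = False) -> int:
    """Return the byte count of a body, optionally counting a final newline."""
    return len(body) + (1 if with_newline else 0)


print(body_length(first_body))        # compare with Content-Length: 42
print(body_length(first_body, True))  # the value if the final newline were counted
print(body_length(second_body))       # compare with Content-Length: 15
```

Both headers match the body text only when the final newline is left out of the count.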
-------------
A few questions:
1. Is the initial message RFC compliant? Is the Content-Length correct in
not counting the final newline of the message body? Should the message
body always be terminated with an empty line (which it is not in this
example)?
2. If the message in 1. above is in fact valid, then shouldn't formail have
honored the Content-Length field, and noticed that the second From_ is the
start of a new message?
The formail man page says the following:
If a Content-Length: field is found in a header, formail will
copy the number of specified bytes in the body verbatim before
resuming the regular scanning for message boundaries (except when
splitting digests or Berkeley mailbox format is assumed).
(note: If you change the Zrom's above into From's, and run the result
through "formail -s", you should be able to duplicate the scenario described
above.)
Given that the behavior of Outlook is immutable, what's the best corrective
course of action? Note that -ds works no better than -s in this example,
even though it should ignore the Content-Length field and be able to find
enough header lines to convince itself that a new message has started.
When the first message is terminated with an empty line, however, all is
well.
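One workaround I can imagine is to pre-process the mailbox before it reaches formail, so that every message body is followed by an empty line. What follows is only a sketch, in Python rather than csh, and not tested against real Outlook or dovecot mailboxes. The names normalize_mbox and _boundary_after are made up for illustration. It assumes LF line endings, a file that starts with a From_ line, and an empty line between headers and body in every message. It trusts Content-Length only when the byte count lands on a From_ line or on end of file; otherwise it falls back to scanning for the next line that starts with "From ".

```python
import re

# Matches a Content-Length header line and captures its numeric value.
CL_RE = re.compile(rb"^Content-Length:[ \t]*(\d+)[ \t]*$",
                   re.IGNORECASE | re.MULTILINE)


def _boundary_after(data: bytes, k: int) -> int:
    """Skip newlines starting at k. Return the index of a From_ line that
    begins there, or len(data) at end of file, or -1 if neither is found."""
    while k < len(data) and data[k:k + 1] == b"\n":
        k += 1
    if k == len(data):
        return k
    if data.startswith(b"From ", k) and (k == 0 or data[k - 1:k] == b"\n"):
        return k
    return -1


def normalize_mbox(data: bytes) -> bytes:
    """Return a copy of data in which each body is followed by an empty line."""
    out = []
    pos, n = 0, len(data)
    while pos < n:
        hdr_end = data.find(b"\n\n", pos)
        if hdr_end == -1:
            # No header/body separator left: copy the remainder untouched.
            out.append(data[pos:])
            break
        body_start = hdr_end + 2
        headers = data[pos:body_start]

        next_start = -1
        m = CL_RE.search(headers)
        if m:
            cand = body_start + int(m.group(1))
            if cand <= n:
                next_start = _boundary_after(data, cand)

        if next_start != -1:
            # Content-Length looks trustworthy: keep the body byte for byte.
            body = data[body_start:cand]
            if body and not body.endswith(b"\n"):
                body += b"\n"
            body += b"\n"
        else:
            # Fall back to the next line that starts with "From ".
            nl = data.find(b"\nFrom ", body_start - 1)
            next_start = n if nl == -1 else nl + 1
            body = data[body_start:next_start].rstrip(b"\n") + b"\n\n"

        out.append(headers)
        out.append(body)
        pos = next_start
    return b"".join(out)
```

In the archiving script this would run on $f before either formail call, for example by reading the file in binary mode, passing the bytes through normalize_mbox, and writing the result to a temporary file. The fallback path would wrongly split a body that contains an unescaped line starting with "From ", so it is only as safe as the mailbox's escaping.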