procmail

Re: Sorting an already existing mbox file, is it possible?

2005-10-07 21:14:12
from "Ruud H.G. van Tol" <rvtol(_at_)isolution(_dot_)nl>

My test84.rc has two lines:
----------------------------
  DEFAULT = '/dev/null'
  LOG     = "$$ "
----------------------------

Run it on your mbox:

  $ formail -s procmail -m test84.rc < mbox094elv


That should show a line with several PIDs, one PID per message in the
mbox.


If not, browse the mbox-file with 'less'. There should be an empty line
before each message header (before the "From " line). If not, see in
`man formail` the options -Y, -e, -d.

If the mbox-file was changed with a text-editor, maybe even on a
Windows-system, then the format of the mbox-file can be corrupted. That
is often easy to repair.
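
The two checks described above (an empty line before each "From " line, and
no DOS line endings) can be sketched as a small shell function. This is only
an illustration: the name check_mbox is hypothetical, and the function reports
on the file without repairing anything.

```shell
# Hypothetical helper: report mbox problems that can stop a splitter
# from finding message boundaries. Takes the mbox path as its argument.
check_mbox() {
    awk '
        /\r$/ { crlf++ }                         # line ends in CR LF (DOS style)
        /^From / {
            seps++                               # candidate message separator
            if (NR > 1 && prev != "") noblank++  # no empty line before it
        }
        { prev = $0 }
        END {
            printf "From_ lines: %d\n", seps
            printf "From_ lines without a preceding blank line: %d\n", noblank
            printf "lines ending in CR: %d\n", crlf
        }
    ' "$1"
}
```

Usage would be `check_mbox mbox094elv`. Body lines that happen to begin with
"From " are counted too, so the first number is an upper bound on the number
of messages.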

I tried your suggestion and got one PID in the logfile. I then tried it on a
different file, the result of a POP3 download to mbox using getmail. That
produced many PIDs, one for each message, using

formail -s procmail -m recipe1.rc < bellsouth1

where bellsouth1 is that other mbox file.  I got only one PID using only

procmail -m recipe1.rc < bellsouth1
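
The difference is expected: procmail treats everything on its standard input
as a single message, and it is `formail -s` that splits the mbox and starts
one program per message. A rough sketch of that splitting idea follows. The
name split_and_run is hypothetical, and the logic is deliberately naive: it
splits on any line beginning with "From ", which is an assumption of this
sketch and not how formail decides.

```shell
# Hypothetical, simplified stand-in for the idea behind "formail -s":
# read an mbox on stdin and run the given command once per message.
# usage: split_and_run 'command' < mbox
split_and_run() {
    awk -v cmd="$1" '
        /^From / && NR > 1 { close(cmd) }  # finish previous message: command sees EOF
        { print | cmd }                    # feed the current message to the command
    '
}
```

For example, `split_and_run 'wc -l' < bellsouth1` would print one line count
per message, while `wc -l < bellsouth1` alone prints a single total, much as
procmail alone logs a single PID.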

mbox094elv originated from a POP3 download using UKA_PPP 1.7x2 in DOS
(DR-DOS 7.03), which downloaded one message per file: M0001.MSG, M0002.MSG,
M0003.MSG, etc.  After removing spam and separating out HTML newsletters,
I concatenated the rest (in DR-DOS 7.03) with

for %D in (M????.MSG) do cat %D >> mbox094.mes

This resulted in a file of concatenated email messages with no blank line at
the end of each message.  Each line ended in ASCII 10 (Unix style), not
ASCII 13, 10 (DOS style).  With elvis (a text editor, an enhanced open-source
vi clone), I could quickly put a blank line before every line beginning with
"From " except the first.  I think the command within elvis (running in
Linux) was

:5,$g/^From /s/^From /\nFrom /
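
A non-interactive way to get the same effect can be sketched with awk. The
function name add_blank_before_from is hypothetical. Unlike the editor command
above, this version inserts a blank line only where one is missing, so running
it twice does no harm.

```shell
# Hypothetical helper: print the mbox with an empty line inserted before
# every "From " line that is not the first line and not already preceded
# by an empty line.
add_blank_before_from() {
    awk '
        NR > 1 && /^From / && prev != "" { print "" }
        { print; prev = $0 }
    ' "$1"
}
```

Usage would be `add_blank_before_from mbox094.mes > mbox094elv`. Like the
editor command, it cannot tell a message separator from a body line that
begins with "From ".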

and suddenly I had an mbox file.  However, there was no date on the "From "
lines: each was just "From tmueller(_at_)localhost", and the first message
began "From x_news(_at_)localhost".  The first message was not really e-mail
but an email-style message from the NNTP program that is also part of
UKA_PPP 1.7x2.  formail and procmail together recognized only the first
message and none of the others in that file, which was mbox094elv.  So I
tried adding a date, which didn't have to be the actual date, using

sed s/From\ tmueller(_at_)localhost/From\ tmueller(_at_)localhost\ Fri\ Sep\ \ 9\ 2005\ 21:34:51\ 2005/ mbox094elv > mbox094elvd

and then formail -s procmail -m recipe1.rc < mbox094elvd

sorted the messages properly, but

procmail -m recipe1.rc < mbox094elvd

still saw the file as one message.
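
The date fix can be generalized so that it does not depend on one particular
address. The sketch below appends a timestamp to every "From " line that
carries only an address. The name add_from_date is hypothetical, and the
example timestamp uses the conventional asctime layout (weekday, month, day,
time, year), which is an assumption of this sketch and differs slightly from
the string in the sed command above.

```shell
# Hypothetical helper: append a timestamp to "From <address>" lines that
# have nothing after the address. Reads the mbox on stdin.
# usage: add_from_date 'Fri Sep  9 21:34:51 2005' < in > out
add_from_date() {
    awk -v stamp="$1" '
        /^From [^ ]+$/ { print $0 " " stamp; next }  # separator with address only
        { print }                                    # everything else unchanged
    '
}
```

Usage would be
`add_from_date 'Fri Sep  9 21:34:51 2005' < mbox094elv > mbox094elvd`.
A body line consisting of "From " plus a single word would also be stamped,
so the result is worth a quick look in a pager.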

I used various command-line switches on formail, but still formail and
procmail saw mbox094elv as one message, while nail
(http://nail.sourceforge.net/), using

nail -f mbox094elv

correctly saw 61 messages.

I don't know whether to call this behavior a bug in procmail or whether it
was designed that way.

Tom

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
