Claudiu Bosioc <cbosioc(_at_)uem(_dot_)utt(_dot_)ro> writes:
> I'm developing a webmail application and I am using formail to parse
> and extract emails from the folders.
> Let's say that I have 21 emails in the folder inbox.
> I'm using this command to extract the From: headers from the inbox:
>     $ cat inbox | formail -ds formail -x From:
> and it outputs 21 lines.
Actually, you can save a bunch of time by doing all the work with one
formail process rather than starting a new formail for each message.
It can perform header munging at the same time as mailbox splitting:
    formail -d -xFrom: -s <inbox
I assume you need the -d flag to deal with the mailbox format used
by webmail, no?
> I also need to extract the subject lines, and I would do it like this:
>     $ cat inbox | formail -ds formail -x Subject:
> Doing this, the output is only 16 lines long because some emails don't
> have a subject.
> I need formail to output a blank line instead of nothing, so the output
> would be 21 lines, same as above.
Okay, just tell formail to add such a header if it doesn't already exist
(that's what the -a flag does), and then do the extract:
    formail -d -aSubject: -xSubject: -s <inbox
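
A quick way to check that the two extractions stay in step is to run both against a small test mailbox and compare line counts. The sketch below is only an illustration: the function name make_sample_inbox and the three messages in it are made up, and it assumes formail is on your PATH (it skips the comparison otherwise).

```shell
# Made-up sample mailbox: three messages, the second has no Subject: header.
make_sample_inbox() {
    cat <<'EOF'
From alice@example.com Mon Jan  1 10:00:00 2001
From: alice@example.com
Subject: first

body one

From bob@example.com Mon Jan  1 11:00:00 2001
From: bob@example.com

body two

From carol@example.com Mon Jan  1 12:00:00 2001
From: carol@example.com
Subject: third

body three
EOF
}

if command -v formail >/dev/null 2>&1; then
    # One output line per message for From:.
    froms=$(make_sample_inbox | formail -d -xFrom: -s | wc -l | tr -d ' ')
    # -aSubject: adds an empty Subject: where none exists, so the
    # extraction still emits one (possibly blank) line per message.
    subjects=$(make_sample_inbox | formail -d -aSubject: -xSubject: -s | wc -l | tr -d ' ')
    echo "From lines: $froms, Subject lines: $subjects"
else
    echo "formail not found; skipping the comparison"
fi
```

Dropping -aSubject: from the second command should make the Subject count come up one short on this sample, which is the mismatch described in the question.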
Now for two tricky points you may need to watch:
1) You probably want to use the -c flag to have formail join continued
   header lines into one. There's no way to do this yourself afterwards
   because the -x output doesn't carry enough information to identify a
   continued header.
2) You probably also want to use either the -U or the -u flag (keep only
   the last or only the first occurrence, respectively) to prevent
   problems with messages that have multiple Subject: header fields.
   Yes, these messages are broken (well, the meaning of the duplicate
   is not defined), but you shouldn't let them break your software.
So:
    formail -dc -aSubject: -USubject: -xSubject: -s <inbox
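
To see both problems at once, here is a small sketch that feeds formail a single deliberately broken message: it has two Subject: fields, and the last one is folded across two lines. The function name make_broken_message and the message itself are invented for the test, and the block assumes formail is installed (it does nothing useful otherwise).

```shell
# Made-up message with a duplicate Subject: whose last occurrence is
# folded (continued) onto a second line.
make_broken_message() {
    cat <<'EOF'
From dave@example.com Mon Jan  1 13:00:00 2001
From: dave@example.com
Subject: a stray duplicate
Subject: a long subject that the sender's mailer
 folded onto a second line

body
EOF
}

if command -v formail >/dev/null 2>&1; then
    echo "without -c and -U:"
    make_broken_message | formail -d -aSubject: -xSubject: -s
    echo "with -c and -U:"
    # -c joins the folded field into one line; -USubject: keeps only the
    # last Subject:, so exactly one line comes out for this message.
    make_broken_message | formail -dc -aSubject: -USubject: -xSubject: -s
else
    echo "formail not found; skipping the demonstration"
fi
```

Swapping -USubject: for -uSubject: should keep the first Subject: field instead of the last; which one you want is a judgment call, since the duplicate has no defined meaning.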
Make sense?
Philip Guenther