procmail

Re: Fixing mail format to RFC 822

1999-09-30 04:13:34
I have a simple question. How can I convert a mailbox from some unknown
format to RFC 822 style? Even better, how can I determine which mail format
it is in?

The file is at http://www.natur.cuni.cz/~mmokrejs/krb4.txt

/bin/Mail -f krb4.txt cannot read it.

The problem is that your initial 'From ' line is escaped with an
angle bracket ('>'), so mail readers do not see it as the start of a message.
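To check for this before changing anything, it helps to count both kinds of line. This is only a rough sketch: `count_from_lines` is a name made up here, and it matches the bare `From ` and `>From ` prefixes rather than the full date pattern, so it overcounts when message bodies contain such lines.

```shell
# count_from_lines FILE: hypothetical helper, not part of procmail or formail.
# Prints how many lines start with "From " and how many start with ">From ".
count_from_lines() {
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
    plain=$(grep -c '^From ' "$1" || true)
    escaped=$(grep -c '^>From ' "$1" || true)
    echo "plain=$plain escaped=$escaped"
}
```

Run it as `count_from_lines krb4.txt`. A result with no plain lines and a non-zero escaped count would be consistent with the problem described above.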

The simplest way is to do:

   perl -e 'while (<STDIN>) {
       s/^>(From\s+\S+\s+\S+\s+\S+\s+\d+\s+\d+:\d+:\d+\s+\d+)/$1/;
       print $_;
   }' < krb4.txt > krb4.mbox

Of course, there could be a matching From line in the body of your message,
in which case you'd run into trouble.  This is slightly better, but not
much (and kind of icky).

   perl -e 'while (<STDIN>) { $mbox .= $_; }
       $mbox =~ s/(^|\n)>(From\s+\S+\s+\S+\s+\S+\s+\d+\s+\d+:\d+:\d+\s+\d+.*\n[\w-]+:\s+)/$1$2/g;
       print $mbox;' < krb4.txt > krb4.mbox

I couldn't think of a way to reliably do this with formail if there exists 
an escaped 'From ' line in the middle of the message (i.e. someone forwarding
an entire message with that line included, none of the other headers escaped).
But my perl "solution" would die on that too...
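The same "is the next line a header?" check can also be done one line at a time, without reading the whole mailbox into memory. Here is a rough sketch in awk rather than perl. `unescape_from` is a made-up name, and the check is looser than the perl version because it does not verify the date fields on the `>From ` line. It has the same weakness with a forwarded message that keeps its escaped 'From ' line and headers.

```shell
# unescape_from FILE: hypothetical helper. Removes the leading ">" from a
# ">From " line only when the following line looks like a header ("Name: value").
unescape_from() {
    awk '
        held {                                   # a ">From " line is waiting
            if ($0 ~ /^[A-Za-z0-9_-]+:[ \t]/)
                sub(/^>/, "", prev)              # next line is a header: unescape
            print prev
            held = 0
        }
        /^>From / { prev = $0; held = 1; next }  # hold it until the next line is seen
        { print }
        END { if (held) print prev }             # flush a trailing ">From " line
    ' "$1"
}
```

Usage would be `unescape_from krb4.txt > krb4.mbox`. Unlike the formail approach, this leaves the original 'From ' lines in place.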

Most of the MIME attachments were confusing the digest mode, so the only
way that I could reliably get the right number of messages was using 
a header count of 9 for splitting:

   perl -e 'while (<STDIN>) {
       print $_ if ! /^>From\s+\S+\s+\S+\s+\S+\s+\d+\s+\d+:\d+:\d+\s+\d+/;
   }' < krb4.txt | formail -m 9 -des > krb4.mbox

But again, that would fail in the instance mentioned above.  I also didn't
check message content for validity, and it also recreates all of the 'From '
lines so you lose the original envelope information.

I'm sure someone has a better, more generalized solution, but this should
get you going on this specific mailbox.

Chris
