procmail

Does anybody have a corpus of malformed email messages fixable by formail?

2014-02-01 07:54:08
Hi,

I have a Python script (https://github.com/mcepl/gg_scraper) that needs to
read possibly malformed mbox messages. I use subprocess.Popen() and
/usr/bin/formail to clean them up into correct mbox messages (with a
correct leading From line etc.). I am now trying to run the tests for my
script on Travis-CI, where formail is not installed. I have since learned
that I can run apt-get install procmail in .travis.yml, but I still
started to wonder whether I could make my script pure Python.
I know that

    import email

    msg = email.message_from_string(original_msg)
    print(msg.as_string(unixfrom=True))

works as a poor man’s replacement for `formail -d`. Now I would like to
know how reliable a replacement it is. Does anybody have (or know of) a
corpus of poorly formatted messages that formail can fix, which I could
test against?
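
To make the question concrete, here is a minimal sketch of the pure-Python
approach applied to one message that lacks its leading From line. The helper
name `to_mbox_message` and the sample message are hypothetical, made up for
illustration only; they are not taken from gg_scraper.

```python
import email


def to_mbox_message(raw_msg: str) -> str:
    """Parse raw_msg and re-serialize it with a leading mbox 'From ' line.

    Hypothetical helper: if the parsed message carries no envelope line,
    the email package generates a placeholder one ("From nobody <date>").
    """
    msg = email.message_from_string(raw_msg)
    return msg.as_string(unixfrom=True)


# Hypothetical sample input: headers and body, but no "From " envelope line.
sample = (
    "Subject: test\n"
    "From: someone@example.com\n"
    "\n"
    "Body line\n"
)

fixed = to_mbox_message(sample)
print(fixed)
```

This only covers the simplest kind of damage (a missing envelope line). How
it behaves on messages that are broken in other ways is exactly what I would
like to check against formail's output.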

Thanks a lot for any reply,

Matěj
-- 
http://www.ceplovi.cz/matej/, Jabber: mcepl(_at_)ceplovi(_dot_)cz
GPG Finger: 89EF 4BC6 288A BF43 1BAB  25C3 E09F EF25 D964 84AC
 
I didn't attend the funeral, but I sent a nice letter saying
I approved of it.
      -- Mark Twain


____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)de
http://mailman.rwth-aachen.de/mailman/listinfo/procmail