I've been writing a small Perl script that
scans MHonArc archives and HTMLizes the message
bodies. I've been using Txt2Html as a library
to analyze the raw text and convert it to HTML.
The simplest messages are converted fine:
- paragraph recognition
- short line breaks
Sometimes the results are not convincing:
- quotations not always detected
- ordered lists get mixed up
Does anybody have experience with such
post-processing, or has anybody already tried to
preprocess mailboxes before invoking MHonArc?
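
To make the question concrete, here is a minimal
sketch of the kind of preprocessing I have in mind
for the quotation problem. It is only an
illustration: split_quoted_runs is a hypothetical
helper, not part of Txt2Html or MHonArc, and it
assumes that a quoted line is one that starts with
optional whitespace followed by ">".

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Txt2Html or MHonArc): split a message
# body into consecutive runs of quoted and unquoted lines, so that each
# run can be handled separately before the text-to-HTML step.
sub split_quoted_runs {
    my ($body) = @_;
    my @runs;
    for my $line (split /\n/, $body) {
        # Assumption: a quoted line starts with optional spaces, then ">".
        my $kind = $line =~ /^\s*>/ ? 'quote' : 'text';
        if (@runs && $runs[-1]{kind} eq $kind) {
            # Same kind as the previous line: extend the current run.
            push @{ $runs[-1]{lines} }, $line;
        } else {
            # Kind changed (or first line): start a new run.
            push @runs, { kind => $kind, lines => [$line] };
        }
    }
    return @runs;
}
```

Each returned run is a hash reference with a "kind"
("quote" or "text") and the list of its "lines", so
only the "text" runs would need to go through the
paragraph and list analysis.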