First -- a tremendous amount of thanks for this tip. I've been
struggling to figure out how to get my old Emails out of these Outlook
archives, and this trick works great!
One question -- when you talk about HTML formatting being lost by
Thunderbird, what type of formatting do you mean? The only things that
don't seem to translate well are HTML image attachments -- like spliced
images -- but I'm mostly getting that in Email advertising which I don't
care about so much. The rest of the formatting (fonts, color, etc.) seems to
come across fine.
I'm using Thunderbird 0.5 (your message indicates 0.4) -- did it change
that much between those versions?
Just curious -- thanks.
Side question -- are there any practical limits to the number of
messages MHonArc can convert into an archive? I've been testing in
batches of 350, and it only takes a few minutes on my laptop. But if I
try to dump thousands of messages in there, will I hit some limit on what
can be processed? Thanks in advance.
Lyle Schofield <mailto:lyle(_dot_)schofield(_at_)daou(_dot_)com>
Tel: (301) 929-7624
[mailto:owner-mhonarc-users(_at_)mhonarc(_dot_)org] On Behalf Of Tim Gwinn
Sent: Friday, March 12, 2004 3:28 PM
To: MHonArc Users List
Subject: Creating a Mhonarc archive from MS Outlook emails
I wanted to share my tale of converting email from MS-Outlook into a
Mhonarc archive, in hopes that it may be useful to someone else. (As I
found out, being a Windows-based user can have its drawbacks.)
My situation was one of running a Listserv list hosted by Lsoft, and
after several months realizing that it would be nice to have
Google-searchable web-based archives. Mhonarc seemed like a great
product, but the months of emails I had acquired were in Outlook 2000,
which uses a proprietary format, not the "mbox" format desired by
Mhonarc. Likewise, Listserv archives are not in mbox format. (The
logfiles can be converted, apparently, but by default, the logfiles do
not have full header information, and on top of that, require one to
write their own conversion program.)
After some online searching and experimenting, I was unable to find any
non-commercial utility to convert Outlook email files directly into mbox
format. (They only seemed to exist for Outlook Express -> mbox, such as
I found that Mozilla Thunderbird (build 0.4) used an mbox format email
file, and it allowed one to directly import existing email from Outlook
or Outlook Express. Woot!
However, I was disappointed to find that while the imported email from
Outlook retained its full header information, it lost any of its HTML
formatting in the message bodies.
A little more experimenting showed me that if I imported email from
Outlook to Outlook Express, I could then import from Outlook Express to
Thunderbird and it would retain all the HTML formatting in the emails.
But, apparently in the Outlook to OE step, much of the header
information was lost, such that Mhonarc could perform no message
threading, aside from matching the Subject.
So, both import methods had something the other didn't.
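
One way to check which import path kept what is to survey each resulting mbox file. The following is only a rough sketch using Python's standard `mailbox` module. It assumes that threading depends on the Message-ID, In-Reply-To and References headers, and that an HTML body shows up as a text/html part. The function name `survey_mbox` is made up for illustration.

```python
import mailbox

# Headers commonly used for threading (an assumption; the original post
# does not say which headers were lost in the Outlook -> OE step).
THREAD_HEADERS = ("Message-ID", "In-Reply-To", "References")


def survey_mbox(path):
    """Count messages, threading headers, and HTML parts in an mbox file."""
    counts = {"messages": 0, "html": 0}
    counts.update({header: 0 for header in THREAD_HEADERS})

    # create=False: fail instead of silently creating an empty mailbox.
    box = mailbox.mbox(path, create=False)
    try:
        for msg in box:
            counts["messages"] += 1
            for header in THREAD_HEADERS:
                if msg[header] is not None:
                    counts[header] += 1
            # walk() visits the message itself and every MIME sub-part.
            if any(part.get_content_type() == "text/html" for part in msg.walk()):
                counts["html"] += 1
    finally:
        box.close()
    return counts
```

Running it on both mbox files and comparing the two dictionaries should show whether one file is short on threading headers and the other short on HTML parts.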
I then devised a nefarious kludge to combine the best of both worlds:
1) Import from Outlook 2000 into Thunderbird, using Thunderbird's
2) Import from Outlook 2000 to Outlook Express using OE's import
3) Import from OE to Thunderbird into another folder, using Thunderbird's
4) Create two separate Mhonarc archives using the two mbox files from
the two imports into Thunderbird.
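
For step 4, a small wrapper along these lines could build the two archives. It is a sketch, not the exact commands used: it assumes `mhonarc` can be run by that name (on Windows it may need to be invoked through Perl instead). The function names and any paths passed in are hypothetical. `-outdir` and `-add` are MHonArc's own options.

```python
import subprocess


def mhonarc_command(mbox_path, outdir, add=False):
    """Build the argument list for one MHonArc run (nothing is executed)."""
    cmd = ["mhonarc", "-outdir", str(outdir)]
    if add:
        # -add merges new messages into an existing archive.
        cmd.append("-add")
    cmd.append(str(mbox_path))
    return cmd


def build_archive(mbox_path, outdir, add=False):
    """Run MHonArc; assumes an executable named 'mhonarc' is on PATH."""
    subprocess.run(mhonarc_command(mbox_path, outdir, add), check=True)
```

Calling `build_archive` once per mbox file, each with its own output directory, would give the two separate archives.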
The OE archive would have all the HTML formatting in the msg files, but
the thread index files would not be correct. So, I simply copied the
date, thread and author index files from the Outlook2000 archive over
the ones in the OE archive. That resulted in archives that retained both
1) complete HTML formatting and 2) full threading information. Voilà!
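
The copy step could be scripted roughly as below. This is a sketch under assumptions. `maillist.html` and `threads.html` are MHonArc's default index names, while `author.html` is a hypothetical name for the author index. The function `copy_index_files` is invented for illustration. It also presumes both archives hold the same messages under the same message file names, since the copied indexes link to those files.

```python
import shutil
from pathlib import Path

# Assumed index file names; adjust to whatever the archives actually use.
INDEX_FILES = ("maillist.html", "threads.html", "author.html")


def copy_index_files(threaded_archive, html_archive, names=INDEX_FILES):
    """Copy index pages from the well-threaded archive over the HTML-rich one.

    Returns the list of file names that were actually copied.
    """
    src = Path(threaded_archive)
    dst = Path(html_archive)
    copied = []
    for name in names:
        source = src / name
        if source.is_file():  # skip index files that do not exist
            shutil.copy2(source, dst / name)  # overwrites the target file
            copied.append(name)
    return copied
```

Keeping a backup of the target archive before overwriting its index files would be a sensible precaution.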
Subsequent to that initial transition, I have been receiving all new
emails directly into Thunderbird, and '-add'ing those emails in batches
to the existing archives. (Again, being Windows-based and not having
direct access to mbox files on my hosted website, I have to do this on
my PC, and then use FrontPage to shuttle the changed files to my web
Perhaps someone will tell me there is a much easier way, but at least
this worked and might be useful to someone else.
My archives are at: http://www.panmere.com/rosen/mhout/maillist.html
Mozilla Thunderbird: http://www.mozilla.org/projects/thunderbird/