Re: Need to reindex

1996-05-10 03:45:25
Ben Combee said:
I just had a drive space crash last week, where I lost the maillist.html,
threads.html, and .mhonarc.db file for one of my archives.  I was not
saving the messages separately (at least not automatically), so while
I still have all the msg*.html files, I don't know how to recreate
the database so MHonarc will be able to pick them up again.

Any easy suggestions?  I am using MHonarc 1.11... I've not upgraded
because I didn't need the MIME support and wasn't sure how easy the
upgrade would be.

It's hard but not unmanageable (if you haven't customized some information
away).  Every converted message starts with HTML comments '<!-- ... --->'
that give you From, Date, Subject, Message-Id and Content-Type, so you
could reconstruct the information from there. You have to convert the
Date into seconds since the epoch; together with the msg number, that lets
you construct the index of the hashes stored in .mhonarc.db (Note: mhonarc
normally uses the Received header, so you may run into some problems due to
undefined time zones).
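
A rough sketch of that first step, in Python rather than Perl. It assumes
the comments look like <!--X-Name: value -->, which you should check
against your own msg*.html files; the function names are made up for this
example and are not part of mhonarc.

```python
import re
from email.utils import parsedate_tz, mktime_tz

# Assumed layout: <!--X-Name: value -->. Adjust to match your files.
# The closing part tolerates extra dashes ('--->').
COMMENT_RE = re.compile(r"<!--X-([A-Za-z-]+):\s*(.*?)\s*-{2,}>")


def read_header_comments(html_text):
    """Collect the X-* comment fields of one converted message page.

    Field names are lowercased; the first occurrence of a field wins.
    """
    fields = {}
    for name, value in COMMENT_RE.findall(html_text):
        fields.setdefault(name.lower(), value)
    return fields


def date_to_epoch(date_value):
    """Convert a mail Date string to seconds since the epoch.

    Returns None if the string cannot be parsed. A date that carries
    no time zone is interpreted in the local zone of the machine,
    which is where the time zone trouble can come from.
    """
    parsed = parsedate_tz(date_value)
    if parsed is None:
        return None
    return mktime_tz(parsed)
```

Run read_header_comments() over each msg*.html, then date_to_epoch() on
the date field, and you have the raw material for the index.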

The more complicated task is threading. You have to extract
the References and In-Reply-To headers. If you haven't customized them
away, they are available as lists at the end.
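
Once the references are extracted, the follow-up lists can be derived from
them. A minimal sketch in Python, assuming you have already collected a
mapping from each message id to the list of ids it refers to (how that is
stored in .mhonarc.db is not shown here; build_followups is a hypothetical
helper):

```python
from collections import defaultdict


def build_followups(references_by_id):
    """Map each message id to the ids of the messages that refer to it.

    references_by_id: dict of message id -> list of referenced ids.
    References to messages that are not in the archive are skipped.
    """
    followups = defaultdict(list)
    for msg_id, refs in references_by_id.items():
        for ref in refs:
            if ref in references_by_id and ref != msg_id:
                followups[ref].append(msg_id)
    return dict(followups)
```

For example, if "b" refers to "a" and "c" refers to both "a" and "b",
the result lists "b" and "c" as follow-ups of "a", and "c" as a follow-up
of "b".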

Make a new .mhonarc.db with two or three messages and have a look at
what mhonarc expects in .mhonarc.db. To get the format right use the
routines in

If you manage to reconstruct the .mhonarc.db, I think it would be a
good idea to post your script here. It may save other mailing list
admins some trouble in the future.

Hope this helps a bit,
P.S. Earl: How about adding (as default) the list of references at the
     beginning, as <!--X-References: ref1 ... ... --> and
     <!--X-Followups: ... -->?  A backup is of course
     better, but it may also prove useful if one wants to merge a single
     already HTML-converted mail into an archive or to construct a
     master index out of several indexes.

P.P.S. After a look at the code I think that the hash keys used can
       be replaced by the message-id. The 'unix-seconds'+'msg-number'
       is harder to handle, and the code already relies in some other
       places on the message id being unique.
