Is MHonARC for me?

2000-10-30 23:43:01
Hi all,

I am trying to build a private (my use only!) archive of all the
emails I have kept over the last few years (going back to 1996).
These were from a number of accounts on a number of systems,
mostly stored in PINE/IMAP folders in UNIX systems.

I recently backed all these up onto multiple CDs for the time being.

Now I want to create some type of archive or index, which should allow
me to find any old email in reasonable time.

My requirements:
        a) The system should be able to filter out numerous duplicate
           emails. I know I could use formail -D for this. Does MHonARC
           have a native detector for duplicate emails? It would be nice
           if it could detect even duplicates that differ in their
           Message-ID (e.g. remails).

        b) The main requirement is that the archive INDEX is easily
           accessible, so I guess it will have to reside on my server's HD.
           It should also be possible to store the index on CD-R (or RW)
           for mobility and backup purposes.

           Minimal indexing requirements:
                From, To, CC, Bcc, Sender and their X-equivs (real names and
                Message-ID, References
                Attachment filenames

                Some form of free-text index for the body would be nice, but
                I guess it would
                        a) create a humongous amount of data
                        b) be difficult to implement w/o also indexing
                           overly common words (be, to, etc.)
                        c) require some powerful search engine to
                           provide a useful interface.

                If MHonARC provides this, or there is some other tool
                which could be integrated into MHonARC with reasonable
                effort, I would be interested in this.

        c) The archive (emails) itself can be stored on multiple CDs
           (CD-R or CD-RW). If the mails could be stored in a compressed
           format, this would be OK with me too.

           It would be nice if the system could split the archive
           automatically by date (e.g. year/month, so they can be put
           on CD separately). Mails that were duplicated in different
           periods might need a link?

        d) Multi-system (Unix, Windows) access to the archive is a must. This
           is why I think MHonARC might be the right tool, as HTML can be
           read by both systems.

        e) The archive should be extensible, so that I can pipe new mails into
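
To make (a) and (c) a bit more concrete, here is a rough Python sketch of
the behaviour I have in mind. It is only an illustration, not anything
MHonARC is known to provide. The function names (fingerprint, period,
split_unique) are made up, and treating two mails as duplicates when
From, Date, Subject and body match is purely an assumption about what
"duplicate" should mean when the Message-ID differs.

```python
import hashlib
from email import message_from_string
from email.utils import parsedate_to_datetime


def body_bytes(msg):
    """Concatenate the decoded payload of every non-multipart part."""
    chunks = []
    for part in msg.walk():
        if not part.is_multipart():
            chunks.append(part.get_payload(decode=True) or b"")
    return b"\n".join(chunks)


def fingerprint(msg):
    """Hash sender, date, subject and body; the Message-ID is ignored on purpose."""
    parts = [
        str(msg.get("From", "")).strip().lower().encode("utf-8", "replace"),
        str(msg.get("Date", "")).strip().encode("utf-8", "replace"),
        str(msg.get("Subject", "")).strip().encode("utf-8", "replace"),
        # Collapse whitespace so re-wrapped copies still compare equal.
        b" ".join(body_bytes(msg).split()),
    ]
    return hashlib.sha1(b"\0".join(parts)).hexdigest()


def period(msg):
    """Return 'YYYY/MM' from the Date header, or 'undated' if it cannot be parsed."""
    try:
        when = parsedate_to_datetime(str(msg.get("Date", "")))
    except (TypeError, ValueError):
        return "undated"
    return f"{when.year:04d}/{when.month:02d}"


def split_unique(messages):
    """Group messages by period, keeping only the first copy of each fingerprint."""
    seen = set()
    buckets = {}
    for msg in messages:
        key = fingerprint(msg)
        if key in seen:
            continue
        seen.add(key)
        buckets.setdefault(period(msg), []).append(msg)
    return buckets


# Two copies of the same mail that differ only in their Message-ID.
RAW = (
    "From: a@example.org\n"
    "Date: Mon, 30 Oct 2000 23:43:01 +0800\n"
    "Subject: test\n"
    "Message-ID: <{mid}>\n"
    "\n"
    "Hello   world\n"
)

if __name__ == "__main__":
    demo = [message_from_string(RAW.format(mid=m)) for m in ("1@x", "2@x")]
    for when, msgs in split_unique(demo).items():
        print(when, len(msgs))
```

Each per-period bucket could then be written out and archived separately,
one directory per year/month, which would fit the one-CD-per-period idea.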

Is there maybe a better tool than MHonARC to do this?

Any advice on the above is appreciated.

Mathias Körber                                  mathias(_at_)koerber(_dot_)org
Eifersucht ist eine Leidenschaft, die mit Eifer sucht, was Leiden schafft 

