mhonarc-users

Re: msgid instead of seq. number for output files

1998-07-31 05:11:51
Earl Hood wrote:
On July 30, 1998 at 14:25, Achim Bohnet wrote:

I've got a lot of messages in the archive that contain a
URL to another message in the archive (or in others here).  Regrouping,
removing spam, etc. would change the msg number used :-(   Msg-Ids are
also much easier to use for looking up messages across the (sub)archives
of a mailing list.

Yep.

Note, I am confused about some of what you said.  How would message
numbers get changed?  The only way this can happen is if you
recreate the archive from scratch.  And if this is what is done,
there should be no problems?  Can you elaborate?

What I want to do:

   o rebuild all the archives so they get a consistent layout.

   o remove spam

   o my procmail recipes were too lax at times, so I have to filter
     out some other messages

   o Some lists are so small that one archive per month is
     overkill, so I would like to group them into bigger chunks.

Everything besides the first item changes the number of a message
and therefore the URL.
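
To make the renumbering problem concrete, here is a small sketch. It assumes file names are assigned strictly in archive order following the msg$num.html pattern; the five-digit zero padding and the example message-ids are illustrative assumptions, not taken from the thread.

```python
def number_messages(msg_ids):
    """Assign sequential file names in archive order (msg$num.html style).

    The five-digit zero padding is an assumption made for this sketch.
    """
    return {mid: f"msg{num:05d}.html" for num, mid in enumerate(msg_ids)}


# Hypothetical archive before and after removing one spam message.
before = number_messages(["<a@x>", "<spam@y>", "<b@x>"])
after = number_messages(["<a@x>", "<b@x>"])  # rebuilt without the spam

# Messages whose file name (and therefore URL) changed in the rebuild.
moved = {mid for mid in after if before[mid] != after[mid]}
```

Every message that followed the removed one ends up in `moved`, so any link that pointed at its old file name now points at a different message or at nothing.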

But wait, I just realized that I jumped too fast into an implementation
issue.  What I would really like to have is

  o ARI (Archive Msg Identifier)
  o extension to mhonarc to use $w3archive/$ARI instead of
    $w3archive/my-substruct/{msg*.html,ext*.ext}
  o ARI -> filename  translation table

With this I can serve the archive via a CGI script that does

        $file = pathinfo2ARI;
        print HTMLheaderfor($file);
        print `cat $file`;

So maybe a

        mhonarc -ari ...

would use a 'virtual' URL and write an additional .mhonarc.ari
file.  Everything else could be done by external programs.  It
would be their responsibility to ensure that accessing the 'virtual'
URL really returns the right file.
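
A minimal sketch of the lookup such an external program might perform. The format of the .mhonarc.ari file is not specified in this thread; the sketch assumes one "ARI<TAB>filename" pair per line, and the function names `load_ari_table` and `resolve` are hypothetical.

```python
import os


def load_ari_table(path):
    """Read 'ARI<TAB>filename' lines into a dict (assumed file format)."""
    table = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.rstrip("\n")
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            ari, filename = line.split("\t", 1)
            table[ari] = filename
    return table


def resolve(table, archive_root, path_info):
    """Map a CGI PATH_INFO such as '/<ARI>' to a file under archive_root.

    Returns None when the ARI is unknown or the table entry would point
    outside the archive directory.
    """
    ari = path_info.lstrip("/")
    filename = table.get(ari)
    if filename is None:
        return None
    root = os.path.realpath(archive_root)
    full = os.path.realpath(os.path.join(root, filename))
    # Refuse table entries that would escape the archive directory.
    if os.path.commonpath([root, full]) != root:
        return None
    return full
```

A CGI wrapper would call `resolve()` with the request's PATH_INFO, send the HTML header, and then stream the file. Because only the table changes when an archive is regrouped, the 'virtual' URLs stay stable.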

Achim

So before I put it on top of my TODO list ...

Has anybody tried to patch mhonarc to use the msgid for the name of the
file instead of the sequential numbering?  So files

   $msgid.html
   $msgid.$part.$ext

instead of

   msg$num.html
   $ext$num.$ext

? 
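
For illustration only, one way a message-id could be turned into a file name that is safe on disk. This is a sketch, not anything MHonArc does: the percent-encoding, the SHA-1 fallback for overlong or empty ids, and the helper names are all assumptions.

```python
import hashlib
from urllib.parse import quote


def msgid_to_basename(msgid, max_len=100):
    """Derive a filesystem-safe base name from a Message-ID.

    Angle brackets are dropped and everything except letters, digits
    and '_.-~' is percent-encoded, so '/' and '@' cannot appear raw.
    Empty or overlong results fall back to a SHA-1 hex digest.
    """
    core = msgid.strip().strip("<>")
    safe = quote(core, safe="")
    if not safe or len(safe) > max_len:
        safe = hashlib.sha1(core.encode("utf-8")).hexdigest()
    return safe


def message_filename(msgid):
    """File name in the proposed $msgid.html style."""
    return msgid_to_basename(msgid) + ".html"


def part_filename(msgid, part, ext):
    """File name in the proposed $msgid.$part.$ext style."""
    return f"{msgid_to_basename(msgid)}.{part}.{ext}"
```

Note that a '%' in a file name has to be escaped again when the name is used in a URL, which is one more reason a separate identifier-to-file table may be the simpler route.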

It is not easy.  The use of message numbers permeates the
code; i.e., a change in how filenames are generated requires modification
in multiple locations.  It will also have an impact on certain features:

    o NOSORT
    o Database recovery (something I have implemented for the next
      release).  With the current scheme, it is easy to determine
      which files are message files to extract information from.
    o Probably other stuff
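
As a rough illustration of why the numbered scheme makes message files easy to pick out (this is not the recovery code mentioned above, only a sketch assuming the msg$num.html pattern quoted earlier; the example file names are made up):

```python
import re

# Matches names of the form msg<digits>.html and captures the number.
MSG_FILE = re.compile(r"^msg(\d+)\.html$")


def message_numbers(filenames):
    """Return the sorted message numbers found in a directory listing."""
    found = []
    for name in filenames:
        m = MSG_FILE.match(name)
        if m:
            found.append(int(m.group(1)))
    return sorted(found)


# Hypothetical directory listing mixing message files with other pages.
listing = ["msg00002.html", "maillist.html", "msg00000.html",
           "gif00001.gif", "threads.html"]
numbers = message_numbers(listing)
```

With msgid-based names there is no such fixed prefix to match on, so some other marker or an external table would be needed to tell message files apart from index pages.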


Earl, does it make sense to make this bigger extension for mhonarc v2?

I'll look into it, but I do not know if it is an easy task.  MHonArc
is a good example of something that has grown beyond its original
design.

Or would it be better to wait and hack on an alpha of mhonarc v3 (whenever
that will be)?

Yes.  MHonArc v3 development has been extremely slow due to continuing
work on v2 (along with doing work that pays the bills).

Ah, before I get asked: If no message-id is given, mhonarc should/will
create its own id on the fly as it does now.

This will happen in the next release.  This is done to support
annotations (which, ironically, use message-ids as filenames).

     --ewh

----
             Earl Hood              | University of California: Irvine
      ehood(_at_)medusa(_dot_)acs(_dot_)uci(_dot_)edu      |      Electronic Loiterer
http://www.oac.uci.edu/indiv/ehood/ | Dabbler of SGML/WWW/Perl/MIME