mhonarc-users

Re: Removing Mssgs., Inconsistency

2008-05-26 17:25:49
On May 5, 2008 at 18:19, Douglas Kline wrote:

I wanted to expunge an accumulation of spam.  When other attempts left the
links to messages in the index file and in the message files pointing to the
wrong files, I decided to reconstitute the archive from scratch.  I started
with an empty directory and ran mhonarc on all of the spooling-type files of
messages.  Then I recompiled the list of spam messages, because I couldn't use
the previous list: the message numbers might be different.  Then I
converted the spam message file names to message numbers and ran

FYI, message number consistency is a known limitation with respect to
rebuilding archives.  I.e., if you rebuild an archive, but the
set of messages has changed from the original archive set, then
message numbers will not match.

mharc works around this problem by utilizing namazu's message-id
index to allow one to have a "permanent" location for a message.

with the list of message numbers as arguments.

That removed those files.  So far as I can tell, the links in the message
files to other message files are now correct.  It also re-wrote the
.mhonarc.db file.  So that part worked.

You may want to run some tests on the latest version of mhonarc.
IIRC, some of the logic for tagging things to update on message
removal was improved, though I do not know when (check the NEWS file).

The index files (date1.html, date2.html, auth1.html, thrd1.html, etc.) still
had references to the deleted spam messages.  So next I ran the command we run
routinely to incorporate new messages, with a dummy message, to re-write the
indices.  The dummy message was necessary because mhonarc won't act if it
doesn't find any new messages.  That worked too.

Have you tried -editidx?  It rewrites ALL archive pages.

Then I ran the scripts which compile the master indices (datedir.html,
authdir.html, thrddir.html) and that worked.

So what's the problem?  Some of the links in the message files to indices are
wrong.  They refer to non-existent date[0-9]*.html, auth[0-9]*.html, and
thrd[0-9]*.html files.  How can I fix that?

See comments above.  Later versions of mhonarc may fix this.

Also, -editidx should provide a brute-force way to correct the files.
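
To check whether any message pages still point at missing index pages, before
or after a rewrite, something like the following Python sketch may help.  It is
not part of mhonarc.  It assumes the message pages are named msg*.html, that
links are relative, double-quoted href attributes, and that index pages follow
the date/auth/thrd naming mentioned in this thread.  The function name is
made up for illustration.

```python
import re
from pathlib import Path

# Index page names as described in this thread: date1.html, auth2.html, thrd3.html, ...
# Assumption: links appear as relative, double-quoted href attributes.
INDEX_LINK = re.compile(r'href="((?:date|auth|thrd)\d+\.html)"', re.IGNORECASE)


def find_dangling_index_links(archive_dir):
    """Map each message page to the sorted list of missing index pages it links to."""
    archive = Path(archive_dir)
    existing = {p.name for p in archive.iterdir() if p.is_file()}
    dangling = {}
    # Assumption: message pages are named msg*.html; adjust the glob if yours differ.
    for page in sorted(archive.glob("msg*.html")):
        text = page.read_text(encoding="utf-8", errors="replace")
        missing = sorted({t for t in INDEX_LINK.findall(text) if t not in existing})
        if missing:
            dangling[page.name] = missing
    return dangling
```

An empty result means no message page refers to a date, auth, or thrd index
page that is absent from the directory.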

--ewh


Thanks for the suggestions, ewh.  "-editidx" worked.  I found that unlike most
mhonarc operations this had to be run from the directory with the .html files
rather than referencing that directory with the "-outdir" option.  We will
follow up on your suggestion of looking into a more recent version of mhonarc.
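
For anyone scripting this, here is a minimal Python sketch of the invocation
that worked here: "mhonarc -editidx" run with the archive directory as the
working directory and no "-outdir".  The function name and the runner
parameter are illustrative only, and whether "-outdir" works with "-editidx"
in other mhonarc versions is not something this sketch settles.

```python
import subprocess
from pathlib import Path


def run_editidx(archive_dir, runner=subprocess.run):
    """Run `mhonarc -editidx` from inside the archive directory."""
    archive = Path(archive_dir)
    if not archive.is_dir():
        raise NotADirectoryError(str(archive))
    # No -outdir here: the working directory is set to the archive itself,
    # matching what was reported to work in this thread.
    return runner(["mhonarc", "-editidx"], cwd=archive, check=True)
```

The runner parameter exists only so the call can be exercised without mhonarc
installed; by default it uses subprocess.run.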

Douglas Kline

========
Douglas M. Kline
kline(_at_)head(_dot_)cfa(_dot_)harvard(_dot_)edu
