On September 19, 2003 at 10:45, Cheryl Trooskin wrote:
If you have the desire to read 76k worth of "here's what I did", it's at
http://www.byz.org/~sev/Tech/integrating-mharc-and-mailman.html
Nice write up, and informative. My comments:
* The formail path problem you mention is a bug. It is now
bug #5437:
http://savannah.nongnu.org/bugs/?func=detailbug&bug_id=5437&group_id=1968
* The answer to your $time_fmt comment is that the catch archive is
out of the main loop, so $time_fmt does not contain an applicable
value, hence the hard-coded value. However, since one can
technically define lists.def options for the catch archive, it
would be reasonable to check if there is a period setting.
It is now bug #5439:
http://savannah.nongnu.org/bugs/?func=detailbug&bug_id=5439&group_id=1968
* I wish the procmail maintainers would fix the piping-out-of-memory
problem in version 3.22. I guess I need to consider whether some of
the work-arounds should be put into mharc, as you have done
with your mk-procmailrc patch.
Our esteemed list owner is fairly versed in procmail, so I may
solicit his comments on the matter.
* Re: mhonarc-specific namazu changes: If you use namazu 2.0.12, or
later, it will contain the mhonarc.pl filter I made changes to.
Also, I think there is something with older versions of namazu that
may cause problems (i.e. unexpected crashes during index creation
-- a reason I had to add namazu_cleanup() in web-archive). Therefore,
users should upgrade to 2.0.12, or later, if possible.
I do not know if it was ever determined what caused mknmz to
fail, but it was a pain since search indexes would fail to get updated.
I have not encountered this problem with namazu 2.0.12.
* Re: Note about perl versions: Perl 5.8.0 appears to be fairly
stable. The mhonarc.org archives use that version.
* Re: Note for NetBSD Package System users: I'm not familiar with
how NetBSD packages applications, but it would be nice if they at
least provided an "unstable" package (similar to how Debian
does things) for those that need a newer version.
Since MHonArc development and releases are done under Red Hat Linux,
I now provide RPM releases directly. If there is a way to build
packages for other OSs under Linux, I will consider providing
alternate packages directly.
* Re: Populating lib/lists.def: It does not appear to be clear in your
comments, but mk-procmailrc already does key off List-Post when
setting addresses via the Address: lists.def option. I also have
reservations about List-Id since it does not use an email address
format, and I have yet to see an RFC that formally defines List-Id.
Your comments about cross-posting must be specific to mailman
since mharc is designed to handle it (the reason for the ":c"
in procmail recipes). However, if you are referring to the case
where a message could have different headers added to it (like List-*
headers) when cross-posting, then mharc currently does not handle
that as one would desire. The first message that comes in wins,
so an archive for one list may have a message with the List-*
headers of another list.
Therefore, I can see a reason for disabling the message-id cache.
Maybe I can make this a configurable option. Then, for your
usage case, you can easily disable the cache and use
the following in lists.def for matching a given list:
    Name: listname
    Procmail-Condition: * ^List-Post:.*:listname@
    Final: 1
Adding Final is mainly a performance gain; if it is not present,
it just means that each message will always be tested against all
list matching rules. I think your patch to drop the ":c" is
not needed since you can achieve the same effect via lists.def.
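To get a feel for what the Procmail-Condition above would accept, here is a rough Python sketch that mimics it with the `re` module. It assumes procmail's default case-insensitive matching and treats Python's regex syntax as a close enough stand-in for procmail's egrep-style one; the helper name `matches_list` and the sample headers are made up for illustration.

```python
import re

def matches_list(headers: str, listname: str) -> bool:
    """Approximate the procmail condition '* ^List-Post:.*:listname@'."""
    # procmail conditions are case-insensitive by default; MULTILINE lets
    # '^' anchor at the start of any header line, not just the first one.
    pattern = r"^List-Post:.*:" + re.escape(listname) + "@"
    return re.search(pattern, headers, re.IGNORECASE | re.MULTILINE) is not None

# Hypothetical headers of a message delivered by one list.
sample = (
    "From: someone@example.com\n"
    "List-Post: <mailto:mharc-users@example.org>\n"
    "Subject: test\n"
)

print(matches_list(sample, "mharc-users"))
print(matches_list(sample, "other-list"))
```

Because the test looks only at List-Post, a cross-posted copy delivered by a different list would match that list's rule instead, which is the behaviour wanted once the message-id cache is out of the way.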
* I'm not familiar with mailman's "bin/list_lists -b", but it may be
possible to write a simple Perl script that auto-generates a template
mharc lists.def file based on its output. A more robust script would
allow one to update lists.def as new lists are created by mailman.
If there is some documentation about bin/list_lists somewhere,
I can assist with such a script if there is a desire to create one.
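As a starting point, such a generator might look like the sketch below (in Python here, though Perl would do just as well). It rests on assumptions: that "bin/list_lists -b" prints one bare list name per line, that lists.def entries are separated by a blank line, and that each list can be matched on List-Post as shown earlier. The function name `lists_def_template` is hypothetical.

```python
def lists_def_template(list_lists_output: str) -> str:
    """Build template lists.def entries from bare list names, one per line."""
    stanzas = []
    for line in list_lists_output.splitlines():
        name = line.strip()
        if not name:
            continue  # skip blank lines in the input
        # Escape dots so they match literally in the procmail condition.
        pattern_name = name.replace(".", r"\.")
        stanzas.append(
            f"Name: {name}\n"
            f"Procmail-Condition: * ^List-Post:.*:{pattern_name}@\n"
            "Final: 1\n"
        )
    # Assumption: entries are separated by a blank line.
    return "\n".join(stanzas)

# Made-up list names standing in for real list_lists output.
print(lists_def_template("mharc-users\nmhonarc-dev\n"))
```

Updating an existing lists.def as new lists appear would additionally need to read the current file and emit entries only for names not already present.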
* Your desire to put mbox files under the html directory was also
requested by someone else some time back. I believe that user
found an acceptable alternative solution.
I think the idea of having more flexibility in the placement of
mbox files (and even html archives) on a per list basis can be
a useful, and very powerful, feature. Such a feature could be
driven by lists.def. For example:
    Name: listname
    Mbox-Archive-Path: /path/to/directory/for/listname/mbox/files
    Html-Archive-Path: /path/to/directory/for/listname/html/archives
    ...
Supporting this would require changes to mk-procmailrc so that procmail
rules are defined properly, and to web-archive (especially when doing
file deletions on a rebuild).
* Couldn't your symlinks to the mbox files be more like this?
    $HTML_DIR/listname/mbox -> $MBOX_DIR/listname
I.e. In Perl code:
    symlink("$MBOX_DIR/$list","$htmldir/mbox")
Wouldn't this avoid the clunky URLs?:
    http://example.com/mailman/private/listname/mbox/2003-09
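The layout suggested above can be tried out in a scratch directory. The sketch below uses Python's os.symlink rather than Perl's symlink, and the function name `link_mbox_dir` and the temporary paths are purely illustrative; nothing here is taken from mharc itself.

```python
import os
import tempfile

def link_mbox_dir(mbox_dir: str, html_dir: str, listname: str) -> str:
    """Create HTML_DIR/listname/mbox -> MBOX_DIR/listname; return the link path."""
    target = os.path.join(mbox_dir, listname)
    link = os.path.join(html_dir, listname, "mbox")
    # Make sure the per-list html directory exists before linking into it.
    os.makedirs(os.path.dirname(link), exist_ok=True)
    if not os.path.islink(link):
        os.symlink(target, link)
    return link

# Try it in a throwaway directory tree with one fake monthly mbox file.
with tempfile.TemporaryDirectory() as root:
    mbox_dir = os.path.join(root, "mbox")
    html_dir = os.path.join(root, "html")
    os.makedirs(os.path.join(mbox_dir, "listname"))
    open(os.path.join(mbox_dir, "listname", "2003-09"), "w").close()
    link = link_mbox_dir(mbox_dir, html_dir, "listname")
    print(os.listdir(link))
```

With one directory-level link per list, monthly mbox files show up under the link as they are created, so no per-file symlinks need to be maintained.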
* I'm unsure if all the file permission stuff can become part of mharc,
but I have personally encountered a case where it would be nice to
tell mharc what permissions to use (at least something like UMASK
in mhonarc). I will consider adding something, but it may not
be extensive enough for your needs.
* Re: Stylesheet ...: You may want to use the Tip mentioned in
the mharc installation notes about how to handle custom layout
changes. The Tip is mentioned under "Archive Customizations" in
the installation doc. I am considering making the Tip a default
behavior of mharc. I.e. Change the MHA_MRC config.sh variable to
default to $SW_ROOT/lib/default.mrc, with the guarantee that mharc
will never make changes to it, only to common.mrc.
This method makes dealing with mharc upgrades much easier and
it isolates local customizations from mharc defaults.
Therefore, all of your resource editing recommendations would be done
in default.mrc.in and not in common.mrc.in.
* Your web-archive patches to support author and subject indexes
in top indexes are something I have considered, i.e. some more
generalized, MHonArc-ish way to customize top (and the all-lists)
index pages.
I should note that namazu makes the use of author and subject indexes
not that important (and I personally have found little use for them).
First, you will notice that the default navigation layout of message
pages provides quick links to get a list of other messages from
the same author and to get a list of messages with the same subject.
Of course, the usefulness of the author link may be minimal if
employing address obfuscation techniques.
From a general searching perspective, you can do the following
to get a list of messages written by a certain author:
    +from:author-name
Or, for a given subject:
    +subject:subject-text
You may need to add quotes around author-name and subject-text if
there are spaces.
Namazu's field based searching is one of the reasons I went with
namazu when developing mharc (along with its easier configuration
and direct support for mhonarc files).
I have considered defining a configuration directive that allows
you to tell mharc which indexes an archive has.
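For anyone generating the field-based queries mentioned above from a script (say, to build search links), a small helper could take care of the quoting. This is only a sketch: the name `field_query` is invented, and it assumes plain double quotes are all that a value containing spaces needs.

```python
def field_query(field: str, value: str) -> str:
    """Build a namazu field query such as +from:author-name."""
    value = value.strip()
    # Quote the value only when it contains whitespace.
    if any(ch.isspace() for ch in value):
        value = f'"{value}"'
    return f"+{field}:{value}"

print(field_query("from", "author-name"))
print(field_query("subject", "integrating mharc and mailman"))
```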
* The mbox-month-pack command does support a -msgsep option (it
appears I failed to document it). For example:
    mbox-month-pack -msgsep '^From \S+.*\d+:\d+:\d+' ...
This can probably be used to avoid any pre-processing steps. You should
also look at the MSGSEP resource setting in common.mrc.
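To check what a separator pattern like the one above would treat as the start of a message, it can be tried against sample text first. The Python sketch below reuses the same regular expression (the syntax is the same in Perl and Python for this pattern); `split_mbox` and the sample messages are invented for illustration and say nothing about how mbox-month-pack is implemented.

```python
import re

# Same pattern as in the -msgsep example, anchored at the start of each line.
MSGSEP = re.compile(r"^From \S+.*\d+:\d+:\d+", re.MULTILINE)

def split_mbox(text: str) -> list:
    """Split mbox text into messages at lines matching MSGSEP."""
    starts = [m.start() for m in MSGSEP.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[a:b] for a, b in zip(starts, ends)]

sample = (
    "From alice@example.com Fri Sep 19 10:45:00 2003\n"
    "Subject: one\n"
    "\n"
    "From here on, this body line is not a separator.\n"
    "From bob@example.com Fri Sep 19 11:02:13 2003\n"
    "Subject: two\n"
    "\n"
    "body\n"
)

messages = split_mbox(sample)
print(len(messages))
```

The body line starting with "From here" is not taken as a separator because it lacks the hh:mm:ss timestamp the pattern requires, which is the point of using a stricter pattern than a bare "^From ".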
Nice write-up.
There appears to be an implication that the mailman developers are
somewhat resistant to contributed changes. Is this your impression?
--ewh