mharc-users

mharc and mailman comments (was Re: file permissions)

2003-09-19 13:55:27
On September 19, 2003 at 10:45, Cheryl Trooskin wrote:

> If you have the desire to read 76k worth of "here's what I did", it's at
> http://www.byz.org/~sev/Tech/integrating-mharc-and-mailman.html

Nice write up, and informative.  My comments:

* The formail path problem you mention is a bug.  It is now
  bug #5437:
  http://savannah.nongnu.org/bugs/?func=detailbug&bug_id=5437&group_id=1968

* The answer to your $time_fmt comment is that the catch archive is
  out of the main loop, so $time_fmt does not contain an applicable
  value, hence the hard-coded value.  However, since one can
  technically define lists.def options for the catch archive, it
  would be reasonable to check if there is a period setting.

  It is now bug #5439:
  http://savannah.nongnu.org/bugs/?func=detailbug&bug_id=5439&group_id=1968

* I wish the procmail maintainers would fix the piping-out-of-memory
  problem in version 3.22.  I guess I need to consider whether some
  of the work-arounds should be put into mharc, as you have done
  with your mk-procmailrc patch.

  Our esteemed list owner is fairly versed in procmail, so I may
  solicit his comments on the matter.

* Re: mhonarc-specific namazu changes: If you use namazu 2.0.12, or
  later, it will contain the mhonarc.pl filter I made changes to.

  Also, I think there is something with older versions of namazu that
  may cause problems (i.e. unexpected crashes during index creation
  -- a reason I had to add namazu_cleanup() in web-archive).  Therefore,
  users should upgrade to 2.0.12, or later, if possible.

  I do not know if it was ever determined what caused mknmz to fail,
  but it was a pain since search indexes would fail to get updated.
  I have not encountered this problem with namazu 2.0.12.

* Re: Note about perl versions: Perl 5.8.0 appears to be fairly
  stable.  The mhonarc.org archives use that version.

* Re: Note for NetBSD Package System users: I'm not familiar with
  how NetBSD packages applications, so it would be nice if they at
  least provided an "unstable" package (similar to how Debian
  does things) for those that need a newer version.

  Since MHonArc development and releases are done under Red Hat Linux,
  I now provide RPM releases directly.  If there is a way to build
  packages for other OSs under Linux, I will consider providing
  alternate packages directly.

* Re: Populating lib/lists.def: It appears to not be clear in your
  comments, but mk-procmailrc already does key off List-Post when
  setting addresses via the Address: lists.def option.  I also have
  reservations about List-Id since it does not use an email address
  format, and I have yet to see an RFC that formally defines List-Id.

  Your comments about cross-posting must be specific to mailman
  since mharc is designed to handle it (the reason for the ":c"
  in procmail recipes).  However, if you are referring to the case
  where a message could have different headers added to it (like List-*
  headers) when cross-posting, then mharc currently does not handle
  that as one would desire.  The first message that comes in wins;
  therefore, an archive for one list may have a message with the
  List-* headers of another list.

  Therefore, I can see a reason for disabling the message-id cache.
  Maybe I can make this a configurable option.  Then, for your
  use case, you could simply disable the cache and use
  the following in lists.def for matching a given list:

    Name: listname
    Procmail-Condition: * ^List-Post:.*:listname@
    Final: 1

  Adding Final is mainly a performance gain; if it is not present,
  each message will simply be tested against all list matching
  rules.  I think your patch to drop the ":c" is not needed since
  you can achieve the same effect via lists.def.
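
  To illustrate how such a condition selects a list, here is a rough
  Python sketch (not mharc code).  It mimics the List-Post condition
  above with an ordinary regular expression, matched case-insensitively
  as procmail does by default.  The list names and sample headers are
  made up for the example.

```python
import re

# Hypothetical list names, used only for illustration.
LISTS = ["mharc-users", "mhonarc-dev"]

def condition_regex(listname):
    # Mirrors the lists.def condition "* ^List-Post:.*:listname@".
    # procmail matches case-insensitively unless told otherwise.
    return re.compile(
        r"^List-Post:.*:" + re.escape(listname) + "@",
        re.IGNORECASE | re.MULTILINE,
    )

def matching_lists(header_text, lists=LISTS):
    # Return every list whose condition matches the message headers.
    return [name for name in lists if condition_regex(name).search(header_text)]

headers = (
    "From: someone@example.com\n"
    "List-Post: <mailto:mharc-users@example.org>\n"
    "Subject: test\n"
)
print(matching_lists(headers))
```

  The ":" in front of the list name is what keeps a list called
  "users" from also matching "mharc-users".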

* I'm not familiar with mailman's "bin/list_lists -b", but it may be
  possible to write a simple Perl script that auto-generates a template
  mharc lists.def file based on its output.  A more robust script would
  allow one to update lists.def as new lists are created by mailman.

  If there is some documentation about bin/list_lists somewhere,
  I can assist with such a script if there is a desire to create one.
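
  As a starting point, here is a minimal sketch of such a generator,
  written in Python rather than Perl.  It assumes that
  "bin/list_lists -b" prints one bare list name per line and that
  lists.def stanzas are separated by blank lines; both are assumptions
  on my part.  The stanza layout simply reuses the Name,
  Procmail-Condition, and Final options shown earlier.

```python
def lists_def_stanza(listname):
    # One lists.def stanza keyed off the List-Post header.
    # Note: regex metacharacters in list names (e.g. ".") are not escaped here.
    return "\n".join([
        f"Name: {listname}",
        f"Procmail-Condition: * ^List-Post:.*:{listname}@",
        "Final: 1",
    ])

def generate_lists_def(bare_output):
    # bare_output: text assumed to hold one list name per line.
    names = [line.strip() for line in bare_output.splitlines() if line.strip()]
    return "\n\n".join(lists_def_stanza(name) for name in names) + "\n"

# Hypothetical list names standing in for real list_lists output.
sample = "announce\ndev-talk\n"
print(generate_lists_def(sample), end="")
```

  A more robust version would read the names from standard input and
  merge them into an existing lists.def instead of printing a fresh
  template.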

* Your desire to put mbox files under the html directory was, I
  believe, requested by someone else some time back.  I think that
  user found an acceptable alternative solution.

  I think the idea of having more flexibility in the placement of
  mbox files (and even html archives) on a per list basis can be
  a useful, and very powerful, feature.  Such a feature could be
  driven by lists.def.  For example:

    Name: listname
    Mbox-Archive-Path: /path/to/directory/for/listname/mbox/files
    Html-Archive-Path: /path/to/directory/for/listname/html/archives
    ...

  Such changes would require changes to mk-procmailrc so procmail
  rules are defined properly, and there must be changes to web-archive
  (especially when doing file deletions on a rebuild).
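
  A rough Python sketch of how per-list paths might be resolved.
  Mbox-Archive-Path and Html-Archive-Path are only the proposed
  option names from the example above, not existing mharc options,
  and the default directories are placeholders.  The fallback layout
  (one directory per list under the mbox and html roots) is an
  assumption.

```python
# Placeholder roots; real values would come from mharc's configuration.
MBOX_DIR = "/home/mharc/mbox"
HTML_DIR = "/home/mharc/html"

def parse_stanza(text):
    # Parse the "Key: value" lines of one lists.def-style stanza into a dict.
    options = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            options[key.strip()] = value.strip()
    return options

def archive_paths(options):
    # Use the per-list override when present, else the assumed default layout.
    name = options["Name"]
    mbox = options.get("Mbox-Archive-Path", f"{MBOX_DIR}/{name}")
    html = options.get("Html-Archive-Path", f"{HTML_DIR}/{name}")
    return mbox, html

stanza = "Name: listname\nMbox-Archive-Path: /srv/listname/mbox\n"
print(archive_paths(parse_stanza(stanza)))
```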

* Couldn't your symlinks to the mbox files be more like this?

    $HTML_DIR/listname/mbox -> $MBOX_DIR/listname

  I.e. In Perl code:

    symlink("$MBOX_DIR/$list","$htmldir/mbox")

  Wouldn't this disable the clunky URLs?:

    http://example.com/mailman/private/listname/mbox/2003-09

* I'm unsure if all the file permission stuff can become part of mharc,
  but I have personally encountered a case where it would be nice to
  tell mharc what permissions to use (at least something like UMASK
  in mhonarc).  I will consider adding something, but it may not
  be extensive enough for your needs.

* Re: Stylesheet ...:  You may want to use the Tip mentioned in
  the mharc installation notes about how to handle custom layout
  changes.  The Tip is mentioned under "Archive Customizations" in
  the installation doc.  I am considering making the Tip a default
  behavior of mharc.  I.e. Change the MHA_MRC config.sh variable to
  default to $SW_ROOT/lib/default.mrc, with the guarantee that mharc
  will never make changes to it, only to common.mrc.

  This method makes dealing with mharc upgrades much easier and
  it isolates local customizations from mharc defaults.

  Therefore, all of the resource editing you recommend would be done
  in default.mrc.in and not in common.mrc.in.

* Your web-archive patches to support author and subject indexes
  in top indexes are something I have considered, i.e., some more
  generalized, MHonArc-ish way to customize the top (and the
  all-lists) index pages.

  I should note that namazu makes the use of author and subject indexes
  not that important (and I personally have found little use for them).
  First, you will notice that the default navigation layout of message
  pages provides quick links to get a list of other messages from
  the same author and to get a list of messages with the same subject.
  Of course, the usefulness of the author link may be minimal if
  employing address obfuscation techniques.

  From a general searching perspective, you can do the following
  to get a list of messages written by a certain author:

    +from:author-name

  Or, for a given subject:

    +subject:subject-text

  You may need to add quotes around author-name and subject-text if
  there are spaces.
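
  For anyone building such queries from a script, here is a small
  Python sketch of the quoting rule just described.  The function
  name is my own invention; it only assembles the query string and
  does not talk to namazu.

```python
def field_query(field, text):
    # Quote the search text only when it contains whitespace.
    if any(ch.isspace() for ch in text):
        text = f'"{text}"'
    return f"+{field}:{text}"

print(field_query("from", "author-name"))
print(field_query("subject", "file permissions"))
```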

  Namazu's field based searching is one of the reasons I went with
  namazu when developing mharc (along with its easier configuration
  and direct support for mhonarc files).

  I have considered defining a configuration directive that allows
  you to tell mharc which indexes an archive has.

* The mbox-month-pack command does support a -msgsep option (it
  appears I failed to document it).  For example:

    mbox-month-pack -msgsep '^From \S+.*\d+:\d+:\d+' ...

  This can probably be used to avoid any pre-process steps.  You should
  also look at the MSGSEP resource setting in common.mrc.
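
  To see what the separator pattern above accepts, here is a quick
  Python check.  The pattern is a Perl regular expression, but this
  particular one behaves the same in Python.  The sample mbox text is
  made up, and applying the pattern line by line is my assumption
  about how the separator is used.

```python
import re

# Same pattern as in the -msgsep example, applied at the start of each line.
MSGSEP = re.compile(r"^From \S+.*\d+:\d+:\d+", re.MULTILINE)

def count_messages(mbox_text):
    # Count lines that look like mbox "From " separators.
    return len(MSGSEP.findall(mbox_text))

sample = (
    "From alice@example.com Fri Sep 19 10:45:00 2003\n"
    "Subject: one\n\n"
    "From the start, this body line is not a separator.\n\n"
    "From bob@example.com Fri Sep 19 13:55:27 2003\n"
    "Subject: two\n\nbody\n"
)
print(count_messages(sample))
```

  Requiring a time stamp on the line is what keeps a body line that
  merely starts with "From " from being treated as a separator.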


Nice write-up.

There appears to be an implication that the mailman developers are
somewhat resistant to contributed changes.  Is this your impression?

--ewh

---------------------------------------------------------------------
To sign-off this list, send email to majordomo(_at_)mhonarc(_dot_)org with the
message text UNSUBSCRIBE MHARC-USERS
