mharc Installation

This document describes how to install and configure mharc on your system.
It is highly recommended to read through this document in its entirety
before performing any of the installation steps.

Table of Contents

  * Where to Get Help
  * Nomenclature
  * Audience
  * Introduction
  * Dependencies
      + Platforms
      + Software
  * Extracting mharc
  * Configuring mharc
  * Maintenance Operations
  * Post Install Checks
  * Archive Customizations
  * Applying Software Updates
  * MH/nmh Support
  * Managing List Administration Messages
      + mharc-based Solution
      + Procmail Solution
  * Limitations

---------------------------------------------------------------------------

 Where to Get Help

If you have problems installing mharc, a mailing list exists for mharc
users to provide a forum for help and general discussion. Information about
the list, and how to subscribe, is provided in Contacts.

If you want professional help to set up your list archives, please send a
message to mhonarc@mhonarc.org with your request.

---------------------------------------------------------------------------

 Nomenclature

Shell examples are rendered as follows:

prompt> ls -CF
bin/      etc/        images/  log/      msgid.cache       README
cgi-bin/  .htaccess@  info/    Makefile  NEWS              TODO
COPYING   html/       lib/     mbox/     procmailrc.mharc  VERSION

The text prompt> represents your shell prompt. Text that follows the prompt
on the same line is what you would type into the shell. Any other text is
example output generated by the computer.

---------------------------------------------------------------------------

 Audience

This document's intended audience is people familiar with working in
Unix-type environments. The following prerequisite knowledge is beneficial:

  * Knowing how to start and use a command-line shell.
   
  * Experience using any one of the many text editors available on your
    system. Vi and emacs tend to be the most common, but some GUI
    environments also provide text editors.
   
  * Familiarity with cron, a daemon that executes scheduled commands. You
    should know how to register crontab entries. If you are not familiar
    with cron, see the crontab(1) and crontab(5) manpages.
   
---------------------------------------------------------------------------

 Introduction

mharc is designed to be independent of the mailing list management
software. Therefore, if changes are made to list management software, mharc
is unaffected. It also allows a division of labor between those who manage
the lists and those who manage the archives.

The typical usage model for mharc is to have a special archive user account
to do all mharc-related duties. This account will be subscribed to all
mailing lists you want archived. For example, say the account you will use
is "mailarch" and the mail address for the account is
"mailarch@example.com". For each list you want archived, you subscribe the
address "mailarch@example.com" to that list.

mharc stays independent of the list software, since it acts like any other
subscriber. Once mharc is installed and configured, it will process all
incoming mail and filter the mail into the various archives you have
defined.

NOTE: Tips on how to handle list administration messages, like subscribe
      confirmations, are provided in Managing List Administration Messages.

The rest of this document assumes this type of usage model. However, due to
the configurable nature of mharc, alternate usage models are possible.

---------------------------------------------------------------------------

 Dependencies

Platforms

mharc should run on any Unix-like operating system. If using a Win32
system, you will probably need to install Cygwin.

Software

mharc requires the following software:

  * MHonArc
    <http://www.mhonarc.org/>:
   
    MHonArc is used to convert messages into HTML and provide the periodic
    date and thread indexes. It also allows customization of archive page
    layout. v2.5.12, or later, is recommended.
   
  * Procmail
    <http://www.procmail.org/>:
   
    Procmail is used for the pre-filtering of incoming mail into the raw
    mail archives. Note, the program lockfile is also needed, which is part
    of the standard procmail distribution, but some Unix distributions may
    include it in a separate package.
   
  * Namazu
    <http://www.namazu.org/>:
   
    Namazu is used to provide searching. mharc also takes advantage of
    Namazu's awareness of MHonArc message pages to provide some useful
    archive navigational aids.
   
  * Perl
    <http://www.perl.com/>:
   
    mharc scripts are written in Perl. MHonArc also requires Perl.
   
  * make:
   
    The make program is not strictly required, but the master Makefile
    provides a convenient interface to invoking the various mharc scripts.
    GNU make is recommended, but other variations should work also. make is
    generally provided on all Unix-like distributions.
   
    NOTE: Some systems may have installed GNU make as gmake. If this is the
          case, any time make is referenced in this document, replace it
          with gmake.
   
To quickly check if you have these packages installed, enter the following
command at your shell prompt:

prompt> which perl mhonarc namazu procmail make
/usr/local/bin/perl
/usr/local/bin/mhonarc
/usr/local/bin/namazu
/usr/bin/procmail
/usr/bin/make

If the command returns a negative response for some of the programs, it
does not necessarily indicate that the given program is not installed. It
just indicates that it is not located in your shell's search path. Good
places to look are in /usr/local/bin and /usr/bin.
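
As a rough alternative to which, the sketch below uses the shell's built-in
"command -v" to report each program, including lockfile, and flags the ones
that are not in the search path. The function name check_tools is made up
for this example and is not part of mharc.

```shell
# check_tools NAME...
# Print where each program is found, or flag it as missing from PATH.
# (Hypothetical helper, not shipped with mharc.)
check_tools() {
    for tool in "$@"; do
        if path=$(command -v "$tool" 2>/dev/null); then
            printf '%-10s %s\n' "$tool" "$path"
        else
            printf '%-10s NOT FOUND in PATH\n' "$tool"
        fi
    done
}

check_tools perl mhonarc namazu procmail lockfile make
```

As with which, a "NOT FOUND" line only means the program is not in the
search path of the current shell.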

NOTE: Under Solaris, make is typically located in /usr/ccs/bin. Also, if   
      the optional software CD is installed, many free software packages   
      will be installed in /opt/sfw/bin, including GNU make, which is      
      called gmake.                                                        

If you cannot locate any of the above programs and are not sure what is
installed on your system, contact your system administrator (and while you
are at it, you may want to ask your sys admin to install this software for
you :-).

If you know that a given package is not installed, follow the instructions
provided for the given package to install it.

---------------------------------------------------------------------------

 Extracting mharc

After MHonArc, Namazu, and Procmail have been installed, you can extract
the tar bundle wherever you want the software installed, which you probably
already did if you are reading this file. If you extracted the bundle into
a temporary location, you can re-extract it to your preferred location.

NOTE: Usually, the software is executed by an archive user account that is 
      subscribed to the lists that you want archived. It is recommended to 
      be logged into that account when installing this software.           

CAUTION: mharc should not operate under a privileged user account, like
         root, since doing so may open up security vulnerabilities.

---------------------------------------------------------------------------

 Configuring mharc

After you extract the tar bundle, run the following command:

prompt> make configure
./bin/apply-config -verbose
/bin/cp common.mrc.in.dist common.mrc.in
/bin/chmod u+w common.mrc.in
Processing "./bin/../dist/mharc/lib/common.mrc.in"
...

Then, you should edit lib/config.sh to suit your local settings and rerun

prompt> make configure

again to apply your changes.

NOTE: Make sure to review all variable settings in lib/config.sh. Proper   
      values are critical for the archiving system to work properly.       

Now, edit lib/lists.def to define the mailing lists you want archived.
Syntax of the file is documented in the mk-procmailrc manpage. After
editing, run the following command:

prompt> make

This should generate a file called procmailrc.mharc that will do the
initial filtering of mail. At any time, if you edit lib/lists.def, you can
rerun "make" to regenerate the procmailrc.mharc file to reflect your
changes.

The final step is to edit the archive user account's crontab to register
the mail archiving scripts with cron in order to get automatic processing
of your archives. The file etc/crontab can serve as a template of the
crontab entries you should add. Generally, you can copy etc/crontab
verbatim into the crontab for your archive user account. Otherwise, you can
edit etc/crontab.in and run

prompt> make configure

to create an etc/crontab file suitable for copying into your real crontab.
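
If the archive account already has crontab entries, installing etc/crontab
directly with the crontab command would replace them. The following is a
minimal sketch, not part of mharc, of one way to append the template to the
existing entries so the result can be reviewed first. The function name
merged_crontab and the file names crontab.cur and crontab.new are made up
for illustration.

```shell
# merged_crontab CURRENT TEMPLATE
# Print the entries in CURRENT followed by those in TEMPLATE.
# (Hypothetical helper, not shipped with mharc.)
merged_crontab() {
    cat "$1"
    echo "# --- mharc entries ---"
    cat "$2"
}

# Possible use, run as the archive user from the mharc root:
#   crontab -l > crontab.cur 2>/dev/null
#   merged_crontab crontab.cur etc/crontab > crontab.new
#   crontab crontab.new
```

Review crontab.new before installing it, since running the steps twice
would add the mharc entries twice.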

---------------------------------------------------------------------------

 Maintenance Operations

Manual maintenance can be done via the Makefile provided. If you run the
command,

prompt> make help
Targets available:
  (default)     Generate procmailrc.mharc from ./lib/lists.def.
  configure:    Apply ./lib/config.sh settings.
  disable:      Disable automated processing of new messages.
  editidx:      Edit all mhonarc archive pages.
  editidxonly:  Edit only mhonarc archive index pages.
  editrootidx:  Edit only top period index pages.
  enable:       Enable automated processing of new messages.
  help:         This message.
  readmail:     Process mail spool.
  rebuild:      Rebuild archives from raw message data.
  rootidx:      Regenerated top index for archives.
...

You will get a summary of what targets are available. Targets exist to
manually invoke the mail spool processing, to recreate the entire HTML
archives, and other administrative tasks.

Some administrative tasks will disable auto-message processing, and a
message should be displayed when this happens. You can run:

prompt> make enable
=============================================================
!!! Auto-archive processing is ENABLED !!!
=============================================================

to re-enable auto-message processing.
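
When maintenance targets are run by hand, it is easy to forget to re-enable
processing afterward. The following is a minimal sketch of a wrapper that
disables processing, runs one target, and always runs "make enable" at the
end. The function name run_maintenance is hypothetical, and whether an
explicit "make disable" is needed before a given target is an assumption;
check the messages the Makefile prints.

```shell
# run_maintenance TARGET
# Run one Makefile target with automated processing switched off, then
# switch it back on whether or not the target succeeded.
# (Hypothetical helper, not shipped with mharc.)
run_maintenance() {
    target=$1
    make_cmd=${MAKE:-make}    # set MAKE=gmake if GNU make is named gmake
    "$make_cmd" disable || return 1
    "$make_cmd" "$target"
    status=$?
    "$make_cmd" enable
    return "$status"
}

# Possible use, from the mharc root:
#   run_maintenance rebuild
```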

---------------------------------------------------------------------------

 Post Install Checks

  * The Perl scripts contained in mharc assume the perl executable is
    located at /usr/local/bin/perl. If perl is installed, but in a
    different location, you can create a symbolic link from
    /usr/local/bin/perl to the installed location of the perl executable.
    If you do not have permissions to do this, you will need to change the
    initial #! line in all the Perl scripts to reflect the location of
    perl.
   
  * mharc calls MHonArc via its library API. If you installed the MHonArc
    library files in a directory other than perl's site library location,
    make sure that directory is in perl's library search path. You may
    need to set the PERL5LIB environment variable to do this.
   
    NOTE: The "make configure" command mentioned earlier will automatically
          check if MHonArc can be loaded. If not, the command will generate
          an error message indicating what you can do to fix the problem.  
   
  * Double check the URL to the namazu.cgi program. A useful tip is to copy
    the namazu.cgi program into the cgi-bin of the mharc installation.
   
  * The file etc/apache.conf provides sample configuration directives for
    the Apache HTTP server that may be useful. If the default settings are
    not sufficient for your needs, you can edit etc/apache.conf.in and run
   
    prompt> make configure
   
    to generate a new etc/apache.conf that can be used in your Apache
    server configuration files.
   
    If you are on a system where you do not have access to the Apache
    server configuration, an etc/.htaccess file can be used to provide
    local configuration settings.
   
    To use this file, copy the generated etc/.htaccess file to the root of
    the installation when "make configure" is done, or create a symlink to
    it by executing the following command from the installation root:
   
    prompt> ln -s ./etc/.htaccess
   
    This way, you do not have to re-copy each time you make changes to this
    file.
   
  * Make sure your HTTP server allows the execution of CGI programs that
    are denoted with the filename extension ".cgi", or specify that the
    cgi-bin directory of the mharc installation is a CGI executable
    directory.
   
    NOTE: If you are using the .htaccess method to control access to your
          files, you may need to create a .htaccess in the cgi-bin
          directory with the following setting:
                                                                           
          Options +ExecCGI                                                 
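
For the first check in the list above, the following sketch reports scripts
whose initial #! line names an interpreter that does not exist on this
system. The function name check_shebangs is made up for this example, and
the assumption that the scripts to check live in bin/ and cgi-bin/ should
be adjusted to your installation.

```shell
# check_shebangs FILE...
# Report files whose "#!" line points at a missing interpreter.
# (Hypothetical helper, not shipped with mharc.)
check_shebangs() {
    for script in "$@"; do
        [ -f "$script" ] || continue
        # Extract the interpreter path from a first line such as
        # "#!/usr/local/bin/perl -w"; empty if there is no "#!" line.
        interp=$(sed -n '1s/^#! *\([^ ]*\).*/\1/p' "$script")
        if [ -n "$interp" ] && [ ! -x "$interp" ]; then
            echo "$script: $interp not found"
        fi
    done
}

# Run from the mharc root; the directory names are an assumption.
check_shebangs bin/* cgi-bin/*
```

No output means every #! line found refers to an existing executable.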
   
---------------------------------------------------------------------------

 Archive Customizations

Most of the files that control the appearance of the generated archive
pages are produced from template files with the extension ".in". It is
recommended to edit the ".in" version of files and execute the "make
configure" command to apply your changes.

NOTE: You must run                                                    
                                                                      
      prompt> make configure                                          
                                                                      
      to have mharc recognize any changes you made to a template file.

The ".in.dist" files are versions of the templates as defined by the base
distribution. These will be overwritten when updating the software and
mainly serve as a starting basis for your custom template files. If you
ever want to revert to the ".in.dist" version of a file, just remove the
".in" version and rerun

prompt> make configure

The main MHonArc resource file is lib/common.mrc. To make changes, make
edits to lib/common.mrc.in and run

prompt> make configure

to generate a new lib/common.mrc. You can use @@VARIABLE_NAME@@ references
in lib/common.mrc.in to refer to variables defined in lib/config.sh.
However, this is normally not required since the bin/web-archive program
will pre-define various MHonArc resource variables that reflect settings in
lib/config.sh. See the MHonArc documentation for more information on how to
edit MHonArc resource files.

TIP: To help ease the maintenance of MHonArc resource settings, especially 
     during mharc upgrades, set the MHA_RC variable in lib/config.sh to    
     something like the following:                                         
                                                                           
     # Pathname to main MHonArc resource file.                             
     MHA_RC=$SW_ROOT/lib/default.mrc                                       
                                                                           
     Then create the file $SW_ROOT/lib/default.mrc.in (make note that the  
     file ends with a ".in" extension) with the following contents:        
                                                                           
     <Include>                                                             
     @@SW_ROOT@@/lib/common.mrc                                            
     </Include>                                                            
                                                                           
     <!-- ... customized settings here ... -->                             
                                                                           
     And run                                                               
                                                                           
     prompt> make configure                                                
                                                                           
     Any time you want to make changes, make sure to edit
     $SW_ROOT/lib/default.mrc.in and rerun
                                                                           
     prompt> make configure                                                
                                                                           
     Now, any time you upgrade mharc, and mharc contains a new, improved
     lib/common.mrc.in.dist, and you want the new settings to be applied to
     your archives, you just need to remove lib/common.mrc.in and run
                                                                           
     prompt> make configure                                                
                                                                           
     The customized settings in lib/default.mrc are left untouched.

     Doing the above avoids any messy merging of changes in a new
     lib/common.mrc.in.dist into your customized version of
     lib/common.mrc.in.

---------------------------------------------------------------------------

 Applying Software Updates

The software is structured to avoid screwing up an existing installation.
All you need to do is extract the newer version of the bundle in the same
location of the initial installation. All the ".in.dist" files will get
overwritten, but any ".in" files should be left untouched in order to
preserve any local edits.
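
After extracting an update, it can help to know which ".in" files differ
from their ".in.dist" counterparts, since those are the ones that carry
local edits or have fallen behind the new distribution. The following is a
minimal sketch; the function name list_customized is made up for this
example and is not part of mharc.

```shell
# list_customized [DIR]
# Print each ".in" file under DIR (default: current directory) whose
# contents differ from the matching ".in.dist" file.
# (Hypothetical helper, not shipped with mharc.)
list_customized() {
    find "${1:-.}" -name '*.in.dist' | while IFS= read -r dist; do
        local_copy=${dist%.dist}
        if [ -f "$local_copy" ] && ! cmp -s "$dist" "$local_copy"; then
            echo "$local_copy"
        fi
    done
}

# Run from the mharc root.
list_customized .
```

Comparing a listed file against its ".in.dist" version with diff shows what
would be lost by removing the ".in" file.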

TIP: If you ever want to use a new, or revert to, a ".in.dist" version of
     a file, just remove the ".in" version and rerun
                                                                           
     prompt> make configure                                                

---------------------------------------------------------------------------

 MH/nmh Support

A program called mh-month-pack is provided with mharc for sites that
already have an MH/nmh-based mail filtering setup (either done manually or
automatically). This program can be used to import older mail into mharc or
to serve as a replacement for the filter-spool step if you want to stick
with using MH/nmh to handle incoming mail.

If you do this, you will have to modify etc/crontab.in to no longer use 
read-mail, but to call mh-month-pack (or some custom script that uses 
mh-month-pack) followed by a call to web-archive.

TIP: You may just want to create a variant version of read-mail that calls
     mh-month-pack instead of filter-spool. Make sure to give your version
     a name other than read-mail, because read-mail will be overwritten
     during mharc upgrades.

---------------------------------------------------------------------------

 Managing List Administration Messages

Most mailing list management software sends administration messages to
users. Examples are subscribe confirmations and subscribe reminders. This
could be a potential problem since there is a risk that such messages could
show up in the regular archives, exposing sensitive information like
subscription passwords.

mharc-based Solution

One possible mharc-based solution is to create a special archive to capture
such messages. For example, the following can be added to the very
beginning of lib/lists.def:

Name: .listsadmin
Description: Lists Admin Messages
From-Address: majordomo@
From-Address: mailman-owner@
From-Address: .*-request@
From-Address: .*-help@
Final: 1

This must occur at the beginning of the file since the filtering rules are
processed from top to bottom. Since the Final option is set, if any message
matches, no further processing is performed.

Since .listsadmin starts with a dot, it will be hidden from the
all-archives list. But since it is still possible for someone to reach it
by entering its URL manually, it is best to restrict access to it via HTTP
server configuration (remember to restrict both the raw and html archives).

Now, all you have to do is check the .listsadmin archive occasionally to
see if anything important has been received, like subscription
confirmations that need to be responded to.

Procmail Solution

If procmail is your local delivery agent, you can pre-filter all incoming
mail before mharc ever sees it. You can create a .procmailrc file in the
archive user's home directory and add rules that forward all list admin
messages to a real person. The .procmailrc may look something like this:

:0
* (^From:(.*[^-a-zA-Z0-9_.])?(majordomo@|mailman-owner@|.*-request@|.*-help@))
! real-person@example.com
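
Before relying on a recipe like this, it can be useful to try its pattern
against a few sample From: lines. The sketch below approximates the
condition with "grep -E -i", since procmail matches case-insensitively by
default. procmail's regular expressions are not identical to grep's, so
treat the result as a rough check only. The function name is_admin_sender
and the sample addresses are made up for this example.

```shell
# is_admin_sender HEADER
# Succeed if HEADER looks like a From: line from list management software.
# (Hypothetical helper; approximates the procmail condition with grep.)
is_admin_sender() {
    printf '%s\n' "$1" | grep -E -i -q \
        '^From:(.*[^-a-zA-Z0-9_.])?(majordomo@|mailman-owner@|.*-request@|.*-help@)'
}

for hdr in 'From: majordomo@example.com' \
           'From: Jane Doe <jane@example.com>'; do
    if is_admin_sender "$hdr"; then
        echo "forward: $hdr"
    else
        echo "archive: $hdr"
    fi
done
```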

---------------------------------------------------------------------------

 Limitations

  * The archive search forms rely on some Javascript to pass around the
    Namazu index name since the namazu.cgi program currently does not
    provide any namazu template variable for the index name. Hopefully,
    this limitation of namazu will be removed in the future so the use of
    Javascript can be removed.
   
    If Javascript is disabled, or not supported, in a web client, initial
    searches from an archive page will work, but trying to do another
    search from the results page will always return no hits.
   
---------------------------------------------------------------------------

$Date: 2002/09/15 03:36:42 $
mharc
Copyright © 2002, Earl Hood, earl@earlhood.com

