This document describes how to install and configure mharc on your system. It is highly recommended to read through this document in its entirety before performing any of the installation steps.
If you have problems installing mharc, a mailing list exists for mharc users to provide a forum for help and general discussion. Information about the list, and how to subscribe, is provided in Contacts.
If you want professional help setting up your list archives, please send a message to mhonarc@mhonarc.org with your request.
Shell examples are rendered as follows:
prompt> ls -CF
bin/      etc/        images/  log/      msgid.cache       README
cgi-bin/  .htaccess@  info/    Makefile  NEWS              TODO
COPYING   html/       lib/     mbox/     procmailrc.mharc  VERSION
The text prompt> represents your shell prompt. Text you would type into the shell is rendered like this: text you enter. Any other text is example output generated by the computer.
This document's intended audience is people familiar with working in Unix-type environments. The following prerequisite knowledge is beneficial:
Know how to start and use a command-line shell.
Familiar with cron, a daemon to execute scheduled commands. You should know how to register crontab entries. If you are not familiar with cron, see the crontab(1) and crontab(5) manpages.
mharc is designed to be independent of the mailing list management software. Therefore, if changes are made to list management software, mharc is unaffected. Also, it allows a division of labor on who manages the lists and who manages the archives.
The typical usage model for mharc is to have a special archive user account to do all mharc-related duties. This account will be subscribed to all mailing lists you want archived. For example, say the account you will use is "mailarch" and the mail address for the account is "mailarch@example.com". For each list you want archived, you will subscribe the address "mailarch@example.com" to each list.
mharc stays independent of the list software, since it acts like any other subscriber. Once mharc is installed and configured, it will process all incoming mail and filter the mail into the various archives you have defined.
NOTE: Tips on how to handle list administration messages, like subscribe confirmations, are provided in Managing List Administration Messages.
The rest of this document assumes this type of usage model. However, due to the configurable nature of mharc, alternate usage models are possible.
mharc should run on any Unix-like operating system. If using a Win32 system, you will probably need to install Cygwin.
mharc requires the following software:
MHonArc is used to convert messages into HTML and provide the periodic date and thread indexes. It also allows customization of archive page layout. v2.5.12, or later, is recommended.
Procmail is used for the pre-filtering of incoming mail into the raw mail archives. Note, the program lockfile is also needed, which is part of the standard procmail distribution, but some Unix distributions may include it in a separate package.
Namazu is used to provide searching. mharc also takes advantage of Namazu's awareness of MHonArc message pages to provide some useful archive navigational aids.
mharc scripts are written in Perl. MHonArc also requires Perl.
The make program is not strictly required, but the master Makefile provides a convenient interface to invoking the various mharc scripts. GNU make is recommended, but other variations should work also. make is generally provided on all Unix-like distributions.
NOTE: Some systems may have installed GNU make as gmake. If this is the case, replace make with gmake wherever it is referenced in this document.
To quickly check if you have these packages installed, enter the following command at your shell prompt:
prompt> which perl mhonarc namazu procmail make
/usr/local/bin/perl
/usr/local/bin/mhonarc
/usr/local/bin/namazu
/usr/bin/procmail
/usr/bin/make
If the command returns a negative response for some of the programs, it does not necessarily indicate that the given program is not installed. It just indicates that it is not located in your shell's search path. Good places to look are in /usr/local/bin and /usr/bin.
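As an alternative to which, the check can be sketched as a small shell function. The function name check_tools is made up for this illustration and is not part of mharc. It relies only on the standard "command -v" shell builtin and reports each program as found or missing on the current search path.

```shell
# Report which of the named programs are not on the current PATH.
# Returns 0 if all were found, 1 otherwise.
# check_tools is a hypothetical helper, not something mharc provides.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf 'found:   %s\n' "$tool"
        else
            printf 'missing: %s\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example (lockfile ships with procmail; use gmake instead of make if needed):
#   check_tools perl mhonarc namazu procmail lockfile make
```

As with which, a "missing" result only means the program is not in the search path, not that it is absent from the system.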
NOTE: Under Solaris, make is typically located in /usr/ccs/bin. Also, if the optional software CD is installed, many free software packages will be installed in /opt/sfw/bin, including GNU make, which is called gmake.
If you cannot locate any of the above programs and are not sure what is installed on your system, contact your system administrator (and while you're at it, you may want to ask your sys admin to install this software for you :-).
If you know that a given package is not installed, follow the instructions provided for the given package to install it.
After MHonArc, Namazu, and Procmail have been installed, you can extract the tar bundle wherever you want the software installed, which you probably already did if you're reading this file. If you extracted the bundle into a temporary location, you can re-extract it to your preferred location.
NOTE: Usually, the software is executed by an archive user account that is subscribed to the lists that you want archived. It is recommended to be logged into that account when installing this software.
CAUTION: mharc should not operate under a privileged user account, like root, since it may open up potential security vulnerabilities.
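One way to guard against accidentally running as root is a small check at the top of any wrapper script you write yourself. The sketch below is only an illustration: not_root is a made-up name, and mharc does not ship such a check as far as this document states.

```shell
# Succeed only when the given numeric user ID (default: the current
# effective user) is not 0, i.e. not root.
# not_root is a hypothetical helper, not part of mharc.
not_root() {
    uid=${1:-$(id -u)}
    [ "$uid" -ne 0 ]
}

# Example guard at the top of a hypothetical wrapper script:
#   not_root || { echo "refusing to run mharc as root" >&2; exit 1; }
```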
After you extract the tar bundle, run the following command:
prompt> make configure
./bin/apply-config -verbose
/bin/cp common.mrc.in.dist common.mrc.in
/bin/chmod u+w common.mrc.in
Processing "./bin/../dist/mharc/lib/common.mrc.in" ...
Then, you should edit lib/config.sh to suit your local settings and rerun
prompt> make configure
again to apply your changes.
NOTE: Make sure to review all variable settings in lib/config.sh. Correct values are critical for the archiving system to work properly.
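Before rerunning "make configure", it may help to confirm that your edits did not introduce a syntax error. The sketch below assumes lib/config.sh consists of plain Bourne shell variable assignments, which its name and the examples in this document suggest but do not guarantee. The function name check_config is made up for this illustration.

```shell
# Check that a settings file parses as Bourne shell without executing it.
# Assumption: lib/config.sh uses plain shell syntax.
# check_config is a hypothetical helper, not part of mharc.
check_config() {
    sh -n "$1"
}

# Example, from the installation root:
#   check_config lib/config.sh && make configure
```

A clean parse catches only syntax slips such as an unterminated quote; it says nothing about whether the values themselves are right.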
Now, edit lib/lists.def to define the mailing lists you want archived. Syntax of the file is documented in the mk-procmailrc manpage. After editing, run the following command:
prompt> make
This should generate a file called procmailrc.mharc that will do the initial filtering of mail. At any time, if you edit lib/lists.def, you can rerun “make” to regenerate the procmailrc.mharc file to reflect your changes.
The final step is to edit the archive user account's crontab to register the mail archiving scripts with cron in order to get automatic processing of your archives. The file etc/crontab can serve as a template of the crontab entries you should add. Generally, you can copy etc/crontab verbatim into the crontab for your archive user account. Otherwise, you can edit etc/crontab.in and run
prompt> make configure
to create an etc/crontab file suitable for copying into your real crontab.
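If the archive account already has crontab entries, installing etc/crontab directly with the crontab command would replace them. A minimal sketch of appending the template to the existing entries follows. The function name merge_crontab and the "# mharc entries" marker line are made up for this illustration.

```shell
# Print the current crontab text (read from stdin) followed by the
# entries in the template file given as $1.
# merge_crontab is a hypothetical helper, not part of mharc.
merge_crontab() {
    cat
    echo "# mharc entries"
    cat "$1"
}

# Example, run from the installation root as the archive user:
#   crontab -l 2>/dev/null | merge_crontab etc/crontab | crontab -
```

Running the example twice would add the entries twice, so review the result with "crontab -l" afterwards.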
Manual maintenance can be done via the provided Makefile. If you run the command
prompt> make help
Targets available:
  (default)     Generate procmailrc.mharc from ./lib/lists.def.
  configure:    Apply ./lib/config.sh settings.
  disable:      Disable automated processing of new messages.
  editidx:      Edit all mhonarc archive pages.
  editidxonly:  Edit only mhonarc archive index pages.
  editrootidx:  Edit only top period index pages.
  enable:       Enable automated processing of new messages.
  help:         This message.
  readmail:     Process mail spool.
  rebuild:      Rebuild archives from raw message data.
  rootidx:      Regenerated top index for archives.
  ...
you will get a summary of what targets are available. Targets exist to manually invoke the mail spool processing, to recreate the entire HTML archives, and to perform other administrative tasks.
Some administrative tasks will disable auto-message processing, and a message should be displayed when this happens. You can run:
prompt> make enable
=============================================================
!!! Auto-archive processing is ENABLED !!!
=============================================================
to re-enable auto-message processing.
The Perl scripts contained in mharc assume the perl executable is located at /usr/local/bin/perl. If perl is installed, but in a different location, you can create a symbolic link from /usr/local/bin/perl to the installed location of the perl executable. If you do not have permissions to do this, you will need to change the initial #! line in all the Perl scripts to reflect the location of perl.
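Changing the #! line by hand in every script is tedious, so the edit can be sketched as a shell function. The name fix_shebang is made up for this illustration, and the example assumes the Perl scripts live under bin/ and cgi-bin/ of the installation. Verify that against your own tree before running anything.

```shell
# Rewrite the perl path on the #! line of each given script.
# Usage: fix_shebang /path/to/perl script...
# fix_shebang is a hypothetical helper, not part of mharc.
fix_shebang() {
    perl_path=$1
    shift
    for script in "$@"; do
        # Skip files whose first line is not a direct perl #! line.
        head -n 1 "$script" | grep -q '^#![^ ]*perl' || continue
        tmp=$(mktemp) || return 1
        # Replace only the interpreter path on line 1; options such as -w stay.
        sed "1s|^#![^ ]*perl|#!$perl_path|" "$script" > "$tmp" &&
            cat "$tmp" > "$script"   # overwrite in place to keep file permissions
        rm -f "$tmp"
    done
}

# Example (directory names are an assumption about the layout):
#   fix_shebang "$(command -v perl)" bin/* cgi-bin/*
```

Because upgrades overwrite distributed files, an edit like this would likely need to be repeated after each upgrade. The symbolic link approach avoids that.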
mharc calls MHonArc through its library API. If you installed the MHonArc library files somewhere other than perl's site library location, make sure that directory is in perl's library search path. You may need to set the PERL5LIB environment variable to do this.
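A minimal sketch of setting PERL5LIB from a Bourne-type shell follows. The function name prepend_perl5lib and the directory /opt/mhonarc/lib are placeholders chosen for this illustration; substitute the directory where your MHonArc library files were actually installed.

```shell
# Put a directory at the front of perl's library search path.
# prepend_perl5lib is a hypothetical helper, not part of mharc.
prepend_perl5lib() {
    if [ -n "${PERL5LIB:-}" ]; then
        PERL5LIB="$1:$PERL5LIB"
    else
        PERL5LIB="$1"
    fi
    export PERL5LIB
}

# Example, with a placeholder MHonArc library directory:
#   prepend_perl5lib /opt/mhonarc/lib
#   perl -e 'print join("\n", @INC), "\n"'   # the directory should be listed
```

Jobs started by cron do not inherit your login shell's environment, so the variable would also need to be set somewhere the cron jobs can see it.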
NOTE: The “make configure” command mentioned earlier will automatically check if MHonArc can be loaded. If not, the command will generate an error message indicating what you can do to fix the problem.
Double check the URL to the namazu.cgi program. A useful tip is to copy the namazu.cgi program into the cgi-bin of the mharc installation.
The file etc/apache.conf provides sample configuration directives for the Apache HTTP server that may be useful. If the default settings are not sufficient for your needs, you can edit etc/apache.conf.in and run
prompt> make configure
to generate a new etc/apache.conf that can be used in your Apache server configuration files.
If you are on a system where you do not have access to the Apache server configuration, an etc/.htaccess file can be used to provide local configuration settings.
To use this file, copy the generated etc/.htaccess file to the root of the installation when “make configure” is done, or create a symlink to it by executing the following command from the installation root:
prompt> ln -s ./etc/.htaccess
This way, you do not have to re-copy each time you make changes to this file.
Make sure your HTTP server allows the execution of CGI programs that are denoted with the filename extension ".cgi", or specify that the cgi-bin directory of the mharc installation is a CGI executable directory.
NOTE: If you are using the .htaccess method to control access to your files, you may need to create a .htaccess file in the cgi-bin directory with the following setting:
Options +ExecCGI
Most of the files that control the appearance of the generated archive pages are themselves generated from template files with the extension ".in". It is recommended to edit the ".in" version of a file and execute the “make configure” command to apply your changes.
NOTE: You must run
prompt> make configure
to have mharc recognize any changes you made to a template file.
The ".in.dist" files are versions of the templates as defined by the base distribution. These will be overwritten when updating the software and mainly serve as a starting basis for your custom template files. If you ever want to revert back to the ".in.dist" version of a file, just remove the ".in" version and rerun
prompt> make configure
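If you would rather keep a copy of your customized template than delete it outright, the removal step can be sketched as below. The function name revert_template and the ".bak" suffix are made up for this illustration. Whether mharc ignores stray ".bak" files is not stated in this document, so move the copy elsewhere if in doubt.

```shell
# Move a customized ".in" template aside so that the next
# "make configure" recreates it from the matching ".in.dist" file.
# revert_template is a hypothetical helper, not part of mharc.
revert_template() {
    template=$1
    if [ ! -f "$template.dist" ]; then
        echo "no distribution copy: $template.dist" >&2
        return 1
    fi
    mv "$template" "$template.bak"
}

# Example, from the installation root:
#   revert_template lib/common.mrc.in && make configure
```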
The main MHonArc resource file is lib/common.mrc. To make changes, make edits to lib/common.mrc.in and run
prompt> make configure
to generate a new lib/common.mrc. You can use @@VARIABLE_NAME@@ references in lib/common.mrc.in to refer to variables defined in lib/config.sh. However, this is normally not required since the bin/web-archive program will pre-define various MHonArc resource variables that reflect settings in lib/config.sh. See the MHonArc documentation for more information on how to edit MHonArc resource files.
TIP: To help ease the maintenance of MHonArc resource settings, especially during mharc upgrades, set the MHA_RC variable in lib/config.sh to something like the following:
# Pathname to main MHonArc resource file.
MHA_RC=$SW_ROOT/lib/default.mrc
Then create the file $SW_ROOT/lib/default.mrc.in (note that the file name ends with a ".in" extension) with the following contents:
<Include>
@@SW_ROOT@@/lib/common.mrc
</Include>
<!-- ... customized settings here ... -->
And run
prompt> make configure
Anytime you want to make changes, make sure to edit $SW_ROOT/lib/default.mrc.in and rerun
prompt> make configure
Now, anytime you upgrade mharc, and mharc contains a new, improved lib/common.mrc.in.dist, and you want the new settings to be applied to your archives, you just need to remove lib/common.mrc.in and run
prompt> make configure
Your customized settings in lib/default.mrc are left untouched. Doing the above avoids any messy merging of changes from a new lib/common.mrc.in.dist into your customized version of lib/common.mrc.in.
The software is structured to avoid damaging an existing installation. All you need to do is extract the newer version of the bundle in the same location as the initial installation. All the ".in.dist" files will get overwritten, but any ".in" files should be left untouched in order to preserve any local edits.
TIP: If you ever want to use a new ".in.dist" version of a file, or revert back to one, just remove the ".in" version and rerun
prompt> make configure
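After an upgrade it can be useful to see which ".in" files carry local changes, since those are the ones that will not pick up improvements from the new ".in.dist" files. The following is a rough sketch. The function name list_custom_templates is made up for this illustration, and it relies only on the ".in" and ".in.dist" naming described in this document.

```shell
# List ".in" templates under a directory (default: current directory)
# that differ from, or have no, matching ".in.dist" file.
# list_custom_templates is a hypothetical helper, not part of mharc.
list_custom_templates() {
    find "${1:-.}" -type f -name '*.in' | while IFS= read -r template; do
        # A template without a ".dist" counterpart, or one that differs
        # from it, carries local changes.
        if [ ! -f "$template.dist" ] || ! cmp -s "$template" "$template.dist"; then
            printf '%s\n' "$template"
        fi
    done
}

# Example, from the installation root:
#   list_custom_templates .
```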
A program called mh-month-pack is provided with mharc that could be used for sites that already have an existing MH/nmh-based mail filtering setup (either done manually or automatically). This program can be used to import older mail into mharc or to serve as a replacement to the filter-spool step if you want to stick with using MH/nmh to handle incoming mail.
If you do this, you will have to modify etc/crontab.in to no longer use read-mail, but to call mh-month-pack (or some custom script that uses mh-month-pack) followed by a call to web-archive.
TIP: You may just want to create a variant version of read-mail that calls mh-month-pack instead of filter-spool. Make sure to give your version a different name than read-mail, because read-mail will get overwritten during mharc upgrades.
Most mailing list management software sends administrative messages to users. Examples are subscribe confirmations and subscribe reminders. This is a potential problem, since such messages could show up in the regular archives and expose sensitive information like subscription passwords.
One possible mharc-based solution is to create a special archive to capture such messages. For example, the following can be added to the very beginning of lib/lists.def:
Name: .listsadmin
Description: Lists Admin Messages
From-Address: majordomo@
From-Address: mailman-owner@
From-Address: .*-request@
From-Address: .*-help@
Final: 1
This must occur at the beginning of the file since the filtering rules are processed from top to bottom. Since the Final option is set, if any message matches, no further processing is performed.
Since .listsadmin starts with a dot, it will be hidden from the all-archives list. But since it is still possible for someone to reach it by entering its URL manually, it is best to restrict access to it via HTTP server configuration (remember to restrict both the raw and html archives).
Now, all you have to do is check the .listsadmin archive occasionally to see if anything important has been received, like subscription confirmations that need to be responded to.
If procmail is your local delivery agent, you can pre-filter all incoming mail before mharc ever sees it. You can create a .procmailrc file in the archive user's home directory and add rules that forward all list admin messages to a real person. The .procmailrc may look something like this:
:0
* (^From:(.*[^-a-zA-Z0-9_.])?(majordomo@|mailman-owner@|.*-request@|.*-help@))
! real-person@example.com
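Before relying on a recipe like this, you may want to try its pattern against a few sample From: lines. The sketch below approximates the procmail condition with "grep -E", using -i because procmail conditions are case-insensitive by default. The function name is_admin_sender is made up for this illustration, and grep's matching is close to procmail's but not guaranteed to be identical.

```shell
# Succeed if a single header line looks like it came from a list
# administration address, using the same pattern as the recipe.
# is_admin_sender is a hypothetical helper for testing the pattern only.
is_admin_sender() {
    printf '%s\n' "$1" |
        grep -Eiq '^From:(.*[^-a-zA-Z0-9_.])?(majordomo@|mailman-owner@|.*-request@|.*-help@)'
}

# Examples:
#   is_admin_sender 'From: Majordomo <majordomo@example.com>' && echo admin
#   is_admin_sender 'From: alice@example.com' || echo regular
```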
The archive search forms rely on some JavaScript to pass around the Namazu index name, since the namazu.cgi program currently does not provide any Namazu template variable for the index name. Hopefully, this limitation of Namazu will be removed in the future so that the use of JavaScript can be dropped.
If JavaScript is disabled, or not supported, in a web client, initial searches from an archive page will work, but trying to do another search from the results page will always return no hits.