I've been mulling over the problem of using marc-search to search
mhonarc archives that have been split into multiple directories. It
occurred to me that the File::Find module in the standard perl5
distribution is exactly what's needed to handle the task of recursively
building up a list of files to search.
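
To make the idea concrete, here is a rough sketch of the recursive file-list step. It is written in Python, with os.walk standing in for File::Find, so it is not marc-search code. The function name collect_message_files and the assumption that message files start with "msg" are mine, for illustration only.

```python
import os

def collect_message_files(root):
    """Return paths of files under root whose names start with 'msg'.

    Directories and files are visited in sorted name order, so the
    result order is predictable from the names alone.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting dirnames in place controls the order os.walk descends.
        dirnames.sort()
        for name in sorted(filenames):
            if name.startswith("msg"):  # assumed naming convention
                found.append(os.path.join(dirpath, name))
    return found
```

Calling collect_message_files on the top of an archive would return one flat list of message files, which could then be searched in order.
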
The problem is that I personally do not manage any archives that have
been broken down by months/years, and while I can (and will) certainly
create one (if only for testing purposes), it makes more sense to me
to inquire what kind of naming schemes people are already using.
My own inclination would be to give directories numerical names to
facilitate sorting, because File::Find works in such a way that it
will read directories and files in the order you'd get by typing
ls -1. That is, I'd lean toward something like this:
The reason the sort order is relevant is that it corresponds to the
order in which files are searched and results are presented. In a
single-directory archive, this is not a problem because mhonarc imposes strict,
immutable, and therefore predictable rules about filenames which imply
a date order: msg00001 precedes msg00002 and so forth.
Since humans are responsible for naming directories in
multiple-directory archives, however, this predictability evaporates, which is
why I'd like to get a sense of what people *generally* do.
In a multiple-directory archive, using alphabetical names for months
would pose a problem, since August would sort before June. And while using
only 96 and 97 for years is fine for now, it will present problems
when 00 and 01 surface. The scheme I've described above would work
technically, but it's not very readable to humans (is this a concern?)
and it may not be in widespread use.
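
The sorting concerns are easy to check with a few lines of code. The sketch below is in Python and compares strings the way a plain name sort would. All of the directory names are made up for illustration, and the four-digit-year, zero-padded-month layout is only one possible numeric scheme, assumed here for the sake of the example.

```python
# Hypothetical directory names, for illustration only.
month_names = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November",
               "December"]
two_digit_years = ["96", "97", "98", "99", "00", "01"]
numeric_dirs = ["1997/02", "1996/12", "2000/01", "1996/06", "1997/10"]

# Alphabetical order scrambles the calendar: August sorts before June.
by_name = sorted(month_names)

# Two-digit years work until the century rolls over: 00 and 01 sort first.
by_short_year = sorted(two_digit_years)

# Four-digit years with zero-padded months sort chronologically as strings.
by_numeric = sorted(numeric_dirs)

print(by_name)
print(by_short_year)
print(by_numeric)
```

The last list comes out in date order without any date parsing, which is the property that matters if search results follow the name order.
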
If anyone has thoughts/observations/experience to share on this subject,
I'd be interested in hearing from you, and I'm especially interested
in learning how mhonarc users with a single archive split over
multiple directories have approached the task of organization.
Private responses to friedman(_at_)uci(_dot_)edu are probably the most
appropriate way to go, though I will post a summary of the discussion
if there is one.
Eric D. Friedman