mharc-users

Re: Mharc 101 ...

2003-01-05 13:13:15
Hi Earl,

After sending my mail I kept searching and found that the .catch dir
(misspelled as .cache in my original mail) is the main source of the
problem.

The issue is that, as I said before, my raw mail files are already
sorted by list (and in fact by many other criteria), so I don't need to
process them with procmail again. The only way that could work would be
some sort of translator from a Netscape filter file to lists.def.

So in my case lists.def is empty, and I was hoping to get <list-name-x>
taken directly from the name of the raw file mharc is parsing.

> I assume when you mean sorted, the mail is broken up into YYYY-MM chunks?

Well, it would be great to put it in by-date dirs. (So far, using just
mhonarc, I had it all in one big folder.)

>> that each of my raw mail data file is seen as 1 mail msg.
>
> This could be a MSGSEP resource issue.  Without knowing the specifics
> of the steps you are doing and without seeing the input data, it
> is just speculation at this point.

Most likely it is one msg because my lists.def is empty :). I was
thinking that you use default mhonarc as the first phase, and that it
would cut those raw files into separate html files, ignoring any
procmail processing of the raw file.

Thx,
R.

PS. Ideally, the only piece I really need out of mharc (which looks
really nice) would be to auto-generate index.html from already existing
mhonarc output files, and also to generate namazu indexes :) Is this
possible out of the box somehow, or should I start thinking about my
own?




Earl Hood wrote:

On January 5, 2003 at 16:53, Robert Raszuk wrote:

All I am trying to use mharc for is to turn (one time only or maybe once
per month) 2.5 GB of raw netscape mail (mhonarc supports raw netscape
mail) into an indexed and searchable html archive. Those raw files are
already sorted, so I don't really need any procmail action or lists
file.

I assume when you mean sorted, the mail is broken up into YYYY-MM chunks?

All possible combinations with make rebuild failed. I got to some point
of generating a whole bunch of date files in .cache folders but it seems

I do not understand this statement and what ".cache" folders are.
mharc does not generate any ".cache" folders.

that each of my raw mail data file is seen as 1 mail msg.

This could be a MSGSEP resource issue.  Without knowing the specifics
of the steps you are doing and without seeing the input data, it
is just speculation at this point.
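One quick way to narrow this down before touching any resource settings is to count the separator lines in a raw file. The sketch below is only an illustration: the helper name `count_separators` is made up here, and the `^From ` pattern is an assumption based on the usual mbox convention, not a statement of what MSGSEP is set to in any given installation.

```python
import re


def count_separators(lines, pattern=r"^From "):
    """Count lines that look like mbox message separators.

    `pattern` is an assumed separator regex (the conventional mbox
    "From " line); adjust it to whatever your MSGSEP resource uses.
    """
    sep = re.compile(pattern)
    return sum(1 for line in lines if sep.match(line))


def count_in_file(path, pattern=r"^From "):
    """Same count, read from a file on disk."""
    # errors="replace" keeps odd bytes in old mail from aborting the scan.
    with open(path, "r", errors="replace") as handle:
        return count_separators(handle, pattern)
```

If a file that holds many messages reports a count of 0 or 1, the separator pattern does not match the data, which would fit the "one file seen as one message" symptom.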

Also is there any description what files should
I expect to find in each folder cgi-bin (it was missing namazu.cgi
without any install errors),

Install does not fail on missing files like namazu.cgi since the file
may exist on the system, but install.pl could not find it.  Also,
mharc can be used with no searching capabilities.  A proper warning
is generated by install.pl for any files it does not find.

html, what raw file structure/formats are
supported under mbox etc ...

Under mbox/ the structure is:

  <mharc-root>/mbox/
      <list-name-1>/
          YYYY-MM
          YYYY-MM
          ...
      <list-name-2>/
          YYYY-MM
          YYYY-MM
          ...
      ...
      <list-name-N>/
          YYYY-MM
          YYYY-MM
          ...

Where YYYY-MM represents the period mailbox files for each list.  For
example, the mharc-users archive has:

  <mharc-root>/mbox/
      mharc-users/
          2002-07
          2002-08
          2002-09
          2002-10
          2002-11
          2002-12
          2003-01
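As a rough illustration of this layout, the sketch below splits one existing mbox-style file into YYYY-MM files under a list directory, using Python's standard `mailbox` module. It is not mharc's own mbox-month-pack script; the function names, the "undated" fallback file, and the assumption that Python's mbox reader can parse the input (for example Netscape files with "From - " separator lines) are all choices made for this example.

```python
import mailbox
import os
from email.utils import parsedate_to_datetime


def month_key(date_header):
    """Return 'YYYY-MM' for a Date header value, or 'undated'."""
    if not date_header:
        return "undated"
    try:
        parsed = parsedate_to_datetime(date_header)
    except (TypeError, ValueError):
        # Unparseable dates go to a catch-all file (an assumption
        # of this sketch, not something mharc defines).
        return "undated"
    return "%04d-%02d" % (parsed.year, parsed.month)


def split_by_month(src_path, list_dir):
    """Append each message in src_path to list_dir/YYYY-MM.

    Returns a dict mapping each period name to the number of
    messages written to it.
    """
    os.makedirs(list_dir, exist_ok=True)
    outputs = {}
    counts = {}
    src = mailbox.mbox(src_path, create=False)
    try:
        for msg in src:
            date = msg["Date"]
            key = month_key(str(date) if date is not None else None)
            if key not in outputs:
                outputs[key] = mailbox.mbox(os.path.join(list_dir, key))
            outputs[key].add(msg)
            counts[key] = counts.get(key, 0) + 1
    finally:
        for out in outputs.values():
            out.close()  # close() flushes pending messages to disk
        src.close()
    return counts
```

A call such as `split_by_month("Inbox", "mbox/my-list")` (both paths hypothetical) would leave one mailbox file per month in the target directory.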

One trick to importing existing mailbox data is to copy the mailbox
to

  <mharc-root>/.newmail

after you have properly defined <mharc-root>/lib/lists.def.

Then run:

  make readmail

If the mailbox is large, it may take some time.  You may want to
initially try it out with a small subset of the mailbox data first
to make sure things work as expected before processing all the data.
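A small trial mailbox can be produced with a few lines of Python. This is a minimal sketch, assuming the input is something Python's `mailbox.mbox` reader can parse; the function name `copy_first_messages` and the default of 50 messages are illustrative choices, not part of mharc.

```python
import mailbox
from itertools import islice


def copy_first_messages(src_path, dest_path, limit=50):
    """Copy the first `limit` messages of an mbox file to dest_path.

    Returns the number of messages actually copied.
    """
    src = mailbox.mbox(src_path, create=False)
    dest = mailbox.mbox(dest_path)  # created if it does not exist
    copied = 0
    try:
        for msg in islice(src, limit):
            dest.add(msg)
            copied += 1
    finally:
        dest.close()  # close() flushes pending messages to disk
        src.close()
    return copied
```

The resulting file can then be used as the small test mailbox for the steps above before the full data set is processed.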

Another option is to use the mbox-month-pack script.

--ewh

---------------------------------------------------------------------
To sign-off this list, send email to majordomo(_at_)mhonarc(_dot_)org with the
message text UNSUBSCRIBE MHARC-USERS