mhonarc-users

Old listserv archives won't convert

2000-05-01 15:36:10

I'm trying to use MHonArc to convert a bunch of old
Listserv archives to a Web-based format. Working backwards
through the years, I've had really good luck, but I've just
hit a stumbling block: early 1996 and earlier. From
1996 to 1999, the Listserv we used was set to archive weekly,
but prior to that it was monthly. I have no way of knowing
what else changed at that time, but the weekly archives
parse fine, while the older ones take forever and lump
ALL the subjects into the SUBJECTNA variable. (If I process
only a couple of thousand lines, I get the massive SUBJECTNA
line; with more than that, MHonArc just takes over the machine
until I kill it.)

I'm pre-processing the archives with a sed script to insert
message separators (and using the exact same script on the
weekly and monthly archives), but it's choking somewhere.
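
One quick sanity check on the pre-processed files is to count the
separator lines and see how many lines fall under each message. The
Python sketch below is only an illustration: the function name is
made up, and it assumes a separator is any line beginning with
"From " (as in the samples that follow), which may not match the
separator pattern MHonArc is actually configured with.

```python
import re

# Assumption: a message separator is any line starting with "From ",
# like the "From ???@??? Sun Jan 00 00:00:00 0000" lines in the samples.
SEPARATOR = re.compile(r"^From ")

def message_sizes(lines):
    """Return the number of lines in each message, split on separator lines."""
    sizes = []
    for line in lines:
        if SEPARATOR.match(line):
            sizes.append(0)      # a new message starts here
        elif sizes:
            sizes[-1] += 1       # line belongs to the current message
        # lines before the first separator are ignored
    return sizes

if __name__ == "__main__":
    import sys
    with open(sys.argv[1], errors="replace") as handle:
        sizes = message_sizes(handle)
    print(len(sizes), "messages; largest has", max(sizes, default=0), "lines")
```

If a monthly file reports far fewer messages than expected, or one
enormous message, the separators are probably not landing where they
should.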

Here's the beginning of one of the problematic archives:

<START PROBLEM>
[ejray(_at_)www TEMP01124535]$ more TECHWR-L.LOG9512
From ???(_at_)??? Sun Jan 00 00:00:00 0000
Date:         Fri, 1 Dec 1995 16:07:00 +1100
Reply-To:     "Colleen Dancer (02) 333-1862" 
<DANCER(_dot_)COLLEEN(_at_)A2(_dot_)ABC(_dot_)NET(_dot_)AU>
Sender:       "Technical Writers List; for all Technical Communication issues"
              <TECHWR-L(_at_)OSUVM1(_dot_)BITNET>
From:         "Colleen Dancer (02) 333-1862" 
<DANCER(_dot_)COLLEEN(_at_)A2(_dot_)ABC(_dot_)NET(_dot_)AU>
Subject:      Re: "Proper use of commas in England?"

I agree that in Australia like England we don't use the serial comma
unless necessary to remove ambiguity.  However what will annoy your
audience far far more is the use of American spelling.  I know I detest
it in manuals that I buy in Australia.  I feel that if the product is
going to be sold in Australia / England they can use the Queen's
English. I would suggest that you can probably use your discretion for
the comma, but DEFINITELY use the correct spelling for the audience.

</START PROBLEM>

<START FUNCTIONAL>
[ejray(_at_)www TEMP30053907]$ more *B
From ???(_at_)??? Sun Jan 00 00:00:00 0000
Date:         Sat, 13 Jan 1996 18:59:51 -0500
Reply-To:     GFHayhoe(_at_)AOL(_dot_)COM
Sender:       "Technical Writers List; for all Technical Communication issues"
              <TECHWR-L(_at_)LISTSERV(_dot_)OKSTATE(_dot_)EDU>
From:         GFHayhoe(_at_)AOL(_dot_)COM
Subject:      Tip of the Day

Tiffany Haley asked whether the Microsoft Tip of the Day concept is
copyrighted. I believe it's part of the new Microsoft Office "look and feel"
that Microsoft is trying to promote throughout Windows products. Delrina's
new WinFax Pro for Windows 95 utilizes the Tip of the Day just like
Microsoft's suite.

--George Hayhoe (GFHayhoe(_at_)aol(_dot_)com)
From ???(_at_)??? Sun Jan 00 00:00:00 0000
Date:         Sat, 13 Jan 1996 18:59:55 -0500
Reply-To:     GFHayhoe(_at_)AOL(_dot_)COM
Sender:       "Technical Writers List; for all Technical Communication issues"
              <TECHWR-L(_at_)LISTSERV(_dot_)OKSTATE(_dot_)EDU>
From:         GFHayhoe(_at_)AOL(_dot_)COM
Subject:      Alan Cooper's _About_Face_

</START FUNCTIONAL>

As far as I can tell, they're identical in the significant ways. The Sender
is a .bitnet address in the non-functional ones, but ...
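
One difference that can be checked mechanically is header folding. As
quoted above, the problem sample has the address after Reply-To: and
From: starting at column 0, while the wrapped Sender: line is indented.
That may just be an artifact of pasting into this message, but a
wrapped header line normally needs leading whitespace to count as a
continuation. The Python sketch below (a hypothetical helper, assuming
"From " separators and a blank line ending each header block) lists
header lines that are neither a "Name:" field nor an indented
continuation.

```python
import re

# A header field name: printable ASCII except space and ":", then a colon.
FIELD = re.compile(r"^[!-9;-~]+:")

def bad_header_lines(lines):
    """Return (line_number, text) for header lines that are neither
    a field nor an indented continuation line."""
    in_header = False
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if text.startswith("From "):
            in_header = True          # separator: a header block follows
        elif in_header and text == "":
            in_header = False         # blank line ends the header block
        elif in_header and not (FIELD.match(text) or text[0] in " \t"):
            bad.append((number, text))
    return bad

if __name__ == "__main__":
    import sys
    with open(sys.argv[1], errors="replace") as handle:
        for number, text in bad_header_lines(handle):
            print(f"{number}: {text}")
```

Running it on one weekly and one monthly file and comparing the output
should show whether the monthly archives have header lines that the
weekly ones do not.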

Any suggestions for troubleshooting this?

Version info:
MHonArc v2.4.5 (Perl 5.00503)
Linux www.raycomm.com 2.2.13-7mdk #1 Wed Sep 15 18:02:18 CEST 1999 i586 unknown

Any help would be appreciated!

Eric
ejray(_at_)raycomm(_dot_)com
