Re: [nmh-workers] INCing of email archives

2019-07-25 18:25:36
> Once in a while I download email archives of some mailing list
> and unpack them using "inc -file <archive-file>". But more
> than once I have seen that inc gets confused and doesn't
> unpack the whole thing. The cause seems to be a line starting
> with From in some message body. Ideally inc should check that
> a "From ..." line is immediately followed by header lines.
> And if this is not the case, assume it is in the message body.

Ralph answered this, but let me expand a bit.

The job of inc(1) is to incorporate messages from a 'mail drop' into your
MH mailbox.  Traditionally it handles mbox-style files and POP (it also
does MMDF, but let us not speak of that).

As you can see from the Wikipedia entry Ralph linked to, all of the
various mbox formats use the same scheme: a line beginning with "From "
is the mailbox delimiter (mboxcl and mboxcl2 use a Content-Length
header; I believe they are officially dead at this point).  The big
differences are in the quoting rules.  Unfortunately, since inc(1) at
least is more or less locked in to the mbox format, changing how it
recognizes delimiters would have some nasty consequences (Ralph gave you
an example of a message that it would break on, but I am sure there are
others).  I think your best bet is to preprocess these mailing list
archives so they are valid mbox files.
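
As a rough illustration of that kind of preprocessing, here is a minimal
Python sketch.  It is not part of nmh; the function name and the header
regex are my own assumptions.  It treats a "From " line as a delimiter
only if it starts the file or follows a blank line and the next line
looks like a header, and it prefixes every other "From " line with ">".
This is only a heuristic, and some archives will still fool it (for
example, a body line after "From " that happens to look like a header).

```python
import re

# Assumption: a header line looks like "Field-Name:" made of printable
# ASCII characters other than space and colon.
HEADER_RE = re.compile(r"^[!-9;-~]+:")


def quote_body_from_lines(lines):
    """Return a copy of lines with non-delimiter "From " lines quoted.

    A "From " line is kept as a delimiter only when it is the first line
    or follows a blank line, and the line after it looks like a header.
    Any other line starting with "From " gets a ">" prefix.
    """
    out = []
    for i, line in enumerate(lines):
        if line.startswith("From "):
            prev_blank = i == 0 or lines[i - 1].strip() == ""
            next_is_header = (
                i + 1 < len(lines)
                and HEADER_RE.match(lines[i + 1]) is not None
            )
            if not (prev_blank and next_is_header):
                line = ">" + line
        out.append(line)
    return out
```

To use it, read the archive into a list of lines, pass the list through
quote_body_from_lines(), write the result to a new file, and run
"inc -file" on that.  Opening both files with encoding="latin-1" keeps
non-UTF-8 bytes intact on the round trip.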

--Ken

-- 
nmh-workers
https://lists.nongnu.org/mailman/listinfo/nmh-workers