procmail

formail greediness & digests

1997-07-18 17:41:00
I've been working on processing the digests I get from
various mailing lists, & I'm running into some problems.
One is minor, tho annoying; the other is really critical.

Critical one first.   I know this must be boring, because
I have seen this subject come up a few times in the past few
months (without my understanding why it was a problem til now).
But I'm stuck, & don't see how to get the existing tools to do
the job.

I've been using formail to burst digests, like so:

  FOLDER=whatever
  :0 wc
  | formail +1 -d -s procmail -m FOLDER=$FOLDER $HOME/.proc.digest

(Why the c flag?  I sort & do other things after bursting; I'm using
MH-style folders.)

formail is WAY too greedy, tho.  It seems to split everything in
the digest into its minutest parts.  This is ok around 95% of the
time, but when someone forwards another message inside their own,
when their mailer constructs a MIME conglomeration, or possibly even
when a correspondent puts the wrong magic symbols in their message,
the digest folder ends up with disconnected garbage in it.  This
disconnection is of course exacerbated by the subsequent post-processing
& sorting, which can destroy the ordering that links these broken
pieces.  Often the message fragments wind up in my UNIX mailbox
(from "foo@bar") when formail or procmail gets confused by them,
& sometimes they are discarded altogether if the post-processing
includes, e.g., some spam filtering.  Boo!

In the past I've used the old mh burst, but that's junk.   It can easily
get confused by strings of "--" in messages & also make a mess.  It's
also useless for some of the digests that exist today, such as those
created by listproc or others.

I don't remember the recommendations made about this problem with formail
too well, but it seems to me they came down to playing with the -d & -s
options.  I can't see a way to make a general solution, tho.  

What can I do?  Any ideas?  Basically, I think I want a digest processor
that splits only at the top level of the digest and goes no deeper.
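
To make "one level only" concrete, here is a rough sketch in Python (not
formail, and not something I have run against real digests).  It splits
only on lines that consist entirely of the digest separator and never looks
at anything else, so forwarded headers or MIME parts inside a message stay
put.  The function name burst_one_level and the default separator of exactly
30 hyphens are assumptions for illustration; other digest formats would need
a different pattern.

```python
import re


def burst_one_level(digest_text, separator=r"-{30}"):
    """Split a digest on top-level separator lines only.

    `separator` is a regex that must match a whole line.  The default
    (exactly 30 hyphens) is an assumption; pass another pattern for
    digests that use a different separator.
    """
    sep = re.compile(separator)
    chunks, current = [], []
    for line in digest_text.splitlines():
        if sep.fullmatch(line.rstrip()):
            # A separator line closes the current chunk; nothing else does.
            chunks.append(current)
            current = []
        else:
            current.append(line)
    chunks.append(current)

    messages = []
    for chunk in chunks:
        # Trim the blank lines that surround each separator.
        text = "\n".join(chunk).strip("\n")
        if text:
            messages.append(text)
    return messages
```

The first chunk returned is the digest preamble and the last is the trailer,
so a caller would probably drop both, much as +1 skips the first piece in the
recipe above.  A message that itself contains a line of exactly 30 hyphens
would still be split, so this narrows the problem rather than solving it.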

Second problem, cosmetic.  Using formail to burst digests has a side effect
on the message contents: a digest "stain" is left behind (unlike with, say,
burst).  Each individual message ends with the leftover separator line,
like this (bye-bye, digest readers):

All the best

Jim C

------------------------------


or



----__ListProc__NextPart____FNORD-STUDIES__digest_272

Is there a way to persuade formail to leave these out?   Using sed to
strip them out of the bodies is another possibility, but sed is simple
& doesn't know to look only at the end (but *not* necessarily the last line)
of messages.   On the other hand, leaving this adulteration in messages
opens up the possibility that one's reply to the digest will cause a problem
in the next round of digest bursting.  Yes, I've had the pleasure.
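
For illustration, the "look only at the end" behaviour that plain sed lacks
could be sketched like this in Python.  It walks backwards from the end of a
message, dropping blank lines and separator lines, and stops at the first
line of real text, so a separator in the middle of a body is left alone.
The name strip_digest_stain and both separator patterns are assumptions; the
ListProc pattern in particular is guessed from the single example above.

```python
import re


def strip_digest_stain(message_text,
                       separator=r"-{30}|----__ListProc__NextPart__\S*"):
    """Remove a trailing digest separator (and surrounding blank lines).

    Only the tail of the message is examined.  If no separator is found
    there, the message is returned unchanged.
    """
    sep = re.compile(separator)
    lines = message_text.splitlines()
    end = len(lines)
    seen_separator = False
    while end > 0:
        line = lines[end - 1].strip()
        if line == "":
            end -= 1               # trailing blank line
        elif sep.fullmatch(line):
            seen_separator = True  # the stain itself
            end -= 1
        else:
            break                  # first line of real text: stop here
    if not seen_separator:
        return message_text
    return "\n".join(lines[:end]) + "\n"
```

Something like this could be run as a filter on each message after bursting,
before any sorting or filing.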



