procmail

Re: inconsistent routine to grab snippet of email

2003-01-14 10:41:52
mark david mcCreary <mdm(_at_)internet-tools(_dot_)com>, who seems to
eschew capital letters, wrote:

I could use some insight and tips into the following (mostly working)
snippet of code.  It's designed to find email messages that have only a
period on a line in the body of the message.

After finding that, it also shows the surrounding text with a grep -C4
command.  However, that same grep -C4 recipe also bombs out sometimes, and
while the error message says the rescue succeeded, in fact the recipe dies.

The recipe works most of the time.  My first guess is that tmp.request is
being overwritten by another simultaneous email being processed.

Does anybody have any insights on the error message, or a better way of
accomplishing this, so that the grep is not done on a file, but rather on
the mail buffer within procmail?

It's easier and "cheaper" to do this inside of procmail rather than
resorting to grep.  If you really do want to use grep, well, the
approach and the flags can be improved.  But first, what is $grep
set to?  Unless you have a special reason not to, it is preferable
to let the PATH compiled into procmail find your executables for
you.

For a task where you don't want regex interpretation, the
preferred grep is fgrep.  Moreover, your

        grep -C4 "^\.$" 

is better handled with flags to grep:

        fgrep -x -C4 .

or maybe even

        fgrep -x -C4 -e .
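
To try those flags out away from procmail, here is a small sketch.  The
sample text and the function names are made up for illustration, and I've
spelled fgrep as "grep -F", which is the same thing on current systems.
Context is cut to one line to keep the output short.

```shell
# Hypothetical sample body: ".x" must not match, the lone "." must.
sample_body() {
    printf '%s\n' 'first line' '.x' 'before' '.' 'after' 'last line'
}

# -F  treat the pattern as a fixed string, not a regex
# -x  match only if the whole line equals the pattern
# -e  mark the next argument as the pattern
# -C1 one line of context on each side (the recipe used -C4)
dot_context() {
    grep -F -x -C1 -e .
}

sample_body | dot_context
```

That should print just "before", the dot, and "after"; the ".x" line is
left alone because -x insists on a whole-line match.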


You certainly don't want to use a generic tempfile name such as 
your "tmp.request".  At the least, call it "$$.tmp.request", so
that other procmail iterations won't run over it.
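
If you do keep a temp file, the naming idea looks roughly like this in
shell.  This is only a sketch: the function names are hypothetical, and
putting the file under ${TMPDIR:-/tmp} is my assumption, not something
from your recipe.  Where mktemp is available it is the sturdier choice.

```shell
# Build a per-process file name.  $$ is the shell's process ID, so two
# deliveries running at once get two different names.
request_file() {
    printf '%s/%s.tmp.request\n' "${TMPDIR:-/tmp}" "$$"
}

# Save stdin to the per-process file, grep it, then clean up.
grep_saved_copy() {
    f=$(request_file)
    cat > "$f"
    grep -F -x -C4 -e . "$f"
    status=$?
    rm -f "$f"
    return "$status"
}
```

Feed it a message on stdin, for example
"grep_saved_copy < some.message", where some.message is whatever file
you have handy.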


:0 fhw HB
* ! ^X-Diagnostic:
* ^Content-type:.*text/plain
* ^\.$
| $formail -A "X-Diagnostic: message had line with only period"


Are you wanting to check the headers, too, for just a dot?  It
looks to me like that's what's happening here.

How about:

        :0 fhw
        * ! ^X-Diagnostic:
        * ^Content-type:.*text/plain
        * B ?? ^\.$
        | formail -A "X-Diagnostic: message had line with only period"
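
What the "B ??" restriction buys you can be mimicked in shell: throw away
everything up to the first blank line, then look for the dot.  A rough
sketch, with hypothetical function names, assuming a well-formed message
(header, blank line, body):

```shell
# Print only the body: delete line 1 through the first empty line.
message_body() {
    sed '1,/^$/d'
}

# Succeed if the body contains a line that is nothing but a dot.
# -q keeps grep quiet; only the exit status matters here.
body_has_lone_dot() {
    message_body | grep -F -x -q -e .
}
```

A lone dot that sits above the blank line is ignored, which is the
difference from testing head and body together.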


By the way, what if the file has more than one line with only a dot?
Do you want what's around all of them, or just the first?


:0 afhw
| $cat - ;  $cat error_line_with_only_dot.txt

I don't know what you're up to here, but if you're getting errors
with cat, then you must be looking at binary files, right?  Which is
weird, considering this is mail and all.  But if there are control
chars in there, I think you'd want the `v' flag for cat.  I still
don't see the point of all this, though.  See below.


:0 afhw
| $cat - ;  $grep -C4 "^\.$" tmp.request


Why did you save to a file in the first place?  And where did
you do that?  It's not in the part you submitted.  But why not
just forget the temp file and use

        :0 Ac  # Chain onto the X-Diag recipe with the `A' flag
        | fgrep -x -C4 -e .


Back to doing it all in procmail:


        :0 fwh
        * your preconditions here
        * B  ??  ^\.$
        * HB ??  ^\/.*$.*$\.$.*$.*$.*
        | formail -A "X-Diagnostic: message had line with only period"

        :0 A
        {
            LOG = "
                $MATCH
                "
        }


I look for the dot only in the body; but then pull in the head when
doing the pseudo-"grep" so that I don't have to monkey with multiple
attempts in case of insufficient lines at the start of the body to
have a match before the dot.  If I use the header, too, in the display,
there should always be enough lines to catch without resorting to
funny constructs with conditional parentheticals.

Of course, if you want to see text surrounding *all* single dots
in the message, then we have a tougher problem outside of using
fgrep.
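
For comparison, grep handles the all-dots case with no extra work, since
it prints context around every matching line.  A small sketch with
invented sample text and hypothetical function names (context cut to one
line to keep it short):

```shell
# Hypothetical body with two dot-only lines, far enough apart that
# their one-line contexts do not overlap.
two_dot_sample() {
    printf '%s\n' l1 . l3 l4 l5 l6 . l8
}

# Print one line of context around every line that is exactly ".".
all_dot_contexts() {
    grep -F -x -C1 -e .
}

two_dot_sample | all_dot_contexts
```

Both dots come out with their neighbours, and the lines in between
(l4 and l5 here) are skipped.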

-- 
dman


_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
