procmail

Re: help trimming Received headers?

1997-11-05 16:08:00
wwgrol(_at_)sparc01(_dot_)fw(_dot_)hac(_dot_)com (W. Wesley Groleau x4923) 
writes:
I received several good suggestions that almost did it, but now I realize
I was not clear in my description.  Here is a sample set:

A> From procmail-request(_at_)Informatik(_dot_)RWTH-Aachen(_dot_)DE  Wed Nov  5 10:04:40 1997
B> Return-Path: <procmail-request(_at_)Informatik(_dot_)RWTH-Aachen(_dot_)DE>
C>     id KAA18509; Wed, 5 Nov 1997 10:04:37 -0500
D> Received: by most.fw.hac.com (4.1/SMI-4.1)
E>     id AA18011; Wed, 5 Nov 97 10:04:38 EST
F> Received: from gw1.fw.hac.com(151.168.254.46) by most via smap (V1.5khhunt)
G>     id sma017991; Wed Nov  5 10:04:03 1997
H> Received: by  gw1.fw.hac.com (4.1/SMI-4.1)
I>     id AA12888; Wed, 5 Nov 97 10:03:58 EST
J> Received: from campino.informatik.rwth-aachen.de(137.226.116.240) by gw1.hughes-defense-comm.com via smap (3.2)
K>     id xma012881; Wed, 5 Nov 97 10:03:46 -0500
L> Received: (from lists(_at_)localhost) by Campino.Informatik.RWTH-Aachen.DE (8.8.7/RBI-Z13) id QAA27449 for wwgrol(_at_)sparc01(_dot_)fw(_dot_)hac(_dot_)com; Wed, 5 Nov 1997 16:03:14 +0100 (MET)
M> Resent-Date: Wed, 5 Nov 1997 16:03:14 +0100 (MET)
N> X-Authentication-Warning: campino.informatik.rwth-aachen.de: lists set sender to procmail-request(_at_)informatik(_dot_)rwth-aachen(_dot_)de using -f
O> Message-Id: <m0xT6L9-000k0oC(_at_)miso(_dot_)wwa(_dot_)com>
P> From: dattier(_at_)wwa(_dot_)com (David W. Tamkin)
Q> Subject: recipe flag sequence
R> To: procmail(_at_)Informatik(_dot_)RWTH-Aachen(_dot_)DE (Procmail Mailing List)
S> Date: Wed, 5 Nov 1997 08:21:51 -0600 (CST)
T> In-Reply-To: <tbhg9rlex0(_dot_)fsf(_at_)pegasus(_dot_)tele(_dot_)nokia(_dot_)fi> from "jari(_dot_)aalto(_at_)poboxes.com" at Nov 5, 97 02:10:35 pm
U> X-Mailer: ELM [version 2.4 PL23]
V> Et-cetera:

DE may vary, but it will always be at the top.  Unfortunately, the host
may vary, AND will not always be fully qualified.

FG and HI also may vary, and there may be a lot more of them.

JK is the first one I want to KEEP.  DE through HI I want to discard.

There may be one or more at position L which I want to KEEP, and which
might not mention a host in our domain.

And as someone else mentioned, it would be nice to drop those (if any)
that appear _after_ P.  But don't forget the ':' lest we drop any after A.

Okay, let me try this _one_more_time_ (having blown it twice): you want
to keep the _last_ Received: header that matches some regexp, say,
/ by ([-a-z0-9]+\.)*(hughes-defense-comm|hac)\.com([^-a-z0-9.]|$)/,
plus Received: headers that don't match that nasty regexp, plus any
other headers.  Okay, here we go:

strip-local-received.pl:
#!/usr/local/bin/perl

while(<>) {
    if (/^(\S|$)/) {
        if ($h =~ /^Received:/) {
            if($h =~ /[ ]by[ ]          # Does the " by " clause of the
                                        # Received: header contain
                        ([-a-z0-9]+\.)* # some hostname parts (optional)
                        (hughes-defense-comm|hac)\.com
                                        # then one of the local domains
                        ([^-a-z0-9.]|$) # then something to indicate that
                                        # the above was the entire FQDN
                                        # and not part of something else?
                     /x) {      # This is a perl5 extended regexp
                # Yes: store this as the current "last one seen"
                $last = $h;
            } else {
                # Okay, it wasn't a local one.  Print the last local one
                # we saw, then this one.
                print $last, $h;

                # clear the last holder for the exit.
                undef $last;
            }
        } else {
            # Not a Received: header, so print it.
            print $h;
        }
        # store the current line (which is the start of _another_ header)
        # in the 'header accumulator', $h
        $h = $_;
    } else {
        # header continuation: add it to the growing header and start over.
        $h .= $_;
    }
}
# Print any buffered Received: header (in case the message was completely
# local), then the final header (which is actually just a blank line).
print $last, $h;
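
The regexp can also be checked on its own before trusting it with real mail.
The sketch below runs it against a few " by " clauses taken from the sample
set above. The helper name is_local is hypothetical, introduced only for this
check, and the regexp is written without /x so it fits on one line. On these
samples, an unqualified host ("by most") and a doubled space after "by" both
fall through as non-local, so adjust the pattern if your headers look like
that.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the " by " clause of a Received:
# header names a host in one of the local domains, 0 otherwise.
sub is_local {
    my ($header) = @_;
    return $header =~
        / by ([-a-z0-9]+\.)*(hughes-defense-comm|hac)\.com([^-a-z0-9.]|$)/
        ? 1 : 0;
}

# " by " clauses taken from sample lines D, J, L, F and H above.
my @samples = (
    'Received: by most.fw.hac.com (4.1/SMI-4.1)',
    'Received: from campino.informatik.rwth-aachen.de(137.226.116.240) by gw1.hughes-defense-comm.com via smap (3.2)',
    'Received: (from lists@localhost) by Campino.Informatik.RWTH-Aachen.DE (8.8.7/RBI-Z13)',
    'Received: from gw1.fw.hac.com(151.168.254.46) by most via smap (V1.5khhunt)',
    'Received: by  gw1.fw.hac.com (4.1/SMI-4.1)',
);

# Print one verdict per sample header.
for my $s (@samples) {
    printf "%-5s %s\n", (is_local($s) ? 'local' : 'other'), $s;
}
```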


Then the procmail recipe would be:

        :0 fhw
        |strip-local-received.pl


Depending on how much of your mail has the umpteen local Received: lines,
you may want to condition that recipe on there being at least one line
to remove, as David Tamkin suggested in another message.
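
To try the filter without going through procmail, the same loop can be
packaged as a function and fed a hand-made header. This is only a sketch:
the function name strip_local_received, the "From sender" line, and the
example.org hosts are made up for illustration, and empty strings stand in
for the undefined values so that it runs cleanly under warnings.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical wrapper around the same loop: takes the header lines and
# returns the filtered header as a single string.
sub strip_local_received {
    my @lines = @_;
    my ($h, $last, $out) = ('', '', '');
    for (@lines) {
        if (/^(\S|$)/) {
            # A new header starts here (or this is the blank line that
            # ends the header), so deal with the header collected so far.
            if ($h =~ /^Received:/) {
                if ($h =~ / by ([-a-z0-9]+\.)*(hughes-defense-comm|hac)\.com([^-a-z0-9.]|$)/) {
                    $last = $h;          # remember only the latest local one
                } else {
                    $out .= $last . $h;  # emit the kept local one, then this
                    $last = '';
                }
            } else {
                $out .= $h;              # not a Received: header, keep it
            }
            $h = $_;
        } else {
            $h .= $_;                    # continuation line
        }
    }
    # Any still-buffered local Received: header, then the final blank line.
    return $out . $last . $h;
}

# Made-up header: three local hops, one outside hop, one other header.
my @header = (
    "From sender Wed Nov  5 10:04:40 1997\n",
    "Received: by most.fw.hac.com (4.1/SMI-4.1)\n",
    "    id AA18011; Wed, 5 Nov 97 10:04:38 EST\n",
    "Received: by gw1.fw.hac.com (4.1/SMI-4.1)\n",
    "    id AA12888; Wed, 5 Nov 97 10:03:58 EST\n",
    "Received: from example.org by gw1.hughes-defense-comm.com via smap (3.2)\n",
    "    id xma012881; Wed, 5 Nov 97 10:03:46 -0500\n",
    "Received: by mail.example.org (8.8.7) id QAA27449\n",
    "Subject: recipe flag sequence\n",
    "\n",
);
print strip_local_received(@header);
```

With this input, the two hac.com hops should disappear while the
hughes-defense-comm.com hop, the outside hop, and the other headers remain.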


BTW, isn't it weird that some Received headers 'continue' at the ID and
others just run on without a break?

The formatting of Received: headers is up to the MTA.  For instance,
Eric Allman has changed the default format set in 'stock' sendmail.cf
files several times.


Philip Guenther
