procmail

Re: Received: headers after From:

1997-11-07 12:19:53
Ed Sabol continued our exchange:

| [echo] would put the last newline back, but all the other ones would
| still be eaten. (There's only one `echo' in my solution.)

That's why I purposely defined some of my variables to include a trailing
newline.

| It's a shame you have to use `echo' to replace the header (or the body) with
| the contents of some variable. I think it would be a nice feature if you
| assign a string or a variable directly to the special variables H and B.
| Hmmm. Maybe I'll e-mail that suggestion to Stephen.

Go for it, Ed.

| ... So my
| solution has the unfortunate side-effect of unwrapping all continued headers.
| I hadn't thought of that before. Actually, your solution does this as well
| since you also use $MATCH. I don't see any way of avoiding this. Do you?

I hadn't realized that, but it's true.  I suppose it could be undone, but
only with a cure that is worse than the disease.  However, we could use a
sed or perl solution from the start.

| > Here's a method with one formail and one shell but no recursion.
| 
| Very nice. It inspired me to implement a similar optimization to the
| recursive INCLUDERC solution that cuts down considerably on the number of
| recursions performed. It also makes the AFTERFROM flag completely
| unnecessary.

Just two suggestions:

| # Procmail 3.11pre5 or higher required for the following code.

That's to get trailing newlines into $MATCH, I take it.  As I showed, it
can be done in a way that works with older versions.

| * $ < ${LINEBUF}

Size conditions default to HB regardless of flags.  Since we care only
about the head here and body length should not be a factor, I'd write it
this way (keeping the leading $ so that $LINEBUF still gets expanded):

 * $ H ?? < $LINEBUF

There are ways I think the INCLUDERC could be further optimized as well, but
I'm not up to tackling it now.

DWT

PS: Here's a proposed sed solution that doesn't unwrap continued lines (if it
works).  It removes all Received: after the first From:.  (Some relevant
differences between sed's regexps and procmail's: in sed . will match an
embedded newline, wildcards are greedy from left to right, and ^ does not
match a medial newline.)

savemetas=$SHELLMETAS SHELLMETAS
:0fwh
* ^From:(.*$)+Received:
| sed -e '1,/^From:/b' -e '/^Received:/!b' -e :a -e '$ {g;q;}' -e N \
  -e '/\n[!-~]/!ba' -e 's/^.*\n//' -e '/^Received:/ba'
SHELLMETAS=$savemetas

Translation: pass everything through the first From:, pass everything until
the next Received: (including any continuation lines of the first From:),
then accumulate lines in the pattern space until either

(1) you've taken in the start of the next header instead of a continuation
line; then drop all but that new line (because the rest is a Received: header
after From:).  If the remainder doesn't start with Received: it's a different
kind of header, so print it; if it does start with Received:, repeat the scan,
or

(2) you've reached the end, in which case replace it all with an empty line
[by copying the conveniently unused hold space] and bail.
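
The walk-through above can be tried outside procmail by feeding the same sed script a made-up header. This is a sketch under assumptions: the function name `strip_late_received`, the host names, and the sample header are all hypothetical, and the input ends with the blank line that terminates a header, as the filter would see it.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL

# Same sed script as in the recipe, wrapped in a function for testing.
strip_late_received() {
  sed -e '1,/^From:/b' -e '/^Received:/!b' -e :a -e '$ {g;q;}' -e N \
      -e '/\n[!-~]/!ba' -e 's/^.*\n//' -e '/^Received:/ba'
}

# Hypothetical header: one Received: before From: (must survive), one
# wrapped Received: after From: (must go), a wrapped Subject: (must stay
# wrapped), and a final Received: right before the terminating blank line.
printf '%s\n' \
  'Received: from a.example by b.example;' \
  '	Fri, 7 Nov 1997 12:00:00 -0500' \
  'From: someone@a.example' \
  'Received: from c.example by d.example;' \
  '	Fri, 7 Nov 1997 12:01:00 -0500' \
  'Subject: a wrapped' \
  '	subject line' \
  'Received: from e.example by f.example' \
  '' | strip_late_received
```

With this input the first Received: and its continuation, the From:, and the still-wrapped Subject: should come through, followed by a single empty line supplied by case (2). One thing the sample deliberately avoids: a range that starts on line 1 only begins looking for its end pattern on line 2, so a header whose very first line is From: would keep the range open until a second From: turned up.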

Actually, since sed is case-sensitive, we should write it this way:

RECPAT="^[Rr][Ee][Cc][Ee][Ii][Vv][Ee][Dd]:" savemetas=$SHELLMETAS SHELLMETAS
:0fwh
* ^From:(.*$)+Received:
| sed -e '1,/^[Ff][Rr][Oo][Mm]:/b' -e "/$RECPAT/!b" -e :a \
  -e '$ {g;q;}' -e N -e '/\n[!-~]/!ba' -e 's/^.*\n//' -e "/$RECPAT/ba"
SHELLMETAS=$savemetas