Re: Header insertion from message body...

1999-12-05 17:47:40
"David W. Tamkin" wrote:

... that you spell my name "Tamkin" the way the INS did when my
great-grand-uncle came through Ellis Island.  Thanks.  Now, here 
we go.

Apologies. I plead the "first thing in the morning, caffeine not yet hit
the bloodstream" defense. 

BTW: Thanks for the extremely thorough critique of this stuff.

[ Various corrections to recipe flags, sed scripts, more careful use
of header and body filtering, and neat tricks using scoring to do
arithmetic. ]

In particular I didn't realize you could just filter the head or body.
That would have saved me a _lot_ of pain.  

(Nathan called expr by its basename with no absolute path; 
why not sed?)  

I discovered differences in behavior between GNU and SunOS sed, and
wanted to make sure that the script I bludgeoned into working on the
command line still worked in the rc file, without bothering to check
what the PATH variable was set to. I didn't have this problem with expr.

[ Reduction of iterative method to basically one sed command ]

Very nice. My sed command skills have progressed somewhat over the last
few days, but I'm not sure I would have come up with this one on my own. 

| However, I'm still left with my original challenge.  It seems like it
| should be possible to do this without iteration.

Actually, yes, and this also takes care of continuation lines in the old
headers properly:

 :0Bfhw # search body, filter head
 * ^^[-a-z0-9]+:
 | formail -X "" # remove blank line at neck

 :0afwh # then in case of duplicate headers, keep last occurrence of each
 | formail -U ""

It's annoying that once one gets a particular style of solution in one's
head, all else is lost. Summarized, this solution is: remove the blank
line between header and secondary header and use formail -U "" to clean
up.

I'd like to change these recipes a little because I only have procmail
3.11pre7 here, so I don't have '^^' to match the beginning of the first
line, and I'd like to handle the case where there is more than one empty
line between the header and secondary header. In fact, given that
formail -U "" will clean things up for us if there is no secondary
header, the condition is not absolutely necessary, though dropping it
will cause us to do unnecessary work in the unlikely event (only
specially tagged messages get passed through these recipes) that no
secondary header is present.

# Get rid of empty lines at beginning of body
:0fbw
| sed -n '/./,$p'

# Get rid of empty line dividing header from body 
:0afhw
| formail -X ""

# Get rid of duplicated headers and reinsert "neck" if necessary
:0afhw
| formail -U ""

Wow! This is quite an improvement. 

The only deficiencies I can see with this now are: i) any record of
replaced headers is lost, ii) any headers that appeared multiple times
in the original header are reduced to the last one (I think Received is
the only header typically in this category), and iii) if there are blank
lines at the beginning of a message body that doesn't contain a
secondary header, they are removed. None of these seem serious.
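Point iii) comes straight from the sed range used in the first recipe,
and it is easy to check from the command line. This is only a rough
sketch: the function name and the X-Tag sample line are made up for
illustration, and nothing here involves procmail or formail.

```shell
# Hypothetical helper wrapping the sed command from the first recipe.
# /./,$ is a range from the first line containing any character to the
# last line, so only the leading run of empty lines is dropped.
strip_leading_blank_lines() {
    sed -n '/./,$p'
}

# Made-up body: two leading empty lines, a secondary header line,
# one interior empty line, then text.
printf '\n\nX-Tag: demo\n\nbody text\n' | strip_leading_blank_lines
# Prints the X-Tag line, the interior empty line and the text line;
# the two leading empty lines are gone.
```

The interior empty line survives because the range, once opened, runs
to the end of the input.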

I'm sure that someone with better sed skills than mine could collapse
the first two filters into one, but I don't feel up to hacking sed
scripts today. :-)
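For what it's worth, here is one possible way to fold the first two
filters into a single pass. It swaps sed for awk, it has not been tried
inside an rc file, and the function name and sample message are made up,
so treat it as a sketch rather than a tested recipe. The idea is to
drop the first run of empty lines in the whole message (the neck plus
any empty lines after it) and to leave later empty lines alone.

```shell
# Hypothetical helper, not from the thread: remove only the first run
# of empty lines, so the secondary header joins the real header.
join_secondary_header() {
    awk '
        # Empty lines before the merge is done are skipped.
        !seen && /^$/   { skipping = 1; next }
        # First non-empty line after that run: stop skipping for good.
        skipping && /./ { skipping = 0; seen = 1 }
        # Everything else is passed through unchanged.
                        { print }
    '
}

# Made-up sample: header, two empty lines, secondary header, body.
printf 'Subject: old\n\n\nSubject: new\n\nbody text\n' | join_secondary_header
# Prints both Subject lines back to back, then the empty line and body.
```

In a recipe this would presumably run as a filter on the whole message
(the f flag without h or b), followed by the formail -U "" recipe as
before.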

Cheers!

nathan

-- 
---------------------------------------------------------------------
                                 as ci   Field of Operations Research
Nathan Edwards                  imapofa            Cornell University
nedwards(_at_)orie(_dot_)cornell(_dot_)edu       ustrali       Ithaca NY 14853
www.orie.cornell.edu/~nedwards       a                      
---------------------------------------------------------------------