procmail

Getting the nth header

1998-08-31 23:50:49
  There was a recent "wishlist" discussion on the procmail
mailing list, and my wish was a string array holding each
individual header.  I've managed to prove that it can be done,
in a rather clunky fashion.  I've probably also proved that I
have way too much spare time on my hands.

  The first step, before doing anything else, is to boost LINEBUF.
The reason becomes obvious in the multiple-header matches below.
LINEBUF=65536

  The simple part is finding out how many headers we have.
The trivial recipe that follows does the counting: every match
of .$ adds 1 to the score, and $= holds the final score.  It has
just occurred to me that with the B flag the same recipe would
count lines in the body instead of the headers.
:0
* 1^1 .$
{ HEADERS = $= }
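
  As a minimal sketch of how the counting recipe might be used in
an rc file, the fragment below adds defaults and a log line.  The
variable name BODYLINES, the zero defaults and the LOG message are
assumptions added for illustration; they are not part of the
recipe above.

```procmail
# Default to 0 in case the counting recipe does not fire.
HEADERS=0

# Each match of ".$" (a character followed by a newline) adds 1.
:0
* 1^1 .$
{ HEADERS = $= }

# Same idea with the B flag: count non-empty lines in the body.
BODYLINES=0
:0 B
* 1^1 .$
{ BODYLINES = $= }

# Write both counts to the procmail log.
LOG="header lines: $HEADERS, non-empty body lines: $BODYLINES
"
```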

  Now a list of weird regular expressions...

################# Single headers from the top #################
Get the 1st header line (the "From " line).
*  ()^^\/.*$

Insert .*$+ just after the ^^ and we now get the 2nd header.
*  ()^^.*$+\/.*$

Insert another .*$+ just after the ^^ and we now get the 3rd header.
*  ()^^.*$+.*$+\/.*$

  Etc, etc, etc, to get headers further down the list.  Note that
procmail properly unfolds "Received:" headers into one long line.
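
  Wrapped in a complete recipe, the third-header condition might
look like the sketch below.  HEADER3 is a hypothetical variable
name; MATCH is the variable procmail fills with whatever matched
to the right of \/.

```procmail
# Copy the 3rd header line into HEADER3 (name chosen for illustration).
:0
*  ()^^.*$+.*$+\/.*$
{ HEADER3 = "$MATCH" }

# Log the result; the line stays empty if the condition did not match.
LOG="3rd header: $HEADER3
"
```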


################ Multiple headers from the top ################
Get the 1st header line (the "From " line).
*  ()\/$+.*$+

  Append .*$+ and we now get the first 2 headers, including the
embedded newline character.
*  ()\/$+.*$+.*$+

  Append another .*$+ and we now get the first 3 headers,
including the two embedded newlines.
*  ()\/$+.*$+.*$+.*$+

  Etc, etc, etc, to get larger groups of headers.  This is where
boosting LINEBUF is a good idea.
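
  A sketch of the three-header case as a full recipe, assuming the
result should land in a variable (TOP3 is a made-up name).  The
value is quoted so that the embedded newlines are kept as they
were matched.

```procmail
# Several headers in one MATCH can get long, so raise the buffer first.
LINEBUF=65536

# Grab the first 3 header lines, embedded newlines included.
:0
*  ()\/$+.*$+.*$+.*$+
{ TOP3 = "$MATCH" }
```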


############### Multiple headers from the bottom ###############
  Because of procmail's "greedy matching", I haven't been able
to do a simple match from the bottom up.

  This regex gets the last header PLUS A BLANK LINE PLUS THE
FIRST CHARACTER ON THE FIRST LINE OF THE BODY!!!
*  ()\/$+.*$$

  Insert $+.* after the \/ and we get the last 2 headers, plus
blank line plus first character of the body.
*  ()\/$+.*$+.*$$

  Insert another $+.* after the \/ and we get the last 3
headers, plus blank line plus first character of the body.
*  ()\/$+.*$+.*$+.*$$

  Etc, etc, etc, to get larger clumps of headers at the end,
plus blank line plus first character of the body.


######## Walking down the "Received:" chain from the top ########
  Wouldn't you just love to be able to "walk down the chain of
Received: headers" to be able to closely check for forgeries?
Despite what you've been told about procmail regexes being
"stateless", it can be done...

  Get the first "Received:" header from the top.
*  ^Received:.\/.*

  Insert... Received:.*$+ ...after the ^ and we get the 2nd
"Received:" header... REGARDLESS OF ANY OTHER INTERVENING HEADERS!
*  ^Received:.*$+Received:.\/.*

  Insert another... Received:.*$+ ...after the ^ and we get the
3rd "Received:" header... REGARDLESS OF ANY OTHER INTERVENING HEADERS!
*  ^Received:.*$+Received:.*$+Received:.\/.*

  Etc, etc, etc, to get further down the chain.
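
  To show how the captured values might feed a forgery check, the
sketch below stores the first two "Received:" headers and then
tests the second one.  RCVD1, RCVD2, SUSPECT and the host name
mail.example.com are all placeholders, and the check itself is
only an illustration of the "VAR ?? regex" condition form, not a
recommended spam test.

```procmail
# First "Received:" header from the top.
:0
*  ^Received:.\/.*
{ RCVD1 = "$MATCH" }

# Second "Received:" header from the top.
:0
*  ^Received:.*$+Received:.\/.*
{ RCVD2 = "$MATCH" }

# Hypothetical check: a second hop was captured, but it does not
# mention our own relay (mail.example.com is a placeholder).
:0
* RCVD2 ?? .
* ! RCVD2 ?? mail\.example\.com
{ SUSPECT = yes }
```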


Walter Dnes <waltdnes(_at_)interlog(_dot_)com> procmail spamfilter
http://www.interlog.com/~waltdnes/spamdunk/spamdunk.htm
Why a fiscal conservative opposes Toronto 2008 OWE-lympics
http://www.interlog.com/~waltdnes/owe-lympics/owe-lympics.htm
