procmail

Re: Better way to remove text from beginning & end of message body?

1998-05-08 12:55:52
Andrew Kelley asked,

| A mailing list that I subscribe recently began adding a block of "boiler
| plate" text to the beginning and end of every message that goes through
| the list (groan). The text is always the same, and is always at the
| beginning and end of the message. I've cooked up a recipe to chop the text
| using tail and head (I do a line count via scoring before using head.)
| However, using two processes for this seems like a waste of resources.
| Does anyone have a better way to remove n number of lines from the
| beginning and end of a message body?

sed could do both at once, but the problem is that sed never knows when
it is N lines from the end if N>0; it knows the last line when it reads
it, but when it is looking at the next-to-last line it doesn't know that
there is only one more line to come.  It does, however, know how many lines
of input it has already read.

So I have three suggestions.  First, if you know that the header is X lines long
[let's say 5 for this example] and that the first line of the footer contains
some string or pattern that will not occur in the significant part of the
post,

 :0bfwi
 * conditions
 | sed -ne 1,5d -e '/pattern/q' -e p
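
To try the sed command outside procmail, here is a minimal sketch.  The
sample body and the marker string FOOTER-START are made up for illustration;
substitute whatever pattern your list's footer really begins with.

```shell
# Hypothetical sample body: 5 boilerplate header lines, 2 real lines, then a
# 3-line footer whose first line contains the made-up marker FOOTER-START.
body=$(printf '%s\n' h1 h2 h3 h4 h5 'real line 1' 'real line 2' \
                     '-- FOOTER-START --' 'footer 2' 'footer 3')

# -n turns off automatic printing.  Lines 1-5 are deleted, sed quits without
# printing at the first footer line, and p prints every line in between.
out=$(printf '%s\n' "$body" | sed -n -e 1,5d -e '/FOOTER-START/q' -e p)
printf '%s\n' "$out"
```

Only the two "real" lines should come out the other end.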

If you recognize the end by the last line that you want to keep instead
of the first line that you want to delete, omit the n option and the p
instruction; without -n, q prints the current line before quitting, so
the matching line is kept:

 | sed -e 1,5d -e '/pattern/q'
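
The same kind of test for this variant, again as a sketch: the sample body
and the sign-off pattern ^Regards, are assumptions standing in for whatever
reliably marks the last line you want to keep.

```shell
# Hypothetical sample body: 5 header lines, 2 real lines (the second is the
# last line worth keeping), then a 3-line footer with nothing recognizable.
body=$(printf '%s\n' h1 h2 h3 h4 h5 'real line 1' 'Regards, Pat' \
                     'footer 1' 'footer 2' 'footer 3')

# No -n here: sed prints each surviving line automatically, and q prints the
# matching line before quitting, so the sign-off itself is kept.
out=$(printf '%s\n' "$body" | sed -e 1,5d -e '/^Regards,/q')
printf '%s\n' "$out"
```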

Finally, if the only reliable way to spot the footer is by reaching so
many lines from the end (because any search pattern might occur in the
real text as well), we can score as you've been doing to get the number
of the last significant line.  Let's say the footer is three lines long;
because ^.*$ always counts one line too many (long story), we subtract
four instead of three:

 :0bfwi
 * conditions
 * 1^1 B ?? ^.*$
 * -4^0
 | sed -e 1,5d -e "$="q
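
Inside the recipe, procmail replaces $= with the score before sed ever runs.
To check the arithmetic at a shell prompt, the sketch below stands in for
the scoring with wc -l; that substitution, the variable names and the sample
body are all assumptions for illustration, not part of the recipe.

```shell
# Hypothetical sample body: 5 header lines, 2 real lines, 3 footer lines.
body=$(printf '%s\n' h1 h2 h3 h4 h5 'real line 1' 'real line 2' \
                     'footer 1' 'footer 2' 'footer 3')

# wc -l gives the true line count (10 here), so we subtract the footer
# length (3) directly; the recipe subtracts 4 only because its ^.*$ count
# comes out one too high.
last=$(( $(printf '%s\n' "$body" | wc -l) - 3 ))

# Delete the 5 header lines, then print line $last and quit.
out=$(printf '%s\n' "$body" | sed -e 1,5d -e "${last}q")
printf '%s\n' "$out"
```

One caveat: if the computed line number is 5 or less, 1,5d deletes that line
before q is reached and sed runs on through the footer, so this only works
when at least one line of real text lies between header and footer.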
