procmail

Re: blank line consolidation/strippage

2001-11-24 11:33:35
On Sat, 24 Nov 2001, David W. Tamkin wrote:

> Bart suggested,
>
> | An easier way to do the second part of that is:
> |
> | sed -n -e '/./,/^$/{/^$/N;p;}'
> |
> | That is, always print the next line after any blank line, whether that
> | next line is blank or not.
>
> I'm not so sure.  You're losing all text after the first run of two or
> more blank lines, or after the first blank line that is in an
> even-numbered position if you start numbering with 1 at the topmost
> non-blank line.

Hrm.  I don't think so.  /./,/^$/ will begin executing 'p' at the first
non-blank line, and (because of N) will stop one line after the first
blank line.  Then blank lines will get skipped until a non-blank line is
found, at which point the process repeats.   There's no way to lose any
non-blank lines.

However, I see now there *is* a bug ... if the N consumes a non-blank line,
that line never gets a chance to start a new range, so all blank lines
immediately following it are discarded.  Here's output of a test case (every
non-blank line in the "test" file names its own line number):

$ sed -n -e '/./,/^$/{/^$/N;p;}' < test | cat -n
     1  this is line four
     2  and this is line five
     3
     4
     5  this is line ten
     6
     7  this is line twelve
     8  this is line sixteen
     9  this is line seventeen
    10  this is line eighteen
    11  nineteen
    12  twenty
    13
    14
    15  Have I lost anything here at line twenty-three?
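
The "test" file itself is not included in the message.  A reconstruction
consistent with both listings in this post (24 lines, empty wherever no
numbered line appears) might look like the sketch below.  The make_test
function name is hypothetical, and the file contents are inferred from the
listings rather than taken from the original.

```shell
# Hypothetical reconstruction of the 24-line "test" file (inferred, not original).
make_test() {
  printf '%s\n' \
    '' '' '' \
    'this is line four' \
    'and this is line five' \
    '' '' '' '' \
    'this is line ten' \
    '' \
    'this is line twelve' \
    '' '' '' \
    'this is line sixteen' \
    'this is line seventeen' \
    'this is line eighteen' \
    'nineteen' \
    'twenty' \
    '' '' \
    'Have I lost anything here at line twenty-three?' \
    ''
}

# Run the first (buggy) command against the reconstructed input.
make_test | sed -n -e '/./,/^$/{/^$/N;p;}' | cat -n
```

With this input, "this is line twelve" is swallowed by the N on line 11, so
the three blank lines after it (13 through 15) never appear in the output.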

But we can fix that by forcing the outer loop to start over; 'd' and 'D'
only restart the stuff in the { }, so we need a branch to a label:

sed -n -e ': 1;/./,/^$/{p;/^$/!d;n;/./b 1;p;}'

And now we get:

     1  this is line four
     2  and this is line five
     3
     4
     5  this is line ten
     6
     7  this is line twelve
     8
     9
    10  this is line sixteen
    11  this is line seventeen
    12  this is line eighteen
    13  nineteen
    14  twenty
    15
    16
    17  Have I lost anything here at line twenty-three?
    18

(The blank line that was line 24 of the input gets preserved now.)
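
For anyone who finds the one-liner hard to follow, here is a sketch of the
same commands written one per line with comments.  The squeeze_blanks wrapper
name is hypothetical; the sed commands are the ones from the expression above,
only reformatted.

```shell
# squeeze_blanks: hypothetical wrapper around the corrected expression.
squeeze_blanks() {
  sed -n -e '
: 1
/./,/^$/{
  # print every line of the range, including the blank line that closes it
  p
  # a non-blank line is finished: start the next cycle
  /^$/!d
  # the range just closed on a blank line: fetch the next input line
  n
  # non-blank: jump back to the top so this line opens a new range
  /./b 1
  # blank: keep it as the second blank line of the run
  p
}'
}

# "a", four blank lines, "b", one blank line, "c"
printf 'a\n\n\n\n\nb\n\nc\n' | squeeze_blanks | cat -n
```

Putting the label and the branch on their own lines also avoids relying on
';' to terminate a label, which not every sed accepts.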

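So the full expression, which squashes whitespace-only lines as well, becomes
the following (line broken after the label to try to avoid wrapping; each
bracket expression matches a space or a tab, written here as [[:blank:]]):

sed -n -e ': 1
/[^[:blank:]]/,/^[[:blank:]]*$/{s/^[[:blank:]]*$//;p;/^$/!d;n;s/^[[:blank:]]*$//;/./b 1;p;}'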
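
A quick way to check the whitespace-squashing form is to feed it lines that
hold only spaces and tabs.  This is a sketch under a few assumptions: the
squeeze_ws_blanks name is hypothetical, [[:blank:]] (the POSIX class for space
and tab) stands in for the literal space and tab inside the brackets, and the
sed in use accepts ';' after a label, as GNU sed does.

```shell
# squeeze_ws_blanks: hypothetical wrapper; [[:blank:]] replaces literal space+tab.
squeeze_ws_blanks() {
  sed -n -e ': 1
/[^[:blank:]]/,/^[[:blank:]]*$/{s/^[[:blank:]]*$//;p;/^$/!d;n;s/^[[:blank:]]*$//;/./b 1;p;}'
}

# Input: "a", three whitespace-only lines, one empty line, "b".
# "sed -n l" marks each line end with "$" so leftover whitespace would be visible.
printf 'a\n \t\n\t\n   \n\nb\n' | squeeze_ws_blanks | sed -n l
```

The run of four blank-looking lines should come out as two truly empty lines
between "a" and "b".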

