Re: blank line consolidation/strippage

2001-11-19 14:44:27
At 09:30 2001-11-19 -0600, David W. Tamkin wrote:
|I'm hoping someone else might already have a blank line compressor?

  sed '/./,/^$/!d'

Thanks, but that reduces every block of blank lines to a SINGLE blank line. I want to retain the logical spacing that multiple blank lines are sometimes used to denote (one blank line may separate paragraphs; two or more are sometimes used to "shift gears," so to speak).
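For reference, here is a small sketch of the behavior being objected to. The function name squeeze_to_one and the sample input are made up for illustration; only the sed expression comes from the quoted suggestion.

```shell
# Wraps the suggested one-liner: lines outside the range "non-empty line
# through next empty line" are deleted, so each run of blank lines
# collapses to a single blank line.
squeeze_to_one() {
    sed '/./,/^$/!d'
}

# Hypothetical sample: "a", four blank lines, "b".
# The four blank lines come out as one.
printf 'a\n\n\n\n\nb\n' | squeeze_to_one
```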

To do what I want (consolidate any run of more than 2 blank lines to just 2, and reduce all-whitespace lines to a bare newline), I have to use the following somewhat awkward sed invocation (I'm not keen on commands that must flow across lines):

# executed from within some other condition block
:0f
* ! B ?? ^-----BEGIN PGP SIGNED MESSAGE-----
| sed -e 's/^[  ][      ]*$//'|sed -e '/^$/N \
        /^\n$/{N \
                /^\n\n$/D \
}'

There's a problem actually RUNNING the above from within procmail (more on that below), but I'd also like to combine these two invocations into one -- the best I could come up with is:

# Sed script to compress lines of all whitespace to just newlines, and
# reduce consecutive blank lines to no more than 2 blank lines.

/^[     ]*$/{
        s/[     ]*//
        N
        /^\n[   ]*$/{
                s/^\n[  ]*$/\
/
:nexblank
                N
                /^\n\n[         ]*$/{
                        s/^\n\n[        ]*$/\
/
                        t nexblank
                }
        }
}

I saved this as a sed file invoked with the -f argument because, for the life of me, I can't get the multiline sed command to run within procmail. As a shell script or as a sed file, the commands work, but if I use the command as shown at the top of this message, sed balks:

(This is the sed command from the top of this message, minus the first invocation.)

| sed -e '/^$/N \
/^\n$/{N \
/^\n\n$/D \
}'

The procmail log reports:
        sed: -e expression #1, char 6: Extra characters after command

If I omit the continuation backslashes, it skips the additional lines. Sed actually WANTS the script as several lines, not concatenated into one long string. I'd rather keep the sed script inside the procmailrc: that is one less file open at any given time, and the logic is laid out right there for easy, if cryptic, viewing.

Is there some trick to using procedural sed commands such as these from procmail, or is my sed (GNU sed 3.02) being argumentative where a possibly newer one might not be?
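One approach that might sidestep the problem, offered only as a sketch: pass each sed command as its own -e expression, so the script never depends on newlines surviving inside a quoted string. The "Extra characters after command" error at char 6 is at least consistent with sed seeing something after the N on the same line. This assumes a GNU sed that accepts \n in a regex; the helper name limit_blank_runs is hypothetical, and whether procmail passes this form through unchanged has not been verified here.

```shell
# Limit any run of blank lines to at most two.
# Each sed command is a separate -e, so no embedded newlines are needed;
# the trailing backslashes are ordinary continuations outside the quotes.
limit_blank_runs() {
    sed -e '/^$/N' \
        -e '/^\n$/{' \
        -e 'N' \
        -e '/^\n\n$/D' \
        -e '}'
}

# Hypothetical sample: "a", four blank lines, "b".
# The four blank lines come out as two.
printf 'a\n\n\n\n\nb\n' | limit_blank_runs
```

Runs of one or two blank lines pass through untouched. Like the original second invocation, this handles only truly empty lines, so whitespace-only lines would still need to be blanked first.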

Note that the whitespace reduction is anchored to the beginning (and end) of the line, so it only touches lines that are entirely whitespace: this intentionally will not strip trailing whitespace, which is crucial to

        Content-type: text/plain; charset="us-ascii"; format=flowed

(a format used by Eudora Pro), wherein lines can be auto-rewrapped in replies because the trailing space allows the next line to be concatenated, or something to that effect. I'd just as soon not muck with those.
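To illustrate the anchoring point, a minimal sketch: the POSIX class [[:blank:]] (space and tab) stands in here for the literal space-and-tab bracket used in the recipe, and the function name and sample text are invented for the example.

```shell
# Blank out lines consisting solely of spaces and tabs.
# Because the pattern is anchored at both ends, a text line that merely
# ends in a space (as format=flowed lines do) is left alone.
blank_whitespace_only() {
    sed 's/^[[:blank:]][[:blank:]]*$//'
}

# Hypothetical sample: a line with a trailing space, a whitespace-only
# line, then a normal line. Only the middle line is changed.
printf 'flowed line \n \t \nnext\n' | blank_whitespace_only
```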

Of course, both whitespace reduction and blank line compaction will have a negative impact on signed messages, which is why I check the body for the PGP marker that I do (as used, attachments have already been dealt with before reaching this script, so I don't need to worry about clobbering content within an attachment).

|If it is essential to strip all empty lines from the bottom,

I'm happy just reducing the long trailers some people put in their messages, sometimes unknowingly. This is part of a filter set which will hopefully dramatically clean up new messages going into a mailing list and its archives.

Thanks for the feedback thus far.

---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.
