
Re: Removing line-wrapped header

2004-09-09 23:34:07
  | sed -f remove.sed

Thanks Ruud! I tested this:

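# Usage: rm_header_match_2 NAME_REGEX CONTENT_REGEX < message
# Deletes each header that starts with NAME_REGEX and whose text,
# continuation lines included, matches CONTENT_REGEX.
rm_header_match_2() {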
    sed '
        /^'"$1"'/!b
        :loop
        $bend
        N
        /\n[^[:blank:]]/!bloop
        h
        s/^\(.*\)\n.*$/\1/;
        /'"$2"'/{
           g; s/^\(.*\)\(\n.*\)$/\2/; D;
        }
        p; g; s/^\(.*\)\(\n.*\)$/\2/; D;
        :end
        # Last line reached inside a header: drop it if it matches.
        /'"$2"'/d
        '
}

The sed solution has these downsides:

#   * Matching is case-sensitive. GNU sed can make the s/// command
#     case-insensitive, but not addresses.
#   * Arbitrary expressions cannot be passed in as $1 or $2 - they must fit
#     into the particular sed construction!
#   * Regexes are simple, not extended. GNU sed can make them extended with -r.
#   * Older seds need all statements terminated with ';'.

To keep apples (gawk) with apples (GNU sed), the last 2 points don't
count. I'm not good with sed so there may be a remedy, otherwise I view
point 1 as close to a k.o. Whenever I have to write '[Hh][Ee][Aa]' I
have impolite thoughts about braindead tool design. Point 2 is a
typical pain with sed; for a good reusable solution fit to be put into
a library it's not acceptable. awk addresses suffer the same problem,
but it's easy to program differently; not sure whether that's possible
in sed.
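
As a rough illustration of "programming it differently" in awk, here is
a minimal sketch. It is not the gawk script from this thread; the
function name rm_header_awk and the environment variables HDR_RE and
VAL_RE are made up for the example. Both patterns arrive as plain
strings and are used as dynamic regexes, so they need not fit into any
/.../ construction, and both sides are lower-cased to get
case-insensitive matching in any POSIX awk (gawk could set IGNORECASE
instead). It assumes the patterns are extended regexes without
upper-case escapes such as \S, and that the first blank line ends the
headers.

```shell
# rm_header_awk NAME_REGEX CONTENT_REGEX < message
# Hypothetical helper; HDR_RE and VAL_RE are names made up for this sketch.
rm_header_awk() {
    HDR_RE="$1" VAL_RE="$2" awk '
        # Print the held header unless it matches the content pattern.
        function emit_held() {
            if (holding && tolower(held) !~ val) print held
            holding = 0
        }
        BEGIN {
            # Patterns come in as plain strings; lower-case them once.
            hdr = "^(" tolower(ENVIRON["HDR_RE"]) ")"
            val = tolower(ENVIRON["VAL_RE"])
        }
        body                { print; next }                  # past the headers
        holding && /^[ \t]/ { held = held "\n" $0; next }    # continuation line
                            { emit_held() }                  # previous header is complete
        /^$/                { body = 1; print; next }        # blank line ends the headers
        tolower($0) ~ hdr   { held = $0; holding = 1; next } # start of a candidate header
                            { print }
        END                 { emit_held() }
    '
}
```

Called as rm_header_awk 'x-spam-status:' 'score=9' < message, this
should drop a folded X-Spam-Status header that mentions SCORE=9 in any
mix of upper and lower case, and pass everything else through.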

Personally I find 5 lines of gawk more understandable than 13 lines of
sed, but that's probably just preference. The runtime performance
speaks for itself, though. I cat'ed an email enough times to make a
1MB file.

LANG=C LC_COLLATE=C  other LC_* = unset

gawk:
0.38user 0.02system 0:00.42elapsed

sed:
4.56user 0.03system 0:04.65elapsed
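
To reproduce a measurement like this, something along these lines
should do. It is only a sketch: make_big, one.msg and big.msg are
placeholder names, not what was used for the numbers above, and the
regexes in the usage comment are made up.

```shell
# make_big SRC DEST [BYTES]: append SRC to DEST until DEST holds at
# least BYTES bytes (default 1 MB). Hypothetical helper.
make_big() {
    [ -s "$1" ] || return 1          # refuse an empty or missing source
    : > "$2"                         # start from an empty output file
    while [ $(wc -c < "$2") -lt "${3:-1048576}" ]; do
        cat "$1" >> "$2"
    done
}

# Possible usage (bash, where "time" also works on shell functions):
#   make_big one.msg big.msg
#   LANG=C LC_COLLATE=C
#   time rm_header_match_2 'Received:' 'example' < big.msg > /dev/null
```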

Same input files and regex each time (how would perl compare? Sorry,
can't be bothered). Either there's another issue with my system, or
sed's seriously bad: at 4.65s against 0.42s elapsed it is about 11
times slower. I'm beginning to
think sed is overrated... but I have no interest in language wars ;)

Volker

-- 
Volker Kuhlmann                 is possibly list0570 with the domain in header
http://volker.dnsalias.net/             Please do not CC list postings to me.

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
