procmail

Re: HTML to text

2003-01-23 14:03:56
Kreemeleh has,

| >   :0
| >   | sed -e '/--$\BOUNDS(.*)\<Content-Type: text\/html/,/--$\BOUNDS/d'

and wondered,

| Ooops.  I thought that was working, but it is not.  As near as I can
| tell the $BOUNDS is not expanded.
|
| What did I forget?

First, there is an asterisk in there (and a "<" as well), and both are in
procmail's default SHELLMETAS, so procmail will hand the sed command to the
shell rather than run sed directly.

Second, shells don't understand the procmailism $\VARIABLE.

Third, if the shell did understand it and did expand it, sed doesn't see
parentheses as special by default, so the opening "()" in the expansion of
$\BOUNDS would make sed look for literal parentheses, which it wouldn't find,
so the deletion would never start.
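
To see the parentheses problem in isolation, here is a small sketch.  The
boundary string "frontier42" is made up for illustration, and the "()" is
typed in by hand to stand in for what procmail would have put there:

```shell
#!/bin/sh
# "frontier42" is a hypothetical boundary, not anything from the original mail.
line='--frontier42'

# In sed's basic regular expressions, ( and ) are ordinary characters,
# so this pattern demands a literal "()" in the input and does not match.
with_parens=$(printf '%s\n' "$line" | sed -n '/--()frontier42/p')

# Without the parentheses the same line matches and is printed.
without_parens=$(printf '%s\n' "$line" | sed -n '/--frontier42/p')

echo "with ():    [$with_parens]"
echo "without (): [$without_parens]"
```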

Fourth, variables are not expanded inside strong quotes (that's one of the
reasons people use strong quotes some of the time: to *prevent* variable
substitution).  You need soft quotes.
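
The quoting difference is easy to check from a shell prompt.  A minimal
sketch, again with a made-up value for BOUNDS:

```shell
#!/bin/sh
# BOUNDS is given a hypothetical value purely for illustration.
BOUNDS='frontier42'

# Strong (single) quotes: $BOUNDS reaches sed exactly as typed.
strong=$(printf '%s\n' '/--$BOUNDS/d')

# Soft (double) quotes: the shell substitutes the value first.
soft=$(printf '%s\n' "/--$BOUNDS/d")

echo "strong: $strong"
echo "soft:   $soft"
```

If the boundary can contain characters that are special to sed (dots, for
instance), they would still need escaping after the substitution.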

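Fifth, sed wouldn't give \< the meaning it has in procmail.  POSIX sed doesn't
define it at all, and GNU sed treats it as a zero-width start-of-word anchor,
so neither would consume the character before "Content-Type".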

Sixth, sed can't match on multiline expressions unless you have taken some
effort to get more than one line into the pattern space, such as an N or G
command (or, after an H, g or x).  I think you're expecting \< to match a
newline, as it can in a procmail condition.
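
A small sketch of the pattern-space point, using a hypothetical two-line
input.  The same regexp fails when sed sees one line at a time and succeeds
once N has pulled the second line in:

```shell
#!/bin/sh
# Hypothetical input: a boundary line followed by a Content-Type header.
input='--frontier42
Content-Type: text/html'

# Line by line, the pattern space never holds both lines, so no match.
single=$(printf '%s\n' "$input" | sed -n '/frontier42\nContent-Type/p')

# N appends the next input line to the pattern space, joined by a newline,
# and \n in the regexp matches that embedded newline.
joined=$(printf '%s\n' "$input" | sed -n 'N;/frontier42\nContent-Type/p')

echo "single: [$single]"
echo "joined: [$joined]"
```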

I haven't the time this afternoon to rewrite the thing, so maybe someone else
will.  If you're trying to do text alteration starting at a multiline pattern,
maybe perl is easier to code it in than sed.
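
For whoever does pick it up, here is one rough sketch of the idea.  It uses
awk from the shell rather than perl or sed, it is not a tested procmail
recipe, and it rests on assumptions of mine: BOUNDS holds the bare boundary
string, the Content-Type header is the very next line after the boundary, and
the function name strip_html_part is invented for the example.

```shell
#!/bin/sh
# BOUNDS is a hypothetical boundary value for illustration.
BOUNDS='frontier42'

# Reads a message body on stdin and drops every part whose first header
# line is "Content-Type: text/html", together with the boundary before it.
strip_html_part() {
    awk -v b="--$BOUNDS" '
        # A boundary ends any part being skipped; hold the line until
        # the header that follows it has been seen.
        index($0, b) == 1 {
            if (held != "") print held
            held = $0; skip = 0; next
        }
        # First line after a boundary decides whether the part is kept.
        held != "" {
            if (tolower($0) ~ /^content-type: text\/html/) skip = 1
            else print held
            held = ""
        }
        !skip { print }
        # A closing boundary at end of input is still pending here.
        END { if (held != "") print held }
    '
}
```

It would be fed the body on standard input, for instance
"strip_html_part < body.txt".  Real messages often put other headers ahead of
Content-Type or fold it across lines, which this sketch does not handle.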



_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail