On Wed, 15 Jan 1997, era eriksson wrote:
On Tue, 14 Jan 1997 20:38:58 -0500 (EST), Wotan <wotan(_at_)netcom(_dot_)com>
wrote on Procmail-L:
> On Thu, 9 Jan 1997, Timothy J Luoma wrote:
>> Someone has recently begun sending me email with text as HTML
>> (using Mozilla 4.0b1 (Win95; I))
>> B) check message for HTML and convert them to plain ol ASCII
> untested, so someone will spot a bug or two. :)
Is it enough to remove just the HTML tags? Those messages will contain
a complete attachment with a second copy of the entire message. You
want to zap the whole attachment, no?
By the way, you should probably be using sed -e 's/<[^>]*>//g' instead,
since <.*> is greedy and eats everything from the first < to the last >
on a line.
Finally, I believe the wildcard before "content-type" is redundant.
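
A quick sketch of the difference between the two patterns, run outside
procmail. The sample line is made up for illustration, and both commands
are written as s/// substitutions on that assumption:

```shell
# Sample line with two tags (hypothetical input, not from a real message).
line='<b>Hello</b> world'

# Greedy: <.*> matches from the first '<' to the last '>' on the line,
# so the text between the tags is removed as well.
greedy=$(printf '%s\n' "$line" | sed -e 's/<.*>//g')

# Bounded: <[^>]*> stops at the first '>' after each '<',
# so only the tags themselves are removed.
bounded=$(printf '%s\n' "$line" | sed -e 's/<[^>]*>//g')

printf 'greedy:  [%s]\n' "$greedy"
printf 'bounded: [%s]\n' "$bounded"
```

Neither version copes with a tag that is split across two lines, since sed
works one line at a time here.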
> :0
> * .*content-type: text/html
> {
> :0 B
> | sed -e 's/<.*>//g'
>
> :0
> Wherever
> }
Maybe something like:
:0fbw
* content-type: text/html
| sed -e "/^[Cc]ontent-[Tt]ype: text\/html/,/^--/d"
This will still leave some traces of the attachment header and footer,
but remove all of the body and most of the headers. If you want a
complete solution, I'd write up a Perl script or something.
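
To see what that range delete leaves behind, here is a small sketch run by
hand rather than from procmail. The message body and the boundary string
BOUNDARY are invented for the example, and the slash in text/html is
escaped so that sed does not read it as the end of the address:

```shell
# Hypothetical multipart/alternative body; the boundary string is made up.
body='--BOUNDARY
Content-Type: text/plain

Hello in plain text.
--BOUNDARY
Content-Type: text/html

<html><body>Hello in <b>HTML</b>.</body></html>
--BOUNDARY--'

# Delete from the text/html Content-Type line through the next line
# that starts with "--" (here, the closing boundary).
filtered=$(printf '%s\n' "$body" |
  sed -e '/^[Cc]ontent-[Tt]ype: text\/html/,/^--/d')

printf '%s\n' "$filtered"
```

On this input the boundary line that opened the HTML part survives and the
closing boundary is deleted along with the part, which is the sort of
leftover mentioned above.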
Filtering through Perl or a sed script would probably be best. :-) I've
seen the Content-Type line in the actual headers of the e-mail, so your
solution might not catch everything.
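
A rough illustration of that case, again by hand outside procmail. The
message is invented, and the split into header and body at the first blank
line is done with sed only for the sake of the example:

```shell
# Hypothetical message whose top-level header declares text/html,
# so the body has no MIME part header and no boundary lines.
msg='From: someone@example.com
Content-Type: text/html

<html><body>Hi</body></html>'

# Header = everything up to and including the first blank line.
header=$(printf '%s\n' "$msg" | sed -e '/^$/q')

if printf '%s\n' "$header" | grep -i '^content-type: *text/html' >/dev/null
then
  where=header
else
  where=body-or-none
fi

# Body = everything after the first blank line, run through the range delete.
# No Content-Type line appears in the body, so the range never starts
# and the HTML passes through untouched.
body_after=$(printf '%s\n' "$msg" | sed -e '1,/^$/d' |
  sed -e '/^[Cc]ontent-[Tt]ype: text\/html/,/^--/d')

printf '%s: %s\n' "$where" "$body_after"
```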
For simplicity, I'm just going to bounce all mail with HTML in it and
include instructions for eliminating this nonsense.
--
God must love the Common Man; He made so many of them.