On Tue, 14 Jan 1997 20:38:58 -0500 (EST), Wotan <wotan(_at_)netcom(_dot_)com>
wrote on Procmail-L:
On Thu, 9 Jan 1997, Timothy J Luoma wrote:
Someone has recently begun sending me email with text as HTML
(using Mozilla 4.0b1 (Win95; I))
Just like the very clueful unsubscribe sent to the list not much later
than Wotan's message. I love it! Humorous, aesthetical, less filling!
B) check message for HTML and convert them to plain ol ASCII
untested, so someone will spot a bug or two. :)
Is it enough to remove just the HTML tags? Those messages will contain
a complete attachment with a second copy of the entire message. You
want to zap the whole attachment, no?
By the way, you should probably be using sed -e 's/<[^>]*>//g' instead;
<.*> is greedy and will eat everything from the first < to the last >
on a line, text included.
Finally, I believe the wildcard before "content-type" is redundant.
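To see the difference between the two sed expressions, here is a small
sketch. The sample line and the function names strip_greedy and strip_tags
are made up for illustration; they are not from the original recipe.

```shell
#!/bin/sh
# Hypothetical sample line; not taken from any real message.
line='<b>Hello</b> and <i>goodbye</i>'

# Greedy: <.*> matches from the first "<" to the last ">" on the line.
strip_greedy() { sed -e 's/<.*>//g'; }

# Negated class: each match stops at the nearest ">".
strip_tags() { sed -e 's/<[^>]*>//g'; }

printf '%s\n' "$line" | strip_greedy   # prints an empty line
printf '%s\n' "$line" | strip_tags     # prints: Hello and goodbye
```

Neither form copes with a tag that is split across two lines, since sed
works on one line at a time here.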
:0
* .*content-type: text/html
{
  :0 B
  | sed -e 's/<.*>//g'
  :0
  Wherever
}
Maybe something like:
:0fbw
* content-type: text/html
| sed -e "/^[Cc]ontent-[Tt]ype: text\/html/,/^--/d"
This will still leave some traces of the attachment header and footer,
but remove all of the body and most of the headers. If you want a
complete solution, I'd write up a Perl script or something.
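For what it's worth, here is a rough way to try the range delete outside
of procmail. The sample body, its boundary string, and the function names
sample_body and strip_html_part are all invented for the test; a real
Mozilla message will have a different boundary and probably more headers.

```shell
#!/bin/sh
# Hypothetical two-part message body; boundary and text are made up.
sample_body() {
cat <<'EOF'
--BOUNDARY
Content-Type: text/plain; charset=us-ascii

Hello in plain text.
--BOUNDARY
Content-Type: text/html; charset=us-ascii
Content-Transfer-Encoding: 7bit

<HTML><B>Hello</B> in HTML.</HTML>
--BOUNDARY--
EOF
}

# Delete from the text/html Content-Type line through the next line
# that starts with "--". The "/" in text/html is escaped for sed.
strip_html_part() {
  sed -e '/^[Cc]ontent-[Tt]ype: text\/html/,/^--/d'
}

sample_body | strip_html_part
```

On this sample the plain-text part survives intact, the boundary line that
opened the HTML part is left behind, and the closing boundary goes away
with the HTML. That is the sort of leftover I mean above.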
/* era */
--
See <http://www.ling.helsinki.fi/~reriksso/> for mantra, disclaimer, etc.
* If you enjoy getting spam, I'd appreciate it if you'd register yourself
at the following URL: <http://www.ling.helsinki.fi/~reriksso/spam.html>