procmail

Re: HTML Filter Recipe

1998-03-01 16:38:45
Trevor Astrope wrote:

Hi, I'm really getting tired of receiving messages in html. It seems this
is the default in MS's Outlook mailer. Can anyone give me pointers to a
recipe that would bounce html messages back to the sender with a text file
that included instructions on how they could set Outlook or Netscape to
send in plain text instead of html?

Here's a recipe I'm using:

:0 BH
* !^FROM_DAEMON
* !^X-Loop: eristic@gryzmak.lodz.pdi.net
* ^Content.Type.+multipart.alternative
* ^Content.Type.+text.html
{
        LOG="$CRLF --TRASH: multi-part HTML content:$CRLF"
        :0
        | (formail -rtk -A"X-Mailer: Procmail Autoreply" \
            -A"X-Loop: eristic@gryzmak.lodz.pdi.net" ; \
          cat $HOME/no_HTML_please) | $SENDMAIL -oi -t
}

Synopsis: (a) not from a mailer daemon (though I don't suppose bots would
ever be *that* clueless ;) (b) not with an X-Loop header, to prevent infinite
loops; (c) the first "Content-Type" is a header line; (d) the second
"Content-Type" occurs in the body, just before the HTML section of the MIME
message.

The LOG line is obvious. The original message is returned to sender with
the contents of my file called no_HTML_please appended. (The text file
lives in my $HOME.)

This is based on my own observations of what the multi-part messages tend
to look like. The scan is possibly expensive since it looks at the body
too; I'm sure someone else will improve on my recipe.
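
One possible refinement, as a sketch only: leave the recipe on its default
header-only search and use procmail's "B ??" prefix so that just the last
condition scans the body. Conditions are tried in order, so the body is only
read once the header tests have passed. The address you@example.com is a
placeholder, not anything from the recipe above; the no_HTML_please file is
the same one as before.

```procmail
# Sketch only. you@example.com is a placeholder for your own address.
# Header conditions use the default search area (the header);
# the "B ??" prefix makes the last condition look at the body instead.
:0
* ! ^FROM_DAEMON
* ! ^X-Loop: you@example\.com
* ^Content-Type:.*multipart/alternative
* B ?? ^Content-Type:.*text/html
| (formail -rtk -A"X-Mailer: Procmail Autoreply" \
    -A"X-Loop: you@example.com" ; \
  cat $HOME/no_HTML_please) | $SENDMAIL -oi -t
```

The LOG assignment is left out here to keep the sketch short; it can be put
back by wrapping the action in a { } block as in the recipe above.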

Aside: In my no_HTML_please I politely explain why I don't appreciate
receiving HTML email, and ask them to resend the message as plaintext. What
happens in the majority of cases is that the sender resends the same
message again ("oh, it bounced, let's try again") and I assume they don't
actually read my explanation, since they just happily resend the HTML cr*p.
It bounces again, at which point they give up... Tough luck, I say ;)

(BTW, the above recipe is placed *after* mailing list mail gets sorted.
When someone sends HTML mail to a mailing list I read, I just flame them in
person ;)


Further, I am also using:

:0 B
* ^^[   ]*<html>
{
        LOG="$CRLF --TRASH: <HTML> coded body text:$CRLF"
        :0:
        $TRASH
}


This detects non-MIME mail where the body begins with the <HTML> tag, which
is particularly characteristic of some spam mailer widely in use. Note that I
am not bouncing this kind of mail, just deleting it on sight. (The square
brackets contain a space and a tab, as usual.)
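
Both recipes use $CRLF and $TRASH, which are not procmail built-ins and so
have to be set earlier in the rc file. A minimal sketch of such definitions
follows; the values are assumptions for illustration, not the ones actually
in use here.

```procmail
# Hypothetical values; adjust to taste. Place these above the recipes.
# A variable holding a single newline, so LOG entries start on a fresh line.
CRLF="
"
# Mailbox file that collects the discarded mail.
TRASH=$HOME/trash-mail
```

Anything assigned to LOG is appended to the file named by LOGFILE, so the
newline variable only affects how the log entries are laid out.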

.marek



-- 
"This is all very interesting, and I daresay you already
see me frothing at the mouth in a fit; but no, I am not;
I am just winking happy thoughts into a little tiddle cup."
(Vladimir Nabokov)
