You should de-MIME the messages as well: quoted-printable and base64
encoded messages won't scan very well at all. Of course, IME, such
messages are spam anyway, so why scan them when you can file 'em away on
that characteristic alone?
That may be okay for base64, but quoted-printable is a different matter:
it wouldn't surprise me at all in legit non-English mail.
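To illustrate what decoding before scanning could look like, here is a minimal sketch using Python's standard email package rather than procmail itself. The function name decoded_text_parts and the sample message are made up for this example; nothing here comes from the original thread.

```python
from email import message_from_string, policy


def decoded_text_parts(raw_message):
    """Return the text parts of a message with the transfer encoding undone.

    Hypothetical helper: policy.default makes get_content() reverse
    quoted-printable or base64 and apply the declared charset.
    """
    msg = message_from_string(raw_message, policy=policy.default)
    texts = []
    for part in msg.walk():
        # Only text/* parts are worth scanning for keywords.
        if part.get_content_maintype() == "text":
            texts.append(part.get_content())
    return texts


# Illustrative quoted-printable message of the kind seen in legit
# non-English mail ("=E9" is e-acute in ISO-8859-1).
QP_SAMPLE = (
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=iso-8859-1\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "caf=E9 au lait\n"
)

# Illustrative base64 message; the body decodes to "Hello world".
B64_SAMPLE = (
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=us-ascii\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "SGVsbG8gd29ybGQ=\n"
)

if __name__ == "__main__":
    print(decoded_text_parts(QP_SAMPLE))
    print(decoded_text_parts(B64_SAMPLE))
```

A scanner run over the raw QP body would see "caf=E9" and miss a match on the real word, which is the reason to decode first.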
Of course, if it's a legit HTML document, you'll have EXACTLY the
same number of < as you do >, because they should all be paired.
Not necessarily, for several reasons (and tag-stripping scripts have
to be aware of these too):
- an unescaped ">" is no problem,
- an unescaped "<" isn't either if it can't be mistaken for a tag start
anyway - for instance, if the next character is a space,
- and you can have both "<" and ">" unescaped in a string within a tag
(e.g., <IMG src="equation.jpg" alt="a + b > c">).
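The three cases above can be checked with a short sketch, again in Python and not part of the original discussion. It assumes the standard library's html.parser is an acceptable stand-in for a tag-stripping script; the class name TextExtractor, the function strip_tags, and the sample string are invented for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect only the character data, dropping every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags(html):
    """Hypothetical tag stripper built on the parser, not on bracket counting."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)


# Illustrative fragment: a "<" followed by a space, unescaped ">" in the
# text, and a ">" inside a quoted attribute value.
SAMPLE = (
    '<p>1 < 2, see <img src="equation.jpg" alt="a + b > c">'
    " and 5 > 4 > 3</p>"
)

if __name__ == "__main__":
    # The counts differ even though a parser copes with the markup.
    print(SAMPLE.count("<"), SAMPLE.count(">"))
    print(strip_tags(SAMPLE))
```

The point of going through a parser is that it tracks whether it is inside a tag or a quoted attribute value, which a plain count of "<" against ">" cannot do.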
procmail mailing list