On Tue, Jun 17, 2003 at 08:58:34AM -0700, Bart Schaefer wrote:
:0 E
* PARTS ?? ^^1^^
* ^Content-Type: text/html
{ W="multipart header but single html part body" }
That should be ":0 EB", no? (Or B?? on the second condition.)
Yes -- more cut-and-paste errors. I have things structured differently
in the recipe file that uses this stuff.
Some MUAs do that if you attach (say) an image and then send the message
without typing anything into the editor. That is, they won't promote a
single "attachment" to a top-level content-type: image/*. If the sender
happened to attach an empty file ...
Grr. We're expected to predict *every* variable.... I guess I'll
readjust if I start seeing false positives.
Ooh, hadn't thought of that. I guess you could just handle that with:
* ! B ?? Content-Type: multipart
Depends on what "handle that" means, I suppose.
Since the goal is to block spam that contains an empty text/plain part,
a message with an embedded multipart part will almost certainly not be
spam. If I forward a few messages to SpamCop, I don't want use of the
filter on my BCC to trap the messages. (Whitelist notwithstanding.)
The latter. Usually I wouldn't do that except in a context like this:
* $ B ?? ^$\MATCH($)Content-
where I don't want "$Content" to be taken as a variable name. That
mistake would bother me more than Ruud's giggling does.
How Ruud of him to giggle...
Wouldn't you get the same effect by escaping the $ with a backslash? Or
is bracketing it just easier on the eyes?
* $ B ?? ^$\MATCH\$Content-
If I understand correctly, you're worried about this sort of thing:
--theboundarystring
Content-Type: text/plain
--theboundarystring
Content-Type: text/html
<p>
--theboundarystring--
Yes, you're right, that would fail to match the test above (in several
ways) because it's such badly-formatted MIME. The change you suggest
wouldn't be enough to fix it; I think I'd just test for that kind of
syntax problem separately, rather than treat it as an "empty part".
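For what it's worth, here is a rough sketch of what a separate check for that syntax problem might look like, written in Python rather than procmail so it can be tried outside a recipe file. It only approximates procmail's matching. The function name is made up for illustration, the boundary string is assumed to have been extracted already, and line endings are assumed to be bare LF.

```python
import re


def part_missing_blank_line(body: str, boundary: str) -> bool:
    """Return True if some MIME part has header lines that run straight
    into the next boundary, with no blank line ending the headers.

    Hypothetical helper: 'boundary' is the bare boundary string
    (no leading "--"), and 'body' uses LF line endings.
    """
    b = re.escape(boundary)  # treat the boundary literally, like $\MATCH
    pattern = re.compile(
        rf"^--{b}\n"                        # opening boundary line
        r"(?:[!-9;-~]+:.*\n(?:[ \t].*\n)*)+"  # one or more header fields,
                                            # each with optional folded lines
        rf"--{b}(?:--)?[ \t]*$",            # next boundary follows directly
        re.MULTILINE,
    )
    return pattern.search(body) is not None


if __name__ == "__main__":
    sample = (
        "--theboundarystring\n"
        "Content-Type: text/plain\n"
        "--theboundarystring\n"
        "Content-Type: text/html\n"
        "\n"
        "<p>\n"
        "--theboundarystring--\n"
    )
    print(part_missing_blank_line(sample, "theboundarystring"))
```

Keeping this test apart from the "empty part" test means a hit can be reported as broken MIME rather than as an empty text/plain part.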
It wouldn't hurt to look for broken MIME, but I'm under the impression
that most ratware manages to get that stuff right these days.
I see what's missing; switching the * to a + would still match parts
with a non-zero amount of whitespace. I guess you'd need:
* $ B ?? ^--$\MATCH$NL(\
(.+$NL)*\
Content-Type:[$WSPC]+text/plain.*$NL|\
(.+$NL)*\
)?$NL\
([$WSPC]*$NL)*$NL\
^--$\MATCH(--)?$NL
Is that finished now? :)
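To sanity-check the idea outside procmail, here is a rough Python approximation of that condition. It is not a line-for-line translation: Python's regex engine is not procmail's, the alternation is simplified to "headers that include a text/plain Content-Type", and the names below are made up for illustration. It assumes the boundary has already been extracted (the role $MATCH plays above), that NL is a bare LF, and that WSPC is space and tab.

```python
import re


def has_empty_plain_part(body: str, boundary: str) -> bool:
    """Return True if the body contains a text/plain part whose content
    is empty or consists only of whitespace-only lines.

    Hypothetical helper: 'boundary' is the bare boundary string
    (no leading "--"), and 'body' uses LF line endings.
    """
    b = re.escape(boundary)   # literal boundary, like $\MATCH
    nl = r"\n"                # stands in for $NL
    wspc = r"[ \t]"           # stands in for [$WSPC]
    pattern = re.compile(
        rf"^--{b}{nl}"                                  # opening boundary
        rf"(?:.+{nl})*?"                                # headers before
        rf"(?i:Content-Type:){wspc}+(?i:text/plain).*{nl}"
        rf"(?:.+{nl})*"                                 # headers after
        rf"{nl}"                                        # blank line ends headers
        rf"(?:{wspc}*{nl})*"                            # whitespace-only lines
        rf"--{b}(?:--)?{wspc}*$",                       # next or final boundary
        re.MULTILINE,
    )
    return pattern.search(body) is not None


if __name__ == "__main__":
    sample = (
        "--b\n"
        "Content-Type: text/plain; charset=us-ascii\n"
        "\n"
        "  \n"
        "\t\n"
        "--b\n"
        "Content-Type: text/html\n"
        "\n"
        "<p>buy now</p>\n"
        "--b--\n"
    )
    print(has_empty_plain_part(sample, "b"))
```

Because `.+` cannot match an empty line, the header portion never runs past the blank line that ends the headers, so a part with real text in it does not match.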
--
Paul Chvostek
<paul(_at_)it(_dot_)ca>
Operations / Abuse / Whatever
it.canada, hosting and development http://www.it.ca/