Mark asked,
I have received numerous spam messages that appear to have embedded
newline characters in the middle of the subject line.
I have tried:
* ^Subject:.+()$.+
To match this, but it catches everything.
In the body, ^ or $ will match the newline between two lines of text (or a
putative one, but we don't need to get into that for this problem); however,
in the head, they match the newline between two header fields (or a putative
one). They will not match the soft newline in the middle of a header field
with a continuation line, and that's what Mark is looking for.
Every message will match the regexp Mark tried as long as it has a Subject:
header at all. The only exceptions are negligible: a Subject: that is the
bottommost field in the header, or one that is totally empty, without even a
space after the colon. You'll probably never see either of those unless
somebody deliberately contrives a message that way just to slip past that
regexp, so essentially it will catch any message that has a subject.
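To make that concrete, here is a rough Python imitation of the match. It is only a sketch, not procmail itself: the unfold() helper, the sample headers, and the translation of procmail's $ into "\n" are my own assumptions for illustration.

```python
import re

# Assumption: procmail joins continuation lines before it tests header
# conditions; this helper approximates that by dropping the line break.
def unfold(header: str) -> str:
    return re.sub(r"\n(?=[ \t])", "", header)

# Python stand-in for ^Subject:.+()$.+ where "\n" plays the part of
# procmail's $, which matches the newline between two header fields.
MARKS_REGEXP = re.compile(r"^Subject:.+\n.+", re.MULTILINE)

def marks_regexp_matches(header: str) -> bool:
    return MARKS_REGEXP.search(unfold(header)) is not None

# Hypothetical sample headers.
plain = "From: a@example.com\nSubject: hello\nTo: b@example.com\n"
folded = "From: a@example.com\nSubject: hello\n there\nTo: b@example.com\n"
last = "From: a@example.com\nSubject: hello\n"
empty = "From: a@example.com\nSubject:\nTo: b@example.com\n"

for name, header in [("plain", plain), ("folded", folded),
                     ("last", last), ("empty", empty)]:
    print(name, marks_regexp_matches(header))
```

The plain and the folded subject both match, so the pattern cannot tell them apart; only the bottommost and the totally empty Subject: fail to match.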
The question of how to test for a continued header line has come up before.
Here's one solution:
:0wh
* ^\/Subject:.*
dummy=| egrep "$\MATCH"
:0e: # grep exited 1 if the subject had a continuation
continued_subjects
Note that the following similar approach will not work, because procmail will
unfold the continued header line before feeding text to an exit code test:
:0: # This does not work.
* ^\/Subject:.*
* ! ? egrep "$\MATCH"
continued_subjects
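The difference between the two recipes can be imitated outside procmail. The Python below is only a sketch under my own assumptions: unfold() stands in for procmail's unfolding, the function names and sample headers are made up, and re.escape() plays the role of the backslash in $\MATCH. The one thing that changes between the two checks is whether grep is handed the raw header or the unfolded one.

```python
import re

# Assumption: unfolding means joining a continuation line (one that
# starts with a space or tab) onto the line before it.
def unfold(header: str) -> str:
    return re.sub(r"\n(?=[ \t])", "", header)

# Stand-in for MATCH after the condition ^\/Subject:.* has been tested
# against the unfolded header.
def unfolded_subject(raw_header: str):
    m = re.search(r"^Subject:.*", unfold(raw_header), re.MULTILINE)
    return m.group(0) if m else None

# Stand-in for egrep with an escaped pattern: it tests one line at a time.
def grep_finds(needle: str, text: str) -> bool:
    pattern = re.compile(re.escape(needle))
    return any(pattern.search(line) for line in text.splitlines())

# First recipe: the pipe action hands grep the raw, still folded header.
def continued_via_action_pipe(raw_header: str) -> bool:
    subject = unfolded_subject(raw_header)
    return subject is not None and not grep_finds(subject, raw_header)

# Second recipe: the exit code test hands grep the unfolded header.
def continued_via_exit_code_test(raw_header: str) -> bool:
    subject = unfolded_subject(raw_header)
    return subject is not None and not grep_finds(subject, unfold(raw_header))

# Hypothetical sample headers.
plain = "From: a@example.com\nSubject: hello\nTo: b@example.com\n"
folded = "From: a@example.com\nSubject: hello\n there\nTo: b@example.com\n"

print(continued_via_action_pipe(folded), continued_via_exit_code_test(folded))
print(continued_via_action_pipe(plain), continued_via_exit_code_test(plain))
```

With the raw header, no single line contains the whole unfolded subject, so the search fails exactly when the subject was continued. With the unfolded header the search always succeeds, which is why the second recipe never fires.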
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail