| I give heavy weight to Subjects starting with (un)?s*bscribe, with
| also pretty heavy weight to Subjects containing either of those words.
| I then give heavy weight to the body of messages starting with those
| words, and a lighter weight to lines starting with them. Then
| multiple occurrences get some weight too, up to a point. Then I count
| the words in the message against all that.
How about looking for sub & unsub, as well as the perennial
misspelling 'unsuscribe me'? I also find filtering on add, leave and
help to be useful. Any of these may well be the only word on the
line. I think it has to do with broken list management packages.
Were I in an impolite mood, I might suggest taking off points
for users of AOL sending the message.
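A quick way to see which Subject lines such loosened patterns would catch is to try them outside procmail. The sketch below is only an approximation: Python's re module has no \> anchor, so \b stands in for it, and the function name looks_like_request is made up for illustration.

```python
import re

# Assumption: \b approximates procmail's \> (end of word), which
# Python's re module does not provide. procmail matches without
# regard to case by default, hence re.IGNORECASE.
LOOSE = re.compile(r"^Subject: +(un)?sub?(scribe)?\b", re.IGNORECASE)
COMMANDS = re.compile(r"^Subject: +(add|leave|help)$", re.IGNORECASE)


def looks_like_request(header_line):
    """Return True if a Subject line resembles a list-admin request."""
    return bool(LOOSE.search(header_line) or COMMANDS.search(header_line))


if __name__ == "__main__":
    samples = ["unsuscribe me", "UNSUB", "sub", "help",
               "Re: help with recipes", "submarine sandwich"]
    for subject in samples:
        print(subject, "->", looks_like_request("Subject: " + subject))
```

The last two samples are there as negative cases: a reply that merely mentions help, and a word that only begins with sub.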
| * 1^0
| * 30^0 H ?? ^Subject: +(un)?subscribe\>
* 20^0 H ?? ^Subject: +(un)?sub?(scribe)?\>
(The 'b' is often missing, as is the word fragment 'scribe'.)
| * 20^0 H ?? ^Subject:.*\<(un)?subscribe\>
* 20^0 H ?? ^Subject: +(add|leave|help)$
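* 15^0 H ?? ^Subject: +(add|leave|help)
(Fewer points if more words follow. The note is kept off the
condition line because procmail passes everything after the * to
its regexp matcher, comment included.)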
| * 20^0 ^^([ ]|$)*(un)?subscribe\>
| * 10^0 ^([ ])*(un)?subscribe\>
| * 8^.4 \\<(un)?subscribe\>
| * -.4^1 \\<[A-Za-z]+\>
"It is seldom that liberty of any kind is lost all at once."