On 2005-04-28 00:07:34, Peter Jones wrote:
When I first saw this question I was interested to see what responses came
up. Michelle's initial reply confirmed my own thoughts: that I was going
to have to write a little external code to handle it. I promptly
wandered off to do exactly that.
Second, my /usr/local/bin/testsubject, which is a Perl script:
Unfortunately, I know nothing about Perl (or Python).
Essentially this reads the '/path/to/blacklist' file, and compares each
line in it with the email subject line. Note that each line will be
treated as a regular expression, and may therefore contain any Perl
regular expression syntax required. The line which performs the comparison
(indicated by "<- note") places a "\b" either side of the line from the
blacklist file so that it will only match on whole words; it seemed a
little too dangerous to match on partials... It performs a
This is what I do too.
It may be a little slower than your Perl script.
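For illustration, the whole-word test described above might look roughly like this in plain shell. It is only a sketch, not Peter's Perl script: the function name `subject_is_blacklisted`, the case-insensitive matching, and the reliance on GNU grep's `\b` word boundary are all assumptions.

```shell
#!/bin/sh
# Hypothetical helper: succeed if any line of the blacklist file, treated as
# an extended regular expression, matches the subject as a whole word.
# Assumes GNU grep, which understands \b as a word boundary.
subject_is_blacklisted() {
    subject=$1
    blacklist=$2
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r pattern || [ -n "$pattern" ]; do
        # Skip empty lines; an empty pattern would match everything.
        [ -n "$pattern" ] || continue
        # Wrap the pattern in \b(...)\b so partial words do not match.
        # -i (ignore case) is an assumption, not taken from the original.
        if printf '%s\n' "$subject" | grep -Eiq -- "\\b(${pattern})\\b"; then
            return 0
        fi
    done < "$blacklist"
    return 1
}
```

It could be called as `subject_is_blacklisted "$SUBJECT" /path/to/blacklist`, using the exit status to decide what to do with the message. It starts one grep per blacklist line, so it will likely get slow as the list grows.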
It is precisely because of some of the standard "mis-spellings" that I
decided to enable full regular expression syntax in my blacklist file.
Regular expressions are supported in my script too...
Of course, this only gets called by procmail (via an INCLUDERC in my
main .procmailrc file) after all other filtering for known addresses, and
examination by spamassassin, has been performed. Also, I don't know how
well it would scale if "blacklist" grew too large...
Oh, if you extract only the e-mail address, a simple
grep "$EMAIL" "$LIST_FILE"
will be faster than anything else.
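A slightly stricter form of that lookup, as a sketch. The function name `address_is_listed` is hypothetical, and the sketch assumes the list file holds exactly one address per line.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the address is a complete line in the list.
#   -F  treat the address as a fixed string, so "." is not a regex wildcard
#   -x  match whole lines only, so a shorter address cannot match a longer one
#   -i  ignore case
#   -q  print nothing, only set the exit status
address_is_listed() {
    grep -Fxiq -- "$1" "$2"
}
```

For example, `address_is_listed "$EMAIL" "$LIST_FILE" && echo listed`. Without `-F` and `-x`, the dots in an address act as wildcards and a shorter address also matches any longer one that contains it.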
Because I code in PHP and PostgreSQL, I am trying
to use it on a list of more than 3600 e-mail addresses...
The same goes for URLs in the body or snippets from the subject,
which are more difficult because you need to compare
them line by line.
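One possible way to do the line-by-line comparison in a single pass is to hand grep the whole snippet list with `-f`. This is only a sketch under assumptions: the function name `text_contains_snippet` is hypothetical, and the list is taken to hold one literal snippet or URL per line.

```shell
#!/bin/sh
# Hypothetical helper: read a message (or just its subject) on stdin and
# succeed if any line of it contains one of the snippets in the list file.
#   -F       snippets are fixed strings, so URLs need no escaping
#   -f FILE  take one snippet per line from FILE
#   -i, -q   ignore case, print nothing
# Caveat: an empty line in the snippet file matches every input line.
text_contains_snippet() {
    grep -Fiq -f "$1"
}
```

Since procmail passes the message on stdin, a filter script could call `text_contains_snippet /path/to/snippets` directly and act on the exit status.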
Linux-User #280138 with the Linux Counter, http://counter.li.org/
Michelle Konzack Apt. 917 ICQ #328449886
50, rue de Soultz MSM LinuxMichi
0033/3/88452356 67100 Strasbourg/France IRC #Debian (irc.icq.com)
procmail mailing list Procmail homepage: http://www.procmail.org/