
Re: Recipe failing....Why?

2008-07-07 11:43:10
Thanks Sean,

I've made some clarifications within the text below. With the exception of what I am trying to accomplish with the To: header, you were right on.

Scoring is something that I have not tried just yet but this old dog likes to learn new tricks and I will give your suggestion a go.

Thanks for taking the time to respond...

Jim

At 09:04 AM 7/7/2008 -0700, Professional Software Engineering wrote:

At 07:09 2008-07-07 -0700, Jim Seavey wrote:
I want to filter this stuff to /dev/null and have been working on the following recipe to do that, alas, to no avail.

It REALLY helps if you write out in English what it is you expect the conditions to accomplish, rather than dumping a recipe and saying "this doesn't work". Also, you should enable VERBOSE logging so you can see what Procmail thinks.

# Kill specific email
:0
* ^From:[       ]*(intl\.paypal\.com|\
                         chaseonline\.chase\.com)|\

That's a CONTINUATION you have at the end of that line, so the NEXT line isn't a separate condition, it is parsed as part of this. Lose the trailing |\

Mea culpa!

Right you are. What was I thinking?

Well, obviously I was not thinking. This alone solves 97% of the problem. What's that they say about not being able to see the trees for the forest?
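
For reference, a minimal sketch of the first condition with the trailing |\ removed, so that the condition ends at the closing parenthesis and the next line starts a fresh condition. The [ 	] bracket is meant to hold one space and one tab, and /dev/null is simply the destination from the original recipe.

```procmail
# Kill specific email (From: condition only, no dangling continuation)
:0
* ^From:[ 	]*(intl\.paypal\.com|\
              chaseonline\.chase\.com)
/dev/null
```

As written, the pattern only matches when the domain follows directly after "From:" and optional whitespace. If the header actually looks like "From: Name <user@intl.paypal.com>" (an assumption, the thread does not show a sample), something like ^From:.*(intl\.paypal\.com|chaseonline\.chase\.com) would be needed instead.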

* ^Subject:[    ]*(\<penis?\>|\
                           \<porn?\>|\
                           \<viagra?\>|\
                           \<watches?\>|\
                           \<xxx?\>|\
                          \<jwseavey?\>|\
                           \<verna?\>|\
                           \<joanreitz?\>|\
                           \<genereitz?\>)|\

Again, a continuation. Since each and every one of the keywords you're matching for is encapsulated with the same delimiters, you could simplify as well.

* ^Subject:[    ]*\<(penis|porn|viagra|watches|xxx|\
        jwseavey|verna|joanreitz|genereitz)\>

You also want to reconsider the ? you have trailing each of those keywords - that makes the immediately preceding token OPTIONAL (zero or one). So, in effect, anything with por (for starters) would match. Since this is anchored to the start of the subject (after whitespace only), and with word breaks on either side of the keyword, the risks of false positives aren't as great, but surely this wasn't the plan?

Right you are about the ? mark. They should not be there.
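
To make the effect of the trailing ? concrete, here is a small sketch contrasting the two forms. The mailbox name spew.mbx is borrowed from the rewrite later in this thread; everything else is taken from the original condition.

```procmail
# "porn?" is "por" plus an optional "n", so the words "por" and "porn" both match
:0:
* ^Subject:[ 	]*\<porn?\>
spew.mbx

# Without the "?", only the word "porn" matches
:0:
* ^Subject:[ 	]*\<porn\>
spew.mbx
```

If the ? was meant to make a plural optional, the optional part has to be grouped, for example watch(es)? rather than watches?, which only makes the final "s" optional.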

As for my wanting to filter these specific words, that is exactly what I want to do. I don't know anyone who would be communicating with me who would use these words in a Subject line; even if his is bigger than mine. :-) For some reason Spambouncer is letting stuff with this in the subject line through.


* ^To:[         ]*To:

Like, HUH? Please parse that one out to an English description of what you're trying to match.

Ya, it makes no sense but it is spam. Our domain is receiving lots of email that includes To: as the first thing in the body of the To: header line. BUT, it could be that they are doing it intentionally. I just want to be shed of it when it arrives and this should do the trick.

Discounting the continuations which don't belong, and the weird To:, you do realize that ALL of these conditions would need to be met in order for a message to be matched? I suspect you're looking to flag based on ANY of them.

Yes I do want each line to be judged separately.
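
One way to have each line judged separately, without scoring, is to give every condition its own recipe. This is only a sketch: the keyword list is shortened for brevity, and spew.mbx is the mailbox name from the rewrite further down.

```procmail
# Each recipe is tested on its own; a message matching any one of them is filed
:0:
* ^From:[ 	]*(intl\.paypal\.com|chaseonline\.chase\.com)
spew.mbx

:0:
* ^Subject:[ 	]*\<(penis|porn|viagra|watches|xxx)\>
spew.mbx
```

Since these are delivering recipes without the c flag, procmail stops processing a message at the first recipe that delivers it, so the later recipes are only consulted when the earlier ones did not match.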


/dev/null

Might I suggest that you START with flagging something in the log, rather than sending things to the ether, because when something goes wrong, you won't have any email.

This is spam. If I never see it again I couldn't care less. But, having said that, I archive the most recent 1000 emails for each user so that nothing is lost even when I screw up a script. :-)
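
A minimal sketch of the "flag it in the log first" approach: the recipe below only writes a line to the log when the condition matches and then lets the message continue on to normal delivery. The log path and the log text are hypothetical, and the keyword list is shortened.

```procmail
LOGFILE=$HOME/procmail.log   # hypothetical path, pick your own
VERBOSE=yes                  # show how procmail evaluates each condition

:0
* ^Subject:[ 	]*\<(viagra|watches)\>
{
  LOG="would discard: Subject keyword match
"
}
```

The closing quote sits on its own line so that the log entry ends with a newline. Once the log shows only the messages you expect, the block can be swapped for a real destination.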


So, here's a rewrite based on what I *THINK* you're trying to accomplish. I haven't run this - it's just off the cuff here in this reply:

:0:
* 1^0 ^From:[   ]*(intl\.paypal\.com|\
                         chaseonline\.chase\.com)
* 1^0 ^Subject:[        ]*\<(penis|porn|viagra|watches|xxx|\
        jwseavey|verna|joanreitz|genereitz)\>
* 1^0 ^To:[     ]*$
* 1^0 ! ^To:
spew.mbx

Cool. I did not realize that I could encompass all of the words as you have done.

Thanks again for taking the time to respond...

Jim

This uses SCORING - the 1^0 syntax (see 'man procmailsc') - so that the conditions are effectively ORed together. Because the value used for scoring is only nominal, each condition will still be evaluated, so a VERBOSE log will show you how many of them match something. It eliminates the bogus continuations, simplifies the Subject keyword matching, and expands the To: condition you had to try to cover both possible conditions you may have been trying to match - an EMPTY To:, and an ABSENT To:. Finally, the messages which match are deposited into a spew mailbox - handy in the event that you discover there is a major flaw in the logic and you've false-pozzied on a job offer...
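
As a rough illustration of how the scoring adds up, here is a shortened two-condition sketch. The second recipe is a variant that is not part of the rewrite above: as I read 'man procmailsc', a condition that pushes the score to the maximum of 2147483647 makes procmail skip the remaining conditions, so treat that behaviour as an assumption to check against your own VERBOSE log.

```procmail
# Each matching condition adds 1; the recipe fires if the final score is above 0.
# With 1^0, every condition is still evaluated and shows up in a VERBOSE log.
:0:
* 1^0 ^From:[ 	]*(intl\.paypal\.com|chaseonline\.chase\.com)
* 1^0 ^Subject:[ 	]*\<(penis|porn|viagra|watches|xxx)\>
spew.mbx

# Variant (assumption, see above): the maximum weight ends evaluation at the first match
:0:
* 2147483647^0 ^From:[ 	]*(intl\.paypal\.com|chaseonline\.chase\.com)
* 2147483647^0 ^Subject:[ 	]*\<(penis|porn|viagra|watches|xxx)\>
spew.mbx
```

The nominal 1^0 form is the friendlier one while debugging, since the log shows every condition that matched rather than only the first.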

---
 Sean B. Straw / Professional Software Engineering

Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html> Please DO NOT carbon me on list replies. I'll get my copy from the list.

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail