procmail

Re: Use scoring to determine header format?

2004-05-18 10:16:52
At 11:15 2004-05-18 -0400, fleet(_at_)teachout(_dot_)org wrote:
Again, this works for the first three Received: and Message-Id: lines, but
it continues to catch those with more than one Received: line *after* the
Message-Id.

Is this really a problem? I think if you've got legit messages with Received: lines both before AND after the Message-ID, the number after isn't going to be consistent enough to weigh heavily as spam or not.

  Am I misunderstanding (.*$)?  I read it as "any number of any
character followed by a newline (or EOL)."

The trailing ? on that expression means ZERO or ONE (i.e. "the preceding expression is optional"). For headers, this would be a bogus construct: every header has zero or more characters following it '.*' (zero if it's a blank header line, like an empty subject) AND will certainly be terminated with a newline '$', so ZERO isn't a valid goal here. ONE is a regular header; ONE or MORE '+' is the content of this header plus some number of optional, additional headers that may be between this one and the next one we happen to care about.
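
A rough way to see the difference between the two quantifiers, sketched in Python rather than procmail. Python's re module is only a stand-in here: "\n" plays the role of procmail's '$' (which matches the newline itself), re.I mimics procmail's default case-insensitivity, and the sample headers, pattern names and the hits() helper are all made up for illustration. These are not the patterns from the original recipe.

```python
import re

# Hypothetical header block: Message-Id: directly follows Received:.
ADJACENT = (
    "Received: from mx.example.org\n"
    "Message-Id: <abc@example.org>\n"
)

# Hypothetical header block: two other headers sit in between.
SPREAD = (
    "Received: from mx.example.org\n"
    "Date: Tue, 18 May 2004 11:15:00 -0400\n"
    "From: someone@example.org\n"
    "Message-Id: <abc@example.org>\n"
)

# "(.*\n)" stands in for procmail's "(.*$)": the rest of a line plus its newline.
# With "?" the group may be used zero times or once, so it can only ever
# consume the remainder of the Received: line itself.
OPTIONAL = re.compile(r"^Received:(.*\n)?Message-Id:", re.I | re.M)

# With "+" the first repetition consumes the remainder of the Received: line,
# and further repetitions swallow any headers in between.
ONE_OR_MORE = re.compile(r"^Received:(.*\n)+Message-Id:", re.I | re.M)


def hits(pattern, headers):
    """Return True if the pattern matches anywhere in the header block."""
    return pattern.search(headers) is not None
```

Under these assumptions, hits(OPTIONAL, ADJACENT) is true but hits(OPTIONAL, SPREAD) is false, while ONE_OR_MORE matches both blocks.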

I'd really recommend you pick up a good text on regexps (keeping in mind that not all implementations and extensions are universally compatible, but about 90% of regexp is quite standard across apps which use regexp).

I see "patterns."  Don't know if it's just the way I am or my cryptologic
training (or both); but I see patterns.

I see dead people.  Shhhhhh.

The "guy" I'm after is this fellow that always has the RCVD RCVD RCVD
MSGID RCVD pattern and a one-word Subject.

Surely there are other characteristics. The one-word subject itself could be a beneficial test:

:0
* ^Subject:[    ]*[a-z]+[       ]*$

Note no symbols or numerics, so Re:/Fwd: prefixes would skip this, and since we're looking for one or more characters, a BLANK subject would likewise skip this. I specify the optional trailing blank and anchor to the EOL because we need to be sure there is nothing else following the single word.
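
To check which subjects a condition like that would catch, here is a small sketch in Python. It is an approximation, not procmail itself: the bracket expressions are written as [ \t] (space and tab), re.I stands in for procmail's default case-insensitive matching, and the function name and sample subjects are invented for the example.

```python
import re

# Python approximation of the condition: Subject:, optional blanks,
# one run of letters only, optional trailing blanks, end of line.
ONE_WORD_SUBJECT = re.compile(r"^Subject:[ \t]*[a-z]+[ \t]*$", re.I | re.M)


def is_one_word_subject(header_line):
    """Return True if the header line is a Subject: holding a single alphabetic word."""
    return ONE_WORD_SUBJECT.search(header_line) is not None


# Hypothetical sample subjects and what the test reports for each.
SAMPLES = [
    "Subject: hello",        # one word: caught
    "Subject: hello   ",     # trailing blanks are allowed: caught
    "Subject: Re: hello",    # the colon breaks the letters-only run: skipped
    "Subject: hello world",  # a second word follows: skipped
    "Subject: deal4u",       # a digit breaks the letters-only run: skipped
    "Subject:",              # blank subject, no letters at all: skipped
]

RESULTS = [is_one_word_subject(line) for line in SAMPLES]
```

Only the first two samples come back true, which is the behaviour described above: prefixes, digits, extra words and blank subjects all fall through.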

---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.


_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail