Re: about procmail regex operators

2010-03-28 17:35:47
Bart Schaefer <barton(_dot_)schaefer(_at_)gmail(_dot_)com> writes:

On Sun, Mar 28, 2010 at 8:26 AM, Harry Putnam <reader(_at_)newsguy(_dot_)com> wrote:
LuKreme <kremels(_at_)kreme(_dot_)com> writes:

One more question:  Would that rule:
  *   1^1 [a-zA-Z0-9]

Give this subject line:

  Subject:[      ]a a

A score of 1 or 2?

It'd score it 2.  I presume, though, that you meant the above to be a
regular expression rather than a subject that literally has a pair of
square brackets in it.  Here's how you'd apply that:

Oh yikes, I went a little crossways there and got part of both
worlds. ... Yeah, it was supposed to be a subject line from a message:

   Subject: a a

:0
* ^Subject:[      ]\/.*
* MATCH ?? 1^1 [a-zA-Z0-9]
{ LOG="$MATCH contains $= alphanumeric characters
" }

What I was trying to get at is that the scoring technique doesn't
really find 15 alphanumerics in a row; it counts every alphanumeric
in the subject, wherever it falls.

So unless the characters are in a row, it's not a good trap.
I guess I'm either not catching on to the scoring or else it isn't
that good for these messages.

I'm reluctant to post any of the actual message headers because they
are truly nasty.  But what I was looking for is something that would
look for:

Subject: kdkfkdalkk8r98u9weklsd oislkalkfo8ewr9qw8lskdflk sa

The subject lines are mostly even more continuous than the one above
but not all of them.

So like I mentioned a recipe like this:

* ^Subject:[        ][^ ]+$

Would probably catch 90 percent.  But I still haven't thought of a way to
nail the remainder.  It looks like, if my feeble understanding of scoring
is right, that method wouldn't do it either.

I'm thinking your first response:

RE1="[a-zA-Z0-9]"
RE2="$RE1$RE1"
RE4="$RE2$RE2"
RE8="$RE4$RE4"
RE16="$RE8$RE8"

etc, might be the better way.

As far as I've seen, they all have at least something like 15-18
alphanumerics in a row even if there is a space somewhere else.
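
A minimal sketch of how those variables might go into a complete recipe,
assuming a threshold of 15 in a row.  RE15 and the log text are names I
made up for illustration; the "$" flag on the condition line is what makes
procmail expand the variable before using the line as a regex.

```
RE1="[a-zA-Z0-9]"
RE2="$RE1$RE1"
RE4="$RE2$RE2"
RE8="$RE4$RE4"
# 8 + 4 + 2 + 1 = 15 copies of the class (RE15 is a hypothetical name)
RE15="$RE8$RE4$RE2$RE1"

# The "$" after "*" asks procmail to expand $RE15 inside the condition
:0
* $ ^Subject:.*$RE15
{ LOG="Subject has 15 or more alphanumerics in a row
" }
```

Because the expanded regex has to match 15 adjacent characters, a subject
made of short words separated by spaces would not trigger it, however many
alphanumerics it has in total.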

What do you think?   Is scoring a better way?

I'm about to step into the sandbox with a few tests.

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)de
http://mailman.rwth-aachen.de/mailman/listinfo/procmail