procmail

Re: Recipe matching

2001-12-18 14:33:55
Paul Chvostek <paul(_at_)it(_dot_)ca> says...

On Tue, Dec 18, 2001 at 12:44:17PM +0100, Dallman Ross wrote:

:0
* ^Received:[.ch>$]      ?

Um, no...  As I was forcibly taught recently, square brackets indicate
a *range*, so the above condition would match *every* email that came
in with a Received line in the headers.  :)

That isn't exactly what you were forcibly taught. :-)  The

No, honest, it IS what I was taught!  The fundamental difference between
a range and an atom.  Jack doesn't want to match "any of a set of
characters including any character, c, h, > and EOL", so he's likely
thinking of (\.ch>) rather than [.ch>$].

$ in there is not part of the range of, e.g., from dot to EOL.
Rather, the $ is a literal dollar sign inside the brackets.
It loses its meta qualities.
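
To make the class-versus-group distinction concrete, here is a small
sketch that uses Python's re module as a stand-in for procmail's
egrep-style matcher.  That substitution is an assumption: the two
engines differ in details, but both read [...] as "one character from
this set", and both treat $ inside brackets as a plain dollar sign.
The header strings are invented for illustration.

```python
import re

# Hypothetical header line, for illustration only.
header = "Received: from mail.example.ch> by somewhere"

# [.ch>$] matches exactly ONE character: a dot, c, h, > or a literal $.
char_class = re.compile(r"^Received:[.ch>$]", re.I)

# (\.ch>) matches the four-character sequence ".ch>" as a unit.
group = re.compile(r"^Received:.*(\.ch>)", re.I)

# The character right after "Received:" is a space, so the class fails.
class_hit = char_class.search(header) is not None

# The group finds the literal sequence ".ch>" later in the line.
group_hit = group.search(header) is not None

# Inside brackets, $ is an ordinary character, not end-of-line.
dollar_literal = char_class.search("Received:$") is not None
dollar_as_eol = char_class.search("Received:") is not None

print(class_hit, group_hit, dollar_literal, dollar_as_eol)
```

With these inputs the class only hits when one of its five characters
sits directly after the colon, which on a real Received line is
normally a space.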

  * ^Received: [a-z0-9]\.(ch|pl)[         <(]

Probably closer would be:

    * ^Received:.*[a-z0-9]\.(ch|pl)([       >)]|$)

or something.

Um...  Ya, I missed the '.*' after 'Received:'.

I dunno about your mail logs, but on every Received line I've looked at,
the character following the fqdn of a relay is invariably an *opening*
bracket of some sort; the closing brackets come at the end of the IP
address, not the hostname.  For example:

Received: from mail.epost.de (mail.epost.de [64.39.38.72] (may be forged))
        by haggis.it.ca (8.11.6/8.11.6) with ESMTP id fBIBh4663053
        for <paul(_at_)it(_dot_)ca>; Tue, 18 Dec 2001 06:43:10 -0500 (EST)
        (envelope-from dman(_at_)nomotek(_dot_)com)

Given that the first character after the hostname *always* seems to be a
space, that part of BOTH our conditions will always match because we
don't look beyond the first character.  Searching the headers of my
400MB+ of archived mail turned up *no* matches for the regexp
'^(Received:|   ).*\.[a-z][a-z][<(]' ... but a search for
'^(Received:|   ).*\.[a-z][a-z][>)]' only turned up lines like:

Received: from cpu1751.adsl.bellglobal.com ([206.47.27.232] helo=giles.striker.ottawa.on.ca)
Received: from ns.nexus.net.mx ([200.23.227.5] helo=ns.tab.net.mx)
Received: from unknown (HELO mail.nexus.net.mx) (200.23.227.31)
Received: from cybersparc-02.cybertron.at ([212.236.212.11] helo=mgate1.cybertron.at)
Received: from [194.51.163.253] (helo=bow.intnet.bj)

none of which are really the kind of thing we're looking for.  So I'd
make it:

      * ^Received: .*[a-z0-9]\.(ch|pl) ()

or maybe even

      * ^Received: .*\([a-z0-9\.]+\.(ch|pl)[ ]
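
A quick way to sanity-check that last condition is to run it over a
few header lines.  The sketch below does so with Python's re module,
which is only an approximation of procmail's own matcher (re.I stands
in for procmail's default case-insensitivity).  The first test line is
the opening line of the Received header quoted earlier; the .ch host
and its address are made up.

```python
import re

# The condition from the message, minus procmail's leading "* ".
cond = re.compile(r"^Received: .*\([a-z0-9\.]+\.(ch|pl)[ ]", re.I)

# First line of the Received header quoted earlier (a .de relay).
quoted = "Received: from mail.epost.de (mail.epost.de [64.39.38.72] (may be forged))"

# Hypothetical .ch relay with the same layout; host and IP are invented.
swiss = "Received: from mx.example.ch (mx.example.ch [192.0.2.10])"

# The same hypothetical line without the space after the colon.
no_space = "Received:from mx.example.ch (mx.example.ch [192.0.2.10])"

quoted_hit = cond.search(quoted) is not None      # .de is not in (ch|pl)
swiss_hit = cond.search(swiss) is not None        # "(mx.example.ch " fits
no_space_hit = cond.search(no_space) is not None  # literal space is required

print(quoted_hit, swiss_hit, no_space_hit)
```

The third case shows that the space written after "Received:" in the
condition is significant: a header without it would not match.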

Is there supposed to be a space after the : and before the . ?

These are the lines I've added to my .procmailrc after deleting the ones I
had in there previously.


:0
* ^Received: .*\([a-z0-9\.]+\.(ae|ar|at|au|br|ca|ch|cn|cz)[ ]
/u/ja/jac/mail/junk

:0
* ^Received: .*\([a-z0-9\.]+\.(de|in|ir|jp|kr|mgc|nl|pg|pl)[ ]
/u/ja/jac/mail/junk

:0
* ^Received: .*\([a-z0-9\.]+\.(ru|se|tw|ua|uk)[ ]
/u/ja/jac/mail/junk
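
The three recipes differ only in their list of TLDs, so their combined
effect can be sketched as one alternation.  The Python below is an
illustration under assumptions, not a description of how procmail
evaluates recipes: the helper names build_condition and would_junk are
hypothetical, re.M is used so that ^ matches at the start of each
header line, and the TLD list is copied from the recipes exactly as
written.

```python
import re

# TLDs copied from the three recipes, in the same order.
TLDS = [
    "ae", "ar", "at", "au", "br", "ca", "ch", "cn", "cz",
    "de", "in", "ir", "jp", "kr", "mgc", "nl", "pg", "pl",
    "ru", "se", "tw", "ua", "uk",
]

def build_condition(tlds):
    """Join the TLDs into one alternation inside the shared pattern."""
    alternation = "|".join(re.escape(t) for t in tlds)
    return re.compile(
        r"^Received: .*\([a-z0-9\.]+\.(" + alternation + r")[ ]",
        re.I | re.M,
    )

def would_junk(headers, cond):
    """True if any Received line in the header block matches."""
    return cond.search(headers) is not None

cond = build_condition(TLDS)

# Header blocks for illustration; the .com relay and addresses are invented.
from_de = (
    "From: someone@example.org\n"
    "Received: from mail.epost.de (mail.epost.de [64.39.38.72] (may be forged))\n"
    "Subject: hi"
)
from_com = (
    "Received: from relay.example.com (relay.example.com [192.0.2.1])\n"
    "Subject: hi"
)

print(would_junk(from_de, cond), would_junk(from_com, cond))
```

Keeping the list in one place makes it easier to add or drop a country
later, at the cost of a longer single condition.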



Thanks to -everybody- who's tried to help me with this.  It's more
complicated than I ever imagined, for -seemingly- a simple objective.

And yes, I'm willing to blacklist (at least for now) a lot of the
countries from which this unwanted spam originates.

                                Jack

-- 
aka Keet        Visit my web page at http://junior.apk.net/~jac/
    * If you post a followup, -DO NOT- email me a copy of it! *
    Fun photo contest: http://home.dal.net/jam/kimva_photo.html
"We were trying to compete with The Beach Boys and Pet Sounds" - Guess!

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
