procmail
Re: Local domain forgery detection?

2002-08-28 14:20:33

On Wed, Aug 28, 2002 at 09:52:19PM +0200, Dallman Ross wrote:

The ^ and $ imply line start or end (they are interchangeable 
in procmail, but we tend to use them linearly).  Actually, 
they each mean the literal newline char.  "^^" means the
leftmost edge or rightmost edge of the field being examined.
If I've misstated something, I look forward to correction.

Well, the traditional use of ^ and $ is to match the null (empty) string
at the start or end of a line, respectively.  As such, they do not
actually match any character, not even a NUL.  Your explanation appears
to imply that they behave like \< and \>, which is not the case.
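
To illustrate the point about conventional anchors, here is a minimal
shell sketch using POSIX sed rather than procmail (mark_edges is a
made-up helper name).  Substituting on ^ and $ inserts text at the line
edges without replacing any character, which is what matching the empty
string means.  It says nothing about how procmail's own matcher treats
the same atoms.

```shell
#!/bin/sh
# mark_edges: read lines on stdin and wrap each one in brackets.
# "^" and "$" match the empty string at the line edges, so the
# substitutions only insert text; no character of the line is consumed.
mark_edges() {
    sed -e 's/^/[/' -e 's/$/]/'
}
```

For example, printf 'abc\n' | mark_edges keeps all three letters and
only adds the brackets around them.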

Whenever I do procmail things, I try to make sure my regexps are as
close to "real" regexps as possible.  Since ^^ is procmail-specific,
I have never used it.  If there's a significant performance issue
with using the standard regexp atom, I'll reconsider....

 MYDOMAIN=| hostname | sed "s/`hostname -s`\.//"

I just have a real aversion to piping to two processes on every mail,
for something we can reasonably expect to get within procmail.

Yes, that's why I only run this if the MYDOMAIN variable is not already
set in procmailrc.  In the event that someone doesn't set up my public
recipes properly, they'll still run, albeit slower.
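
As a rough sketch of the "only compute it when it is missing" idea in
plain shell (not procmail syntax), the domain part can be derived with
parameter expansion instead of a second process.  The helper name
domain_of is hypothetical, and the sketch assumes the short host name is
the first dot-separated label of the fully qualified name.

```shell
#!/bin/sh
# domain_of FQDN: print the name with its first label removed,
# e.g. mail.example.com -> example.com.  A name without a dot is
# printed unchanged, which is also what the sed expression would do.
domain_of() {
    case $1 in
        *.*) printf '%s\n' "${1#*.}" ;;
        *)   printf '%s\n' "$1" ;;
    esac
}
```

In a script the guard could then read
: "${MYDOMAIN:=$(domain_of "$(hostname)")}"
which leaves an already-set MYDOMAIN alone and only runs hostname when
the variable is empty or unset.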

Surely
the host name, or [127.0.0.1], or "localhost" is stated in the
top Received: header?  Even if you do want to run hostname, you
could use MATCH to kill the TLD stuff and avoid sed.  

How would you structure a recipe to grab only the first occurrence
of the Received line, without using an external shell to strip it?
I mean, you could easily use

 FIRSTRCVD=| grep '^Received: ' | head -1

and then strip the variable with MATCH, but how else can that be
done?  Would scoring allow you to MATCH just the first?  Like:

 :0
 * 1^0 ^Received:[$WS]*\/.+
 { FIRSTRCVD = $MATCH }

?  The man page indicates that "any subsequent matches are ignored",
but are they also exempted from MATCHing?
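
On the shell side at least, the grep | head pair can be collapsed into
one process.  The sketch below (first_received is a made-up name) uses
awk to print the first line starting with "Received: " and stop reading.
Like the grep version, it looks at physical lines only, so folded
continuation lines of that header are not included.  It does not answer
whether procmail's MATCH behaves the same way.

```shell
#!/bin/sh
# first_received: read a message on stdin and print only the first
# line that begins with "Received: ", then exit immediately.
first_received() {
    awk '/^Received: / { print; exit }'
}
```

Fed a message with several Received: lines, this prints only the
topmost one, which is the hop added most recently.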

If you really want this variable every time, how about feeding it
to procmail via an INCLUDERC?

I do.  As I said, it only gets assigned from shell output in cases
where someone *hasn't* set it in their procmailrc.

(Lots of legit mail has Message-ID's that
violate RFCs, including Microsoft Exchange's format, I believe.

So far, aside from spam, my conservative message-id validity checks have
only caught messages from OpenSRS' trouble ticketing system, for which
this issue is a known bug.  If Microsoft Exchange breaks RFC, then of
the 30000 messages per day which I process, none are from Exchange.  Is
that good news, or what?  ;-)

of what I call "indicia" (word of art taken from Supreme Court dicta
discussing the 13th Amendment).

Actually, it's a bit more common than that.  http://www.it.ca/bin/dict?indicia

a spammy calculus.  I'll admit, though, that a forged hotmail
or yahoo address is a dead ringer.  :)

Yeah, those are by far my most frequently matched recipes.  :-)

-- 
  Paul Chvostek                                    <paul(_at_)it(_dot_)ca>
  Operations / Abuse / Whatever                          +1 416 598-0000
  it.canada - hosting and development                  http://www.it.ca/

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail