procmail

Restricting length of MATCH?

1997-10-30 01:40:50
I don't know if I'm the only person in the world who seems to have
problems with core dumps due to excessively long MATCHes. Anyhow, I
just got bit again, by the following recipe:

    # If there's an Apparently-To: but none for me, it's spam
    :0
    * ^\/Apparently-To:.*
    * $ ! ^Apparently-To:[      ]*(<$ME>|$ME)[  ]*$
    {
        # ... log MATCH, among other things

(The comment is actually a bit misleading; this will fire whenever
there is an Apparently-To: to somebody else, even if there is also one
for me. The variable ME contains a regular expression which matches my
various logins on various hosts.)

So in comes a spam with a header something like this:

 Apparently-To: <georg(_at_)ii(_dot_)uib(_dot_)no>, <george(_at_)netreach(_dot_)net>, <georgi(_at_)belwue(_dot_)de>,
        <gjyoung(_at_)seminole(_dot_)gate(_dot_)net>, <gking(_at_)this(_dot_)aint(_dot_)my(_dot_)address(_dot_)com>,
        <gl(_dot_)thuer(_dot_)lgst(_at_)ipn-b(_dot_)comlink(_dot_)apc(_dot_)org>,
        <glen(_dot_)turner(_at_)itd(_dot_)adelaide(_dot_)edu(_dot_)au>, <glenn(_at_)squirrel-net(_dot_)demon(_dot_)co(_dot_)uk>,
        <glimmung(_at_)cris(_dot_)com>, <gmklass(_at_)rs6000(_dot_)cmp(_dot_)ilstu(_dot_)edu>,

... and so on, something like 60-80 lines of addresses. And bang, I
get a core dump in my MAILDIR. (The header above was actually grabbed
out of the core dump with strings(1) ... lots of other interesting
stuff inside that core image, such as something that looks a lot like
users at my host with home directories and preferred shells, but I
digress.) 

For most of my recipes that actually grab something into MATCH, I have
a "bomb shelter" which bails out early if there is any header line
which is longer than LINEBUF. (It bluntly classifies such a message as
spam, and has never been wrong yet.) I hadn't anticipated an
Apparently-To: with more than one line (and if somebody sends something
with a giant From: line or Message-Id:, I'm in trouble, too), but I'm
trying to find a way to avoid moving this particular bomb shelter
recipe up to the beginning of my spam.rc, for various reasons.
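
To make the bomb-shelter idea concrete outside procmail, here is a
minimal Python sketch of the check described above. It is not the
actual recipe; the function names are made up for illustration, and
the only figure taken from this post is the default LINEBUF of 2048.
It joins continuation lines onto their field first, which is the case
the multi-line Apparently-To: above falls into.

```python
LINEBUF = 2048  # procmail's default, as mentioned in this post


def unfold_headers(raw_header):
    """Join continuation lines (starting with space or tab) onto their field."""
    fields = []
    for line in raw_header.splitlines():
        if line[:1] in (" ", "\t") and fields:
            # Continuation of the previous header field.
            fields[-1] += " " + line.strip()
        else:
            fields.append(line)
    return fields


def oversized_fields(raw_header, limit=LINEBUF):
    """Return every unfolded header field longer than `limit` characters."""
    return [field for field in unfold_headers(raw_header) if len(field) > limit]
```

A message for which oversized_fields() returns anything at all would
be one to divert before any recipe tries to capture into MATCH.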

My first attempt was something like this:

* ^\/Apparently-To:(..........................................................\
        |.*)

where I expected MATCH to grab the first alternative if it matches
(i.e. the first 59 characters of anything with 59 or more), and to
try the second one only if it doesn't, but this didn't work out as
expected. I guess this was too practical to be true. What I ended up
with next is

* ^\/Apparently-To..?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?\
        .?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?.?

but this is +slightly+ too cumbersome to use in every single recipe
that contains a \/ -- which is practically all of them, because I want
to log what happens and on what grounds spam is being identified. 
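
The bounding idea itself is easy to generate rather than type. The
following Python sketch builds the run of optional dots and shows that
it caps what gets captured. It uses Python's re module, which is not
procmail's regexp engine, so it only illustrates the shape of the
pattern; the function names and the default of 59 are assumptions
made for the example.

```python
import re


def bounded_tail(max_chars):
    """Return a pattern fragment matching at most `max_chars` characters."""
    return ".?" * max_chars


def capped_match(header_text, prefix="Apparently-To:", max_chars=59):
    """Return the prefix plus at most `max_chars` following characters, or None."""
    pattern = "^" + re.escape(prefix) + bounded_tail(max_chars)
    found = re.search(pattern, header_text, re.MULTILINE)
    return found.group(0) if found else None
```

Calling capped_match() on a header with a very long Apparently-To:
line returns only the field name and the first 59 characters after
it; a shorter line comes back whole.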

Can anybody come up with something neater? As an aside, it would be
nice to know whether really nobody else runs into core dumps when
massively exceeding LINEBUF. (I believe my mail host is actually
running 3.11pre4, but I think I found the same problem in 3.11pre7 on
a completely different architecture.)

/* era */

I actually have LINEBUF set to the default, 2048. Seems to me that
bumping it up "just in case" is a form of cheating.

-- 
 Paparazzi of the Net: No matter what you do to protect your privacy,
  they'll hunt you down and spam you. <http://www.iki.fi/~era/spam/>
