procmail

Re: rule to catch a certain number of characters

2007-05-26 12:05:41
In an older episode (Saturday, 26. May 2007 14:03), Dallman Ross wrote:
wolfgang wrote on Saturday, May 26, 2007 12:26 PM:
Lately, I have received various similar mails with one line
containing about 100,000 non-space characters.

To match those, I have tried a B rule with something like
[^ ]{10000}
but apparently that is not the correct syntax. How should it be
done?

That "extended-egrep" syntax is not supported in procmail.
You will have to count.  You can do it in various ways.
One way might be to use scoring.  But see below for an
easy alternative.

There are problems with your algorithm otherwise, in any case.
The biggest one is, an encoded message such as MIME or uuencode
will also be caught by your condition.  Also, are you sure there
are no line-ends in these messages?

There are line ends, but not in that one >100K line.

They are spam mails, Content-Type: text/html;

They start with:
<title>
lineofmorethanonehundredthousandcharacters
</title>
and then the HTML spam message starts, containing links to URLs and 
images on the web plus DIV and FONT tags.

I think they are designed to provoke spamassassin/spamc timeouts, which
works on older machines with weak processors, as in my case.

I want to match that one overly long unbroken line without spaces. MIME
or uuencode would contain line breaks, wouldn't they? So I assumed
my "egrep style" algorithm wouldn't catch those.
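
For illustration only, here is a minimal Python sketch of the check described
above: does any single line contain a very long run of characters with no
whitespace in it? It is not a procmail recipe and not something proposed in
this thread. The function name has_long_unbroken_run and the default
threshold of 10000 are assumptions made for the example. Unlike [^ ], it
treats any whitespace (tabs included) as a break. Base64 lines are limited
to 76 characters and uuencode lines are typically around 60, so ordinary
encoded bodies should stay well under such a threshold.

```python
# Hypothetical threshold; the thread only says the line has "about 100,000" characters.
MIN_RUN = 10000


def has_long_unbroken_run(text, min_run=MIN_RUN):
    """Return True if any line contains min_run or more consecutive
    non-whitespace characters."""
    for line in text.splitlines():
        # split() with no argument cuts on any whitespace, so each token
        # is one unbroken run of non-whitespace characters.
        for token in line.split():
            if len(token) >= min_run:
                return True
    return False
```

A helper like this could be fed the message body by a small wrapper script
that reports the result through its exit status; how to hook that into a
given procmail setup is left open here.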


Do these messages have any spaces in them at all?  If not,
then I would look for them that way: "If there is a space,
this is not one of those messages."

They do have spaces within HTML tags, e.g. "<FONT face=...", but not in the
line in question.

If there are spaces otherwise, then you might simply be
reacting to multipart encoded messages.

Nope, see above.

So, how would I - not familiar with scoring so far - match that long 
line?

Thanks,

wolfgang

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail