
Re: ASCII 128-255

2005-05-12 15:07:11
On Thu, May 12, 2005 at 10:47:30PM +0200, Tomi Crnicki wrote:
Hello!

I suppose somebody has asked this before, but I couldn't find a helpful 
link.

Can I somehow use procmail to filter out messages, coming mostly 
from Russia and ex-Russian countries and the Far East, that have 
no charset tag or have, e.g., charset Win1251 (sometimes others 
as well) and are filled with characters (in the subject and/or 
the message) with ASCII codes 128-255? All such messages are spam 
for our users.

I can't simply filter out everything tagged Win1251, though, as 
some of those messages are not spam; these non-spam messages 
contain only the ASCII characters 32-127.

I believe it was only last week that we last addressed this issue.
A good place to look is the list archives (viewable from a link
a fair way down the page at www.procmail.org).  E.g., search for
"hi-bit"; "hibit"; "non-printing"; etc.  Here is part of what I
posted last week:

From: Dallman Ross
Sent: Tuesday, May 03, 2005 11:23 PM
To: procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
Subject: Re: exclude rules for percentage of high-bit characters


On Tue, May 03, 2005 at 03:58:42PM -0500, Christopher L. Barnard
wrote:

I would like to exclude email that is mostly unprintable
characters. [. . . ]

[. . . .]  I'm sure that someone has done this, I'm just not using
the right keyword in my search of the list archives.  Can someone
point me to how I would go about doing this?

This is one of various messages in the list archives about the
subject.  I did my search on "non-printing characters."  I
was aided in knowing what I was looking for.  (This was it.)
But there are other archived messages as well, including a
few from me about excluding German chars (for instance).  You
could search further with relative ease.  For example, try
"chars" instead of "characters."

http://www.xray.mpe.mpg.de/cgi-bin/w3glimpse2html/procmail/2001-09/msg00281.html?53#mfs

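Basically, this should do it (untested, however):

 SPACE = ' '
 # TAB must hold one literal tab character between the quotes
 # (it may appear here expanded to spaces)
 TAB = '        '

 # The leading $ makes procmail expand $TAB and $SPACE in the condition
 :0
 * $ BH ?? [^$TAB$SPACE-~]
 { HIBIT = TRUE }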
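
To check what that character class would catch before trusting it on
live mail, a rough stand-alone sketch in Python may help.  It is not
part of procmail and is untested against real spam; the names
NON_PRINTING, has_hibit and non_printing_ratio are made up for this
illustration.  It assumes the message is read as raw bytes and that,
as in procmail, a negated class never matches a newline.

```python
import re

# Mirrors the recipe's class [^$TAB$SPACE-~]: any byte that is not a tab and
# not printable ASCII (space through tilde). Newline is excluded as well,
# because procmail's negated classes do not match a newline.
NON_PRINTING = re.compile(rb"[^\t\n -~]")


def has_hibit(message: bytes) -> bool:
    """Return True if the raw message holds at least one such byte."""
    return NON_PRINTING.search(message) is not None


def non_printing_ratio(message: bytes) -> float:
    """Fraction of bytes matched by the class (0.0 for an empty message)."""
    if not message:
        return 0.0
    return len(NON_PRINTING.findall(message)) / len(message)


if __name__ == "__main__":
    plain = b"Subject: hello\n\nPlain ASCII body.\n"
    cyrillic = "Subject: test\n\nПривет\n".encode("cp1251")
    for name, msg in (("plain", plain), ("cyrillic", cyrillic)):
        print(name, has_hibit(msg), round(non_printing_ratio(msg), 2))
```

The ratio function is only a starting point for the "mostly
unprintable" case; any cutoff you pick would need tuning against
your own mail.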


-- 
dman

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
