procmail

Re: Need for assistance

2005-08-07 08:46:22
At 11:11 2005-08-07 +0300, Udi Mottelo wrote:
On Sat, 6 Aug 2005, Tony McClenny wrote:

I would appreciate your assistance in discovering the proper code to
eliminate an incoming message with foreign (to me) language characters in
the subject line.  I believe I should use a filter for the "Windows-1251"
data, but have thus far been unsuccessful in my attempts.  A sample of what
is incoming is shown below as is my unsuccessful code attempt.

The incoming header information:

Subject: .!! :)Èçìåðèòåëü àðòåðèàëüíîãî äàâëåíè^? Meditech
Date: Thu, 4 Aug 2005 12:42:01 +0000
MIME-Version: 1.0
X-Mailer: Microsoft Office Outlook, Build 11.0.5510
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106
Thread-Index: IJZXc5yLv8uV7AbvvLnbve4EHrG48BgkRF6B
Content-Type: text/plain;
       charset="Windows-1251"
Content-Transfer-Encoding: 8bit

   [ The following text is in the "Windows-1251" character set. ]
   [ Your display is set for the "ISO-8859-1" character set.  ]
   [ Some characters may be displayed incorrectly. ]

My procmailrc file code, which is failing:

:0 h
* .*\<Windows-1251\>
/dev/null

A few more things to add to Udi's notes:

There's no need to use .* at the beginning of a regexp, since unless you 
ANCHOR to the beginning of the line (using " ^ "), the regexp will match 
anywhere in the line.  In this particular case, since the material you're 
looking for is supposed to be in a content-type header, you really should 
just look for it there:

* ^Content-Type:.*\<"Windows-1251\>

The .* is legit here, because it skips over whatever other content sits in 
that header between the beginning (which is anchored here) and the text 
you're looking for.  Just looking for Windows-1251 anywhere in the headers 
would also catch it if it showed up in, say, a subject line (for instance, 
if YOU had posted to this list saying you wanted to discard such messages: 
"filtering out Windows-1251 charsets" or somesuch).
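
To make that concrete, here is a rough illustration (the Subject: line is 
made up for the example).  An unanchored condition such as:

* \<Windows-1251\>

would match a header line like:

Subject: filtering out Windows-1251 charsets

because the spaces on either side of the word satisfy \< and \>.  The 
anchored condition shown above cannot match that line, since the match has 
to start at a line beginning with Content-Type:.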


When TESTING recipes, don't deliver to /dev/null.  This is a great way to 
lose a bunch of legit email because of some simple goof, and unlike 
misfiling it, you can't reprocess it (unless you have a separate backup 
recipe running first).  File it to another mailbox, or simply emit a 
logfile entry indicating you'd have ditched that message.
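
As a minimal sketch of both options (the mailbox name "charset-test" and 
the log text are placeholders I made up, and this assumes you already have 
LOGFILE set somewhere above in your procmailrc), filing to a test mailbox 
could look like this:

:0:
* ^Content-Type:.*\<"Windows-1251\>
charset-test

The trailing colon on :0: asks for a local lockfile, which you want when 
appending to a mailbox file.  To only log what would have been ditched and 
let the message carry on through the rest of your rcfile:

:0
* ^Content-Type:.*\<"Windows-1251\>
{
  LOG="Would have discarded: Windows-1251 charset
"
}

The closing quote sits on its own line because LOG doesn't add a newline 
for you.  Once you're satisfied it only catches what you expect, swap the 
destination for /dev/null.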

Check out the sandbox (testing setup) which is located at the URL in my 
.sig.  While you're there, you might also want to check out the "furrin.rc" 
script, which identifies messages in foreign character sets - you can edit 
it to change what you categorize as foreign and unreadable to you 
(everything is in general geo-linguistic groupings).

---
  Sean B. Straw / Professional Software Engineering

  Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
  Please DO NOT carbon me on list replies.  I'll get my copy from the list.



____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
