Re: Question about how to handle odd character stuff

On 2005-03-30 08:01:24, Professional Software Engineering wrote:

Which doesn't sound particularly direct.  How many received: headers 
someone is going to have is going to depend greatly upon the path to their 
particular mail host - if you use fetchmail to retrieve mail from a remote 
host and inject it into your local host's SMTP, you're going to have extra 
headers.  Anyone using a mail server which is directly connected to the 
internet may have messages with a SINGLE received header.


???

That's not right: you have the sender's "Received:" header
and at minimum a second one from the receiving mailbox/MTA.

I know of only ONE company that sends me messages directly,
and it is whitelisted...

Some web mail notification scripts (i.e. cgi stuff) connect directly to the 
recipient's mail host and issue the mail there, so it is possible (though 
not very polite) to receive a legit message with a single received header 
-- I personally flag the characteristic at about 50% of my spam threshold, 
meaning that it takes more than just that to identify a message as spam.


Even messages that come into my courier-mta have at least
2 "Received:" headers.

Because worm/virus-infected Windows machines send directly,
I catch up to 3500 spams every day that have only two
"Received:" headers.

It also used to be that a message which originated on your local mail host 
could have a single received header.  Actually, it's quite possible that 
you regularly get these anyway, though with the MSA+MTA structure in 
sendmail for the past several significant releases, the initial submission 
will carry with it a received header, and then the delivery by the MTA will 
insert a second one - but that's sendmail - I can't speak for other MTAs.


I have never seen a message with fewer than 2 "Received:" headers. For example:

  __( 'stdin' )_________________________________________________________
 /
| From Esther(_at_)gamesactive(_dot_)com Fri Mar 18 06:03:43 2005
| Return-path: <Esther(_at_)gamesactive(_dot_)com>
| Delivery-date: Fri, 18 Mar 2005 06:03:43 +0100
| Received: from [194.97.55.191] (helo=mx7.freenet.de)
|       by mbox47.freenet.de with esmtpa (ID exim) (Exim 4.43 #14)
|       id 1DC9eB-0001uy-Fj
|       for linux4michelle(_at_)01019freenet(_dot_)de; Fri, 18 Mar 2005 06:03:43 +0100
| Received: from ti400720a081-14286.bb.online.no ([85.164.247.206] helo gamesactive.com)
|       by mx7.freenet.de with smtp (Exim 4.43 #13)
|       id 1DC9eB-0008Du-1R
|       for linux4michelle(_at_)freenet(_dot_)de; Fri, 18 Mar 2005 06:03:43 +0100
| From: "Irmgard Sample" <Esther(_at_)gamesactive(_dot_)com>
| To: "Lilac Dillon" <linux4michelle(_at_)freenet(_dot_)de>
| Subject: Re: /P-9TI1/Ph.armaccy
| Date: Fri, 18 Mar 2005 00:44:16 -0500
 \______________________________________________________________________

Don't you read the RFCs?

Mail delivered directly to your mail host by legitimate local users (say, 
when I compose a message in my windoz mail client and send it to another 
user on my host) can have a single received header, even with the newer

TWO

sendmails, because the message doesn't pass through the MSA (Message 
Submission Agent) on the host but is instead handed to the MTA and then 
passes along to the LDA.


I do not allow messages from dynamic IPs.

BTW, your recipe seems to have more steps than necessary, and can be 
expressed more concisely as:

:0
* 1^0
* -1^1 ^Received:
.ATTENTION.FLT_received/


Oh, my version was a little bit stripped down :-)
The full version is:

  __( '/home/michelle.konzack/.procmail/FLT_received' )_________________
 /
| ####################################################################
| # 
| # FLT_received
| # 
| ####################################################################
| 
| LOG="($TDPID) FLT_received      : pass "
| 
| :0 
| * 1^1 ^Received:
| {
|   RCVD_COUNT = "$="
| 
|   LOG="($RCVD_COUNT)
| "
|   :0
|   * RCVD_COUNT ?? ^^2^^
|   .ATTENTION.FLT_received/
| }
 \______________________________________________________________________

:-)
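
For anyone who wants to check the count outside procmail, here is a minimal Python sketch of the same idea: count the "Received:" headers in the header block only. The function name count_received and the sample message are made up for illustration; they are not part of the recipe above.

```python
from email import message_from_string


def count_received(raw_message: str) -> int:
    """Count the Received: headers of a message (header block only)."""
    msg = message_from_string(raw_message)
    # get_all() returns every occurrence of the header, folded lines included.
    return len(msg.get_all("Received", []))


# Hypothetical sample: two folded Received: headers, plus a body line
# that merely looks like a header and must not be counted.
SAMPLE = (
    "Received: from mx.example.org\n"
    "\tby mbox.example.org with esmtp; Fri, 18 Mar 2005 06:03:43 +0100\n"
    "Received: from client.example.net\n"
    "\tby mx.example.org with smtp; Fri, 18 Mar 2005 06:03:43 +0100\n"
    "From: sender@example.net\n"
    "Subject: test\n"
    "\n"
    "Received: this line is in the body\n"
)

if __name__ == "__main__":
    print(count_received(SAMPLE))
```

The parser unfolds continuation lines, so a "Received:" header spread over several lines still counts once.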

Should I grep my log files for "FLT_received"?
Like this:

  __( command 'grep "FLT_received " /home/michelle.konzack/log/procmail/2005-03-30.log' )_
 /
| (24442) FLT_received      : pass (2)
| (25028) FLT_received      : pass (7)
| (27503) FLT_received      : pass (6)
| (29758) FLT_received      : pass (5)
| (32012) FLT_received      : pass (5)
| (1827) FLT_received      : pass (6)
<snip>
 \______________________________________________________________________
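
To see how the header counts are distributed over a day, the grep output above could be tallied with a short Python sketch. It assumes every log line has exactly the shape shown in the excerpt, "(PID) FLT_received : pass (N)"; the name tally_received_counts is made up for this example.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpt: "(PID) FLT_received : pass (N)".
LOG_LINE = re.compile(r"^\(\d+\) FLT_received\s+: pass \((\d+)\)\s*$")


def tally_received_counts(lines):
    """Map each Received: count N to the number of log lines reporting it."""
    tally = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # lines that do not fit the assumed shape are skipped
            tally[int(match.group(1))] += 1
    return tally


EXCERPT = [
    "(24442) FLT_received      : pass (2)",
    "(25028) FLT_received      : pass (7)",
    "(27503) FLT_received      : pass (6)",
    "(29758) FLT_received      : pass (5)",
    "(32012) FLT_received      : pass (5)",
    "(1827) FLT_received      : pass (6)",
    "<snip>",
]

if __name__ == "__main__":
    for count, hits in sorted(tally_received_counts(EXCERPT).items()):
        print(count, hits)
```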

In any event, this proffered solution does not address the actual question 
posed by the OP - that is how to detect these hibit characters in their 
messages.

My furrin.rc recipe, available for download at my site (see .sigline, hit 
the spam filtering link) has the code that specifically checks for the 
hibit subjects (and from/to).  Basically, you're looking for anything which 
has the high bit set - the recipe regexp actually has the character range 
of 0x80 through 0xff, but they're coded as the actual highbit 
character.  The whole recipe file is overkill if you simply want to flag 
highbit characters, though you may want to review the other offerings there.


Does it let the French word "légère" through?
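
The high-bit check described in the quoted paragraph can be sketched in Python as follows. This is not the furrin.rc recipe itself, only an illustration of the idea: look for any raw byte in the range 0x80 through 0xff in the Subject, From and To lines. The name highbit_headers is made up, and the sketch assumes LF line endings and ignores folded continuation lines.

```python
import re

# Any raw byte with the high bit set (0x80 through 0xff).
HIGH_BIT = re.compile(rb"[\x80-\xff]")
CHECKED_HEADERS = (b"subject:", b"from:", b"to:")


def highbit_headers(raw_message: bytes) -> list:
    """Return the names of checked header lines that contain a high-bit byte."""
    # Assumption: LF line endings, so the header block ends at the first blank line.
    header_block = raw_message.split(b"\n\n", 1)[0]
    hits = []
    for line in header_block.split(b"\n"):
        # Simplification: folded continuation lines are not attributed to a header.
        if line.lower().startswith(CHECKED_HEADERS) and HIGH_BIT.search(line):
            hits.append(line.split(b":", 1)[0])
    return hits


RAW_8BIT = (
    b"Subject: offre l\xe9g\xe8re\n"
    b"From: a@example.org\n"
    b"To: b@example.org\n"
    b"\n"
    b"body with \xe9\n"
)
ENCODED = b"Subject: =?iso-8859-1?Q?offre_l=E9g=E8re?=\nFrom: a@example.org\n\nbody\n"

if __name__ == "__main__":
    print(highbit_headers(RAW_8BIT))
    print(highbit_headers(ENCODED))
```

A check like this flags any accented word sent as raw 8-bit bytes in a header. The same word sent as an RFC 2047 encoded-word is pure ASCII on the wire, so it would pass.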

Greetings
Michelle

-- 
Linux-User #280138 with the Linux Counter, http://counter.li.org/ 
Michelle Konzack   Apt. 917                  ICQ #328449886
                   50, rue de Soultz         MSM LinuxMichi
0033/3/88452356    67100 Strasbourg/France   IRC #Debian (irc.icq.com)


____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail