procmail

Re: Question about how to handle odd character stuff

2005-03-30 09:46:36
At 16:47 2005-03-30 +0200, Michelle Konzack wrote:
On 2005-03-30 09:33:55, Steve Lake wrote:

It depends HOW you get it...

I get this Sh.. only as SPAM and directly to my mailbox, which means
I have only two "Received:" headers in it.

Which doesn't sound particularly direct. How many Received: headers a message carries depends greatly upon the path to the recipient's particular mail host - if you use fetchmail to retrieve mail from a remote host and inject it into your local host's SMTP, you're going to have extra headers. Anyone using a mail server which is directly connected to the internet may have messages with a SINGLE Received: header.

Some web mail notification scripts (e.g. CGI stuff) connect directly to the recipient's mail host and issue the mail there, so it is possible (though not very polite) to receive a legit message with a single Received: header -- I personally flag the characteristic at about 50% of my spam threshold, meaning that it takes more than just that to identify a message as spam.
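
For illustration only, that kind of partial flag could be sketched along these lines. The names RECEIVED_COUNT and SPAMSCORE, and the assumption that 100 points means spam, are hypothetical and chosen just for this example:

```procmail
# Hypothetical names: RECEIVED_COUNT and SPAMSCORE are illustrative only.
# Assume a message is treated as spam once SPAMSCORE reaches 100.
SPAMSCORE=0

# Count the Received: headers. Each match adds 1 to the score, and
# procmail leaves the score of the most recent recipe in $=.
:0
* 1^1 ^Received:
{ }
RECEIVED_COUNT=$=

# Exactly one Received: header is worth half the threshold, so this
# test on its own never pushes a message over the line.
:0
* RECEIVED_COUNT ?? ^^1^^
{ SPAMSCORE=50 }
```

Other tests would then have to contribute the remaining points before anything gets filed as spam.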

It also used to be that a message which originated on your local mail host could have a single received header. Actually, it's quite possible that you regularly get these anyway, though with the MSA+MTA structure in sendmail for the past several significant releases, the initial submission will carry with it a received header, and then the delivery by the MTA will insert a second one - but that's sendmail - I can't speak for other MTAs.

Mail delivered directly to your mail host by legitimate local users (say, when I compose a message in my windoz mail client and send it to another user on my host) can have a single received header, even with the newer sendmails, because the message doesn't pass through the MSA (Message Submission Agent) on the host but is instead handed to the MTA and then passes along to the LDA.

BTW, your recipe seems to have more steps than necessary, and can be expressed more concisely as:

:0
* 1^0
* -1^1 ^Received:
.ATTENTION.FLT_received/
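
As I read the scoring rules in procmailsc(5), each Received: header subtracts 1 from the starting score and the recipe only fires while the final score is above zero, so the starting value decides how many headers are tolerated (starting from 1, the score is already down to zero at the first Received: header). A sketch with the limit pulled out into a variable, where MAX_RECEIVED is a hypothetical name introduced just for this example:

```procmail
# MAX_RECEIVED is a hypothetical knob: the recipe fires when the message
# carries this many Received: headers or fewer.
MAX_RECEIVED=1

# Start one point above the limit, then subtract 1 per Received: header.
# The recipe matches only while the final score stays above zero.
:0
* $ ${MAX_RECEIVED}^0
* 1^0
* -1^1 ^Received:
.ATTENTION.FLT_received/
```

With MAX_RECEIVED=2 the same recipe would catch messages carrying two or fewer Received: headers.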

In any event, this proffered solution does not address the actual question posed by the OP - namely, how to detect these hibit characters in their messages.

My furrin.rc recipe, available for download at my site (see .sigline, hit the spam filtering link) has the code that specifically checks for hibit characters in the subject (and from/to). Basically, you're looking for any character which has the high bit set - the recipe regexp covers the character range 0x80 through 0xff, but the endpoints are coded as the literal highbit characters rather than as escapes. The whole recipe file is overkill if you simply want to flag highbit characters, though you may want to review the other offerings there.
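
If all you want is the flag, a minimal sketch (this is not the code from furrin.rc, and HIBIT is a hypothetical variable name) is to negate the printable ASCII range instead of typing raw 8-bit bytes into the rc file:

```procmail
# HIBIT is a hypothetical flag name, used here for illustration only.
# The class [^ -~] means "any character outside printable ASCII", which
# covers 0x80-0xff without putting raw 8-bit bytes in the rc file.
# It also catches control characters such as TAB, so treat a hit as a
# hint rather than as proof.
:0
* ^(Subject|From|To):.*[^ -~]
{ HIBIT=yes }
```

Note that this only sees raw 8-bit bytes; a header sent as a MIME encoded-word (=?charset?Q?...?=) is plain ASCII on the wire and would need a separate test.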

---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.

