procmail

Re: removing whitespace between adjacent 'encoded-word's

2004-12-15 10:51:16
On Wed, Dec 15, 2004 at 03:48:47PM +0100, Robert Allerstorfer wrote:

On Wed, 15 Dec 2004, 14:57 GMT+01 Ruud H.G. van Tol wrote:

When we tickled Dallman Ross, this came out:
Robert Allerstorfer:

Still have to think on how to convert
=?ISO-8859-1?Q?a?=  =?ISO-8859-1?Q?b?=  c =?ISO-8859-1?Q?d?=
to
=?ISO-8859-1?Q?a?==?ISO-8859-1?Q?b?=  c =?ISO-8859-1?Q?d?=
in order to deobfuscate it to
ab  c d
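The conversion asked about above can be sketched outside procmail. This is only a minimal illustration in Python, not anyone's actual filter: the names WORD, GAP and join_adjacent are made up for the example, and the encoded-word pattern is a simplified reading of RFC 2047. It drops whitespace only where it sits between two encoded-words, and leaves the whitespace around plain text such as "c" alone.

```python
import re

# Simplified RFC 2047 encoded-word: =?charset?B-or-Q?text?=
# (assumption: no '?' or whitespace inside charset or text)
WORD = r"=\?[^?\s]+\?[bq]\?[^?\s]*\?="

# An encoded-word, then whitespace, then (lookahead) another encoded-word.
# The lookahead leaves the second word unconsumed so chains of 3+ also join.
GAP = re.compile("(" + WORD + r")\s+(?=" + WORD + ")", re.IGNORECASE)


def join_adjacent(value):
    """Remove whitespace that separates two adjacent encoded-words."""
    return GAP.sub(r"\1", value)


before = "=?ISO-8859-1?Q?a?=  =?ISO-8859-1?Q?b?=  c =?ISO-8859-1?Q?d?="
print(join_adjacent(before))
```

Because \s also matches line breaks, the same substitution would handle an encoded-word pair that has been folded across two header lines.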

My biggest question is, why?

It seems to me that if you ever see anything like this, it is not
going to be mail you want to keep.  Me, I don't bother to try to read
what spammers send me.  If it's spam, it goes in the spam pile.

The quoted-printable obfuscation can be used to change "iloveyou.exe"
into "iloveyou.e=?charset?q?x?= 
                =?charset?q?e?="
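To see what such a split file name hides, here is a rough Python sketch that joins the adjacent encoded-words and then decodes them. The helper names (reveal, decode_word) are hypothetical, the pattern is a simplified form of RFC 2047, and a real charset (iso-8859-1) is assumed in place of the "charset" placeholder used above.

```python
import base64
import quopri
import re

# Simplified encoded-word, without and with capture groups.
WORD = r"=\?[^?\s]+\?[bq]\?[^?\s]*\?="
GAP = re.compile("(" + WORD + r")\s+(?=" + WORD + ")", re.IGNORECASE)
PARTS = re.compile(r"=\?([^?\s]+)\?([bq])\?([^?\s]*)\?=", re.IGNORECASE)


def decode_word(match):
    """Decode one encoded-word to text using its declared charset."""
    charset, encoding, text = match.groups()
    if encoding.lower() == "q":
        # header=True makes '_' decode as a space, as in Q encoding
        raw = quopri.decodestring(text.encode("ascii"), header=True)
    else:
        raw = base64.b64decode(text)
    return raw.decode(charset, errors="replace")


def reveal(value):
    """Join adjacent encoded-words, then decode every encoded-word."""
    joined = GAP.sub(r"\1", value)
    return PARTS.sub(decode_word, joined)


print(reveal("iloveyou.e=?iso-8859-1?q?x?=\n\t=?iso-8859-1?q?e?="))
```

This decodes encoded-words wherever they appear in the string, which is more lenient than a strict RFC 2047 parser would be; that leniency is the point here, since the aim is to see the name a careless mail client might reconstruct.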

Exactly. This method can not only be used to obfuscate the file names
of attachments, it has in fact already been used that way. I first
received such a (virus-infected) mail months ago.

Yes, I understand the theory.  But I prefer a less brute-force
method.  That is, (a) if there is this sort of obfuscation, and (b) if
there is an attachment, then consider the thing to be malicious
and stuff the message in the $VIRUS folder.  On the other hand,
if (b) is not true (no attachment), then consider the thing to
be silly spam and stuff it in the $SPAM folder.  Problem solved.
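The (a)/(b) decision described above is simple enough to sketch in a few lines of Python. This is an illustration of the logic only, under loud assumptions: "obfuscation" is taken to mean two encoded-words separated only by whitespace, "attachment" is guessed from a Content-Disposition or filename= hint in the raw text, and the names classify, SPLIT_WORDS and ATTACHMENT are invented for the example.

```python
import re

# Simplified RFC 2047 encoded-word (assumption, not a full parser).
WORD = r"=\?[^?\s]+\?[bq]\?[^?\s]*\?="

# (a) two encoded-words with nothing but whitespace between them
SPLIT_WORDS = re.compile(WORD + r"\s+" + WORD, re.IGNORECASE)

# (b) crude attachment hint; a real filter would walk the MIME parts
ATTACHMENT = re.compile(
    r"^Content-Disposition:\s*attachment|\bfilename\s*=",
    re.IGNORECASE | re.MULTILINE,
)


def classify(raw_message):
    """Return 'VIRUS', 'SPAM', or None for mail that shows no obfuscation."""
    if not SPLIT_WORDS.search(raw_message):
        return None
    return "VIRUS" if ATTACHMENT.search(raw_message) else "SPAM"


sample = "Subject: =?iso-8859-1?q?a?= =?iso-8859-1?q?b?=\n\nhello"
print(classify(sample))
```

One caveat worth keeping in mind: a long, legitimate non-ASCII Subject is also split into several encoded-words, so a test like (a) may need a whitelist or a tighter pattern before it is trusted.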


A recipe for: if more than 1 charset is used in any header, then it
must be garbage.

  bq_regex = '=\?[-a-z0-9]+\?[bq]\?[^?]*\?='

  :0
  *$ ()${bq_regex}.*${bq_regex}
  --IN.garbage.bq/

Instead of [-a-z0-9]+ I am using
av_CHARSET = "([a-z]+[_-]?[a-z0-9-]+)"
# covers encodings like iso-8859-15 iso-8859-8-i us-ascii gb2312 shift_jis
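A quick way to check which charset names a pattern like av_CHARSET accepts is to try it in Python. The sketch below assumes the pattern is matched case-insensitively and against the whole charset token (as it would be between "=?" and the next "?"); is_charset is a made-up helper name.

```python
import re

# The av_CHARSET pattern from the message above, unchanged.
AV_CHARSET = re.compile(r"([a-z]+[_-]?[a-z0-9-]+)", re.IGNORECASE)


def is_charset(name):
    """True if the whole name is matched by the av_CHARSET pattern."""
    return AV_CHARSET.fullmatch(name) is not None


for name in ["iso-8859-15", "iso-8859-8-i", "us-ascii", "gb2312",
             "shift_jis", "ks_c_5601-1987"]:
    print(name, is_charset(name))
```

All five names from the comment are accepted. A name with two underscores, such as ks_c_5601-1987, is not, because the pattern allows only one optional "_" or "-" directly after the leading letters; whether that matters depends on the mail you receive.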

Because the header-field-names are never encoded, the condition could
also be written as:

  *$ ^[^:]+:.*${bq_regex}.*${bq_regex}
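For testing the header-anchored condition away from a live mailbox, the same regex can be tried in Python. This is a sketch under assumptions: folded header lines are unfolded by hand first, matching is case-insensitive to mimic procmail's default, and the names unfold and garbage_headers are invented for the example.

```python
import re

# bq_regex from the recipe above, unchanged.
BQ_REGEX = r"=\?[-a-z0-9]+\?[bq]\?[^?]*\?="

# Field name, colon, then two encoded-words anywhere in the field body.
TWO_WORDS = re.compile(
    r"^[^:]+:.*" + BQ_REGEX + r".*" + BQ_REGEX, re.IGNORECASE
)


def unfold(header_block):
    """Join continuation lines (leading space or tab) onto the previous line."""
    return re.sub(r"\r?\n[ \t]+", " ", header_block)


def garbage_headers(header_block):
    """Return the unfolded header lines that contain two or more encoded-words."""
    return [line for line in unfold(header_block).splitlines()
            if TWO_WORDS.search(line)]


headers = (
    "From: a@example.com\n"
    "Subject: =?iso-8859-1?q?a?=\n"
    " =?iso-8859-1?q?b?=\n"
    "To: b@example.com"
)
print(garbage_headers(headers))
```

Running a saved batch of known-good mail through a check like this before enabling the recipe would show how often it fires on messages you want to keep.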

Yes, that's good stuff.  There are several approaches, and these
all seem fine to me.


  :0
  *$ ^${bq_headers}:.*${regex_bq}.*${regex_bq}
  --IN.garbage.bq/

a nice spam test!

Yes, and if there's an attachment, basically, a nice virus test.
(I realize Ruud was talking about all headers here and not
"filename=" stuff in the body, but still . . .)

-- 
dman

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
