On Wed, 15 Dec 2004, 14:57 GMT+01 Ruud H.G. van Tol wrote:
When we tickled Dallman Ross, this came out:
Robert Allerstorfer:
Still have to think on how to convert
=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?= c =?ISO-8859-1?Q?d?=
to
=?ISO-8859-1?Q?a?==?ISO-8859-1?Q?b?= c =?ISO-8859-1?Q?d?=
in order to deobfuscate it to
ab c d
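
The conversion described above can be tried outside procmail. What follows is only a minimal Python sketch, not code from any filter mentioned in this thread; the names join_adjacent, decode_word and deobfuscate are made up for illustration. It assumes well-formed encoded words and a charset name that Python knows. It first drops the whitespace between adjacent encoded words, then decodes each word.

```python
import base64
import quopri
import re

# One encoded word: =?charset?encoding?payload?=
ENCODED_WORD = re.compile(r'=\?([^?\s]+)\?([bqBQ])\?([^?\s]*)\?=')
# Whitespace (including a folded line break) that separates two encoded words.
GAP = re.compile(r'(\?=)\s+(?==\?[^?\s]+\?[bqBQ]\?[^?\s]*\?=)')


def join_adjacent(value):
    """Drop whitespace between neighbouring encoded words."""
    return GAP.sub(r'\1', value)


def decode_word(match):
    """Decode a single encoded word to text."""
    charset, encoding, payload = match.groups()
    if encoding.lower() == 'b':
        raw = base64.b64decode(payload)
    else:
        # header=True also turns "_" into a space
        raw = quopri.decodestring(payload.encode('ascii'), header=True)
    return raw.decode(charset, errors='replace')


def deobfuscate(value):
    """Join adjacent encoded words, then decode every encoded word."""
    return ENCODED_WORD.sub(decode_word, join_adjacent(value))


print(deobfuscate('=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?= c =?ISO-8859-1?Q?d?='))
```

With the sample line from above this prints "ab c d". An unknown charset name or a broken Base64 payload raises an exception in this sketch; a real filter would have to catch that.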
My biggest question is, why?
It seems to me that if you ever see anything like this, it is not
going to be mail you want to keep. Me, I don't bother to try to read
what spammers send me. If it's spam, it goes in the spam pile.
The quoted-printable obfuscation can be used to change "iloveyou.exe"
into "iloveyou.e=?charset?q?x?=
=?charset?q?e?="
exactly. This method can not only be used to obfuscate the file names
of attachments, it has in fact already been used. I first received
such a (virus-infected) mail months ago. That's why
my procmail anti-virus filter decodes "Base64" and "Quoted-Printable"
encoded file names. The current version is available at
http://prdownloads.sourceforge.net/softlabsav/SoftlabsAV-0.8.2.tar.bz2?download
In the next version, I want to extend this deobfuscation to the
subject.
A recipe for: if more than one encoded word (=?charset?...?=) appears
in any header, then it must be garbage.
bq_regex = '=\?[-a-z0-9]+\?[bq]\?[^?]*\?='
:0
*$ ()${bq_regex}.*${bq_regex}
--IN.garbage.bq/
Instead of [-a-z0-9]+ I am using
av_CHARSET = "([a-z]+[_-]?[a-z0-9-]+)"
# covers encodings like iso-8859-15 iso-8859-8-i us-ascii gb2312 shift_jis
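
To see which charset names each pattern accepts, the two expressions can be compared in Python. This is a rough sketch only: Python's re module is not procmail's regex engine, fullmatch stands in for the "=?" and "?" delimiters that surround the charset in the full pattern, and the names LOOSE, AV_CHARSET and covers are made up for this example.

```python
import re

# The charset part used in bq_regex, and the stricter av_CHARSET variant.
LOOSE = re.compile(r'[-a-z0-9]+', re.IGNORECASE)
AV_CHARSET = re.compile(r'([a-z]+[_-]?[a-z0-9-]+)', re.IGNORECASE)


def covers(pattern, name):
    """True if the pattern matches the whole charset name."""
    return pattern.fullmatch(name) is not None


# Print, per charset name, whether each pattern covers it.
for name in ('iso-8859-15', 'iso-8859-8-i', 'us-ascii', 'gb2312', 'shift_jis'):
    print(name, covers(LOOSE, name), covers(AV_CHARSET, name))
```

In this sketch the only difference among the five names is shift_jis, which the underscore-free [-a-z0-9]+ rejects. A name with two underscores, such as ks_c_5601-1987, is covered by neither pattern.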
Because the header-field-names are never encoded, the condition could
also be written as:
*$ ^[^:]+:.*${bq_regex}.*${bq_regex}
And to limit it to specific headers:
bq_headers = '(Subject|From|To|Cc)'
bq_regex = '=\?[-a-z0-9]+\?[bq]\?[^?]*\?='
:0
*$ ^${bq_headers}:.*${bq_regex}.*${bq_regex}
--IN.garbage.bq/
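
To try these conditions on sample headers without delivering any mail, they can be approximated in Python. Again this is only a sketch under assumptions: Python's re is not procmail's regex engine, is_garbage and unfold are hypothetical helper names, and the unfolding step stands in for procmail joining continued header lines before it matches (assumed here).

```python
import re

BQ_HEADERS = r'(Subject|From|To|Cc)'
BQ_REGEX = r'=\?[-a-z0-9]+\?[bq]\?[^?]*\?='

# Two encoded words on one (unfolded) header line, in any header ...
ANY_HEADER = re.compile(
    r'^[^:\n]+:.*' + BQ_REGEX + r'.*' + BQ_REGEX, re.I | re.M)
# ... or only in the listed headers.
SOME_HEADERS = re.compile(
    r'^' + BQ_HEADERS + r':.*' + BQ_REGEX + r'.*' + BQ_REGEX, re.I | re.M)


def unfold(headers):
    """Join continuation lines (a line break followed by space or tab)."""
    return re.sub(r'\r?\n[ \t]+', ' ', headers)


def is_garbage(headers, limited=True):
    """True if some header line carries at least two encoded words."""
    pattern = SOME_HEADERS if limited else ANY_HEADER
    return pattern.search(unfold(headers)) is not None


# Two encoded words in Subject: flagged.
print(is_garbage('Subject: =?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?= c'))
# A single encoded word: not flagged.
print(is_garbage('Subject: =?ISO-8859-1?Q?hello?='))
```

Passing limited=False applies the any-header form of the condition instead of the Subject/From/To/Cc form.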
a nice spam test!
rob.
____________________________________________________________
procmail mailing list Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail