On Sat, Oct 23, 2004 at 09:55:11AM -0500, David W. Tamkin wrote:
> Does
> /=\?[^? ][^? ]*\?[BbQq]\?[^?]*\?=/
> work any better? You can omit the backslashes before the question
> marks, but if you're used to extended regular expressions, the result
> will look funny to you.
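
For anyone who wants to try that pattern outside procmail, here is a minimal sketch in Python. It is only a translation for experimenting, not something from the original message: the names ENCODED_WORD and has_encoded_word are made up for illustration, and the third sample line is a hypothetical non-encoded Subject.

```python
import re

# Python translation of the suggested pattern. In Python's regex syntax
# the backslashes are needed, because a bare "?" is a quantifier there.
ENCODED_WORD = re.compile(r"=\?[^? ][^? ]*\?[BbQq]\?[^?]*\?=")


def has_encoded_word(header_line: str) -> bool:
    """Return True if the line contains something shaped like
    =?charset?B-or-Q?text?= anywhere in it."""
    return ENCODED_WORD.search(header_line) is not None


samples = [
    "Subject: =?utf-8?q?Rolex, Cartier, Fran?=",
    "Subject: =?ISO-8859-1?B?VGhlIEJlc3QgRGVhbHMgQW55d2hlcmUh?=",
    "Subject: Is this right?= maybe",  # hypothetical plain Subject
]

for line in samples:
    print(has_encoded_word(line), line)
```

The first two samples are Subject lines of the kind listed further down; the third contains "?=" but no "=?" opener, so the pattern should reject it.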
I don't know whether that's any better, but an approach with base64-decode
seems promising to me. Here it is, as a test, on whatever of my recent spam
has encoded Subject lines. (I don't have any legit mail of that sort at
the moment.)
First, let's see which candidates are there (out of my last-100 spam pile):
[217.86.0.207 -> panix5] {dman} [0.76]
6:11pm [~/Mail/.myspam] 292[0]> grep -i '^Subject:.*?=' *
msg.0lvB:Subject: =?utf-8?q?Rolex, Cartier, Fran?=
msg.JGCN:Subject: =?iso-8859-1?b?Q2hlYXAgTGV2aXRyJWEgQ2lhbGklcyBhbmQgTW9yZSE=?=
msg.RGCN:Subject: =?iso-8859-1?B?R2V0IHlvdXIgcHJlcyFjcmlwdGlvbiBkZWxpdmVyZWQgdG8geW91ciBkb29yISAgICAgOGo=?=
msg.VGCN:Subject: =?ISO-8859-1?B?RmFzdGVzdCBEZWxpdmVyeSBOYXRpb253aWRlICAgICBx?=
msg.WGCN:Subject: =?utf-8?q?All remedy at 0.22$ ?=
msg.XGCN:Subject: =?ISO-8859-1?b?WW91ciBvbmUgc3RvcCBwcmVzY3JpcHQqaW9ucyAgICAgNQ==?=
msg.YGCN:Subject: =?iso-8859-1?B?U2hpcHBlZCB0byB5b3UgbmV4dCBkYXkgdG8geW91ciBkb29yICAgICBq?=
msg.j9LE:Subject: =?iso-8859-1?B?Q2hlYXAgTGV2I2l0cmEgQyVpYWxpcyBhbmQgTW9yZSE=?=
msg.m33R:Subject: =?iso-8859-1?B?VGltZSBtYXkgYmUgcnVubmluZyBvdXQu?=
msg.nlvB:Subject: =?utf-8?B?TmVlZCBsb3cgcHJpY2VkIHNvZnR3YXJlPyAgQmF0aHVyc3QgZ2VydW5k?=
msg.olvB:Subject: =?utf-8?B?RndkOiBbZ2xvc3Nlc10gNjElLW9mZiBWaWNvZGluLiAgQ2Flc2FyaXplIGV4cGxhbmF0b3J5?=
msg.v33R:Subject: =?ISO-8859-1?B?VGhlIEJlc3QgRGVhbHMgQW55d2hlcmUh?=
msg.x33R:Subject: =?ISO-8859-1?b?WW91IGhhdmUgYmVlbiBQcmUtQXBwcm92ZWQh?=
All right, now let's parse those with gsed (for the case-insensitive switch)
and base64-decode:
[217.86.0.207 -> panix5] {dman} [0.54]
6:11pm [~/Mail/.myspam] 293[0]> foreach s ( `grep -i -l '^Subject:.*?=' *` )
foreach? cat $s | gsed -n '/^Subject: */I { s///; s/=[?][^?]*[?][BQ][?]//I; p; }' | base64-decode; echo ""
foreach? end
F?^Ä&«¶'«¶§
Cheap Levitr%a Ciali%s and More!
Get your pres!cription delivered to your door! 8j
Fastest Delivery Nationwide q
Ykzg?É«tÛ
Your one stop prescript*ions 5
Shipped to you next day to your door j
Cheap Lev#itra C%ialis and More!
Time may be running out.
Need low priced software? Bathurst gerund
Fwd: [glosses] 61%-off Vicodin. Caesarize explanatory
The Best Deals Anywhere!
You have been Pre-Approved!
-------------------------------------------
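So most of them worked okay. The two that came out as garbage (msg.0lvB and
msg.WGCN) are the ones marked ?q? (quoted-printable) rather than ?B?, so
base64-decode was the wrong tool for those.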
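
As a rough sketch of how the B and Q cases could be told apart, here is a small Python version that picks the decoder from the encoding letter. This is not the gsed/base64-decode pipeline used above; decode_subject_bytes is a made-up helper name, and it only looks at the first encoded word on the line and returns raw bytes without interpreting the charset.

```python
import base64
import quopri
import re

# Captures charset, encoding letter, and encoded text of one encoded word.
ENCODED_WORD = re.compile(r"=\?([^? ]+)\?([BbQq])\?([^?]*)\?=")


def decode_subject_bytes(subject: str) -> bytes:
    """Decode the first encoded word in `subject` to raw bytes.

    Lines without an encoded word are returned unchanged (as bytes).
    """
    match = ENCODED_WORD.search(subject)
    if match is None:
        return subject.encode("ascii", "replace")
    _charset, encoding, text = match.groups()
    if encoding.upper() == "B":
        return base64.b64decode(text)
    # Q encoding: header=True makes "_" decode to a space, and "=XX"
    # sequences decode to the byte with that hex value.
    return quopri.decodestring(text.encode("ascii"), header=True)


for subject in (
    "=?utf-8?q?Rolex, Cartier, Fran?=",
    "=?iso-8859-1?B?VGltZSBtYXkgYmUgcnVubmluZyBvdXQu?=",
):
    print(decode_subject_bytes(subject))
```

Both sample subjects come from the listing above: the first is one of the two ?q? lines that turned into garbage, the second is one of the ?B? lines that decoded fine.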
I'm not sure how that would handle German diacriticals, though; maybe it
wouldn't work.
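
On the diacriticals question, one way to check is to let something charset-aware do the decoding. The sketch below uses Python's standard email.header module instead of base64-decode; decode_subject is a made-up helper name, and the German sample text is hypothetical, not taken from any real message in the pile.

```python
import base64
from email.header import decode_header, make_header


def decode_subject(raw: str) -> str:
    """Decode RFC 2047 encoded words in a header value to text,
    honouring the charset named in each encoded word."""
    return str(make_header(decode_header(raw)))


# Hypothetical sample text with German diacriticals.
text = "Grüße aus Köln"

# Build a B-encoded word from the ISO-8859-1 bytes of the sample.
b_payload = base64.b64encode(text.encode("iso-8859-1")).decode("ascii")
b_word = "=?iso-8859-1?B?" + b_payload + "?="

# The same text as a Q-encoded word: "_" is a space, "=FC" is u-umlaut,
# "=DF" is sharp s, "=F6" is o-umlaut in ISO-8859-1.
q_word = "=?iso-8859-1?Q?Gr=FC=DFe_aus_K=F6ln?="

print(decode_subject(b_word))
print(decode_subject(q_word))
```

The point of the sketch is that decoding the base64 or quoted-printable layer only yields bytes; the charset named in the encoded word is still needed to turn those bytes into the right characters.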
--
dman