procmail

Re: ISO something in Subject

2004-10-23 09:28:02
On Sat, Oct 23, 2004 at 09:55:11AM -0500, David W. Tamkin wrote:

> Does
>
> /=\?[^? ][^? ]*\?[BbQq]\?[^?]*\?=/
>
> work any better?  You can omit the backslashes before the question
> marks, but if you're used to extended regular expressions, the result
> will look funny to you.
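
For anyone who wants to try that pattern outside procmail, here is a minimal sketch in Python.  It assumes the procmail expression carries over unchanged to Python's `re` syntax (it uses only constructs both share); the names `ENCODED_WORD` and `has_encoded_word` are made up for illustration.

```python
import re

# The pattern from the message above, minus the surrounding slashes.
# =?charset?B-or-Q?encoded-text?=
ENCODED_WORD = re.compile(r"=\?[^? ][^? ]*\?[BbQq]\?[^?]*\?=")


def has_encoded_word(header_line):
    """Return True if the header line contains an RFC 2047-style encoded word."""
    return ENCODED_WORD.search(header_line) is not None


if __name__ == "__main__":
    samples = [
        "Subject: =?iso-8859-1?B?VGltZSBtYXkgYmUgcnVubmluZyBvdXQu?=",
        "Subject: =?utf-8?q?All remedy at 0.22$ ?=",
        "Subject: Re: ISO something in Subject",
    ]
    for line in samples:
        print(has_encoded_word(line), line)
```

Note that the pattern only detects an encoded word; it says nothing about what the decoded text contains.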

I don't know whether that's good or not, but an approach using base64-decode
seems promising to me.  Here it is, tested on whichever of my recent spam has
encoded Subject lines.  (I don't have any legit mail of that sort at
the moment.)

First, let's see which candidates are there (out of my last-100 spam pile):

[217.86.0.207 -> panix5] {dman} [0.76]
 6:11pm [~/Mail/.myspam] 292[0]> grep -i '^Subject:.*?=' *   
msg.0lvB:Subject: =?utf-8?q?Rolex, Cartier, Fran?=
msg.JGCN:Subject: =?iso-8859-1?b?Q2hlYXAgTGV2aXRyJWEgQ2lhbGklcyBhbmQgTW9yZSE=?=
msg.RGCN:Subject: =?iso-8859-1?B?R2V0IHlvdXIgcHJlcyFjcmlwdGlvbiBkZWxpdmVyZWQgdG8geW91ciBkb29yISAgICAgOGo=?=
msg.VGCN:Subject: =?ISO-8859-1?B?RmFzdGVzdCBEZWxpdmVyeSBOYXRpb253aWRlICAgICBx?=
msg.WGCN:Subject: =?utf-8?q?All remedy at 0.22$ ?=
msg.XGCN:Subject: =?ISO-8859-1?b?WW91ciBvbmUgc3RvcCBwcmVzY3JpcHQqaW9ucyAgICAgNQ==?=
msg.YGCN:Subject: =?iso-8859-1?B?U2hpcHBlZCB0byB5b3UgbmV4dCBkYXkgdG8geW91ciBkb29yICAgICBq?=
msg.j9LE:Subject: =?iso-8859-1?B?Q2hlYXAgTGV2I2l0cmEgQyVpYWxpcyBhbmQgTW9yZSE=?=
msg.m33R:Subject: =?iso-8859-1?B?VGltZSBtYXkgYmUgcnVubmluZyBvdXQu?=
msg.nlvB:Subject: =?utf-8?B?TmVlZCBsb3cgcHJpY2VkIHNvZnR3YXJlPyAgQmF0aHVyc3QgZ2VydW5k?=
msg.olvB:Subject: =?utf-8?B?RndkOiBbZ2xvc3Nlc10gNjElLW9mZiBWaWNvZGluLiAgQ2Flc2FyaXplIGV4cGxhbmF0b3J5?=
msg.v33R:Subject: =?ISO-8859-1?B?VGhlIEJlc3QgRGVhbHMgQW55d2hlcmUh?=
msg.x33R:Subject: =?ISO-8859-1?b?WW91IGhhdmUgYmVlbiBQcmUtQXBwcm92ZWQh?=

All right, now let's parse those with gsed (for the case-insensitive switch) 
and base64-decode:

[217.86.0.207 -> panix5] {dman} [0.54]
 6:11pm [~/Mail/.myspam] 293[0]> foreach s ( `grep -i -l '^Subject:.*?=' *` )
foreach? cat $s | gsed -n '/^Subject: */I { s///; s/=[?][^?]*[?][BQ][?]//I; p; }' | base64-decode; echo ""
foreach? end
F?^Ä&«¶'«¶§
Cheap Levitr%a Ciali%s and More!
Get your pres!cription delivered to your door!     8j
Fastest Delivery Nationwide     q
Ykzg?É«tÛ
Your one stop prescript*ions     5
Shipped to you next day to your door     j
Cheap Lev#itra C%ialis and More!
Time may be running out.
Need low priced software?  Bathurst gerund
Fwd: [glosses] 61%-off Vicodin.  Caesarize explanatory
The Best Deals Anywhere!
You have been Pre-Approved!
-------------------------------------------

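So most of them decoded fine.  The two garbled lines are the q-encoded
Subjects (msg.0lvB and msg.WGCN): those are quoted-printable, not base64, so
base64-decode can't make sense of them.
I'm not sure how this would work on the German diacriticals; maybe it
wouldn't.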
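
As a point of comparison, here is a rough sketch of the same decoding step using Python's standard `email.header.decode_header`, which handles both the B and Q encodings and reports the declared charset.  This is not what was run above; the helper name `decode_subject` and the German sample strings are made up for illustration, and an unrecognized charset name would still raise an error here.

```python
from email.header import decode_header


def decode_subject(raw):
    """Decode RFC 2047 encoded words in a Subject value to plain text."""
    parts = []
    for data, charset in decode_header(raw):
        if isinstance(data, bytes):
            # Fall back to ASCII when no charset was declared; replace
            # undecodable bytes rather than failing on malformed spam.
            parts.append(data.decode(charset or "ascii", errors="replace"))
        else:
            # No encoded words at all: decode_header returns the text as-is.
            parts.append(data)
    return "".join(parts)


if __name__ == "__main__":
    samples = [
        # Taken from the grep output above (msg.m33R).
        "=?iso-8859-1?B?VGltZSBtYXkgYmUgcnVubmluZyBvdXQu?=",
        # Hypothetical German samples, one Q-encoded and one base64-encoded.
        "=?iso-8859-1?Q?Gr=FC=DFe?=",
        "=?utf-8?B?R3LDvMOfZQ==?=",
        # A plain subject passes through untouched.
        "Re: ISO something in Subject",
    ]
    for s in samples:
        print(decode_subject(s))
```

Because the charset is taken from the encoded word itself, diacriticals come out correctly as long as the sender declared the charset honestly.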

-- 
dman

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail

