procmail

Re: character set spec bypassing filter?

2002-11-07 17:46:34
in message 
<Pine(_dot_)LNX(_dot_)4(_dot_)44(_dot_)0211072252190(_dot_)21735-100000(_at_)mundungus(_dot_)clifford(_dot_)ac>,
wrote Alan Clifford thusly...

I look for non-ascii chars in the subject

# 5% gagabuggee subject
# avoid empty subject
:0
* ^Subject: \/.+
{
  :0 D
  * -1^1 MATCH ?? .
  *  2^1 MATCH ?? =[0-9A-F][0-9A-F]
  * 20^1 MATCH ?? [ ¡¢£?¥?§?©ª«¬­®¯°±²³?µ¶·?¹º»???¿]
  * 20^1 MATCH ?? [ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞß]
  * 20^1 MATCH ?? [àáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ]
  * 20^1 MATCH ?? =[A-F][0-9A-F]
  {
    # action

say, alan, instead of listing every non-ASCII character you want to
catch, why don't you match any character that falls outside the set
of ASCII characters, using negated character classes?

  #  UNTESTED; adjust action & scores as desired
  :0
  * ^Subject:[ ]*\/[^ ]+.*
  {
    :0 D
    * -1^1 MATCH  ??  .
    *  2^1 MATCH  ??  ()=[0-9A-F][0-9A-F]
      #
      #  replace '\t' w/ actual tab
    * 20^1 MATCH  ??  ()[^a-zA-Z0-9 \t]
      #
      #  some characters may or may not need to be escaped in the 2d &
      #  3d character class specifications ( if only procmail
      #  supported symbolic character classes, POSIX style ):
    * 20^1 MATCH  ??  ()[^-^~!@#$%^&*_+`=\\|':;",./?]
    * 20^1 MATCH  ??  ()[^\[\]{}()<>]
      #
    * 20^1 MATCH  ??  ()=[A-F][0-9A-F]
    possibly-non-ascii-subject

    :0 E:
    supposedly-clean-subject
  }

  :0 E:
  no-subject
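
The "outside the range of ASCII" idea can also be written with one negated range instead of several negated classes. This matters for the scoring, because each separate negated class also matches the characters that are only listed in the other classes, so ordinary letters and punctuation would collect points too. What follows is a minimal, untested sketch rather than a drop-in replacement. It assumes that procmail compares characters bytewise, so that the range from space to tilde covers exactly the printable ASCII characters, and it reuses the mailbox names from the recipe above. With these weights the recipe fires once more than roughly one subject character in twenty is outside that range.

```procmail
#  UNTESTED sketch; adjust action & scores as desired.
#  assumption: " -~" (space through tilde) is read as a bytewise
#  range, i.e. the printable ASCII characters.
#  note: a literal tab lies below space, so it is scored as well.
:0
* ^Subject:[ ]*\/[^ ]+.*
{
  #  -1 for every character of the subject,
  #  +20 for every character outside printable ASCII,
  #  +20 for every quoted-printable escape of a high byte
  :0 D:
  * -1^1 MATCH  ??  .
  * 20^1 MATCH  ??  ()[^ -~]
  * 20^1 MATCH  ??  ()=[A-F][0-9A-F]
  possibly-non-ascii-subject

  :0 E:
  supposedly-clean-subject
}
```

The trailing colon on each delivering recipe asks procmail for a local lockfile on the mailbox; leave it off if the mailboxes are directories.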


  - parv
