procmail

Re: subject processing

1997-07-07 09:45:00
era eriksson <era(_at_)iki(_dot_)fi> wrote:
  :0
  * ^\/Subject:[      ]*((re|fw|sv|betr|antw)( ?[[][0-9]+])?[-:>][    ]*)+
  {
      MMATCH="$\MATCH"
      :0fhw
      * MMATCH ?? ^^\(\)\/.*
      | sed "/^$MATCH[        ]*/Subject: Re: /
  }

You only have a tab in that sed script, and you are missing some stuff:
the leading "s" of the substitute command and a close quote.

   :0
   * ^\/Subject:[       ]*((re|fw|sv|betr|antw)( ?[[][0-9]+])?([-:]|>+)[        ]*)+
   {
       MMATCH="$\MATCH"
       :0
       * MMATCH ?? ^^\(\)\/.*
       { SUBPAT=$MATCH }

       :0fhw
       * SUBPAT ?? \<re\>
       | sed "s/^$SUBPAT[       ]*/Subject: Re: /"

       :0Efhw
       | sed "s/^$SUBPAT[       ]*/Subject: /"
   }
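
For anyone who wants to try the prefix logic outside procmail, here is a
minimal Python sketch of what the recipe above is meant to do. The names
PREFIX and normalize_subject are made up for illustration, and Python's \b
only approximates procmail's \< and \>, so treat it as a rough model of the
recipe rather than an exact equivalent.

```python
import re

# Mirrors the prefix pattern in the recipe: one or more of
# re/fw/sv/betr/antw, an optional "[n]" counter, then "-", ":" or ">"s.
# PREFIX and normalize_subject are illustrative names, not procmail features.
PREFIX = re.compile(
    r"^Subject:[ \t]*"
    r"((?:(?:re|fw|sv|betr|antw)(?: ?\[[0-9]+\])?(?:[-:]|>+)[ \t]*)+)",
    re.IGNORECASE,
)


def normalize_subject(line):
    """Collapse a run of reply/forward prefixes into at most one 'Re: '."""
    m = PREFIX.match(line)
    if m is None:
        return line  # no prefix run, leave the header untouched
    rest = line[m.end():].lstrip(" \t")
    # Like the "SUBPAT ?? \<re\>" test: keep "Re: " only if some prefix was "re".
    has_re = re.search(r"\bre\b", m.group(1), re.IGNORECASE) is not None
    return ("Subject: Re: " if has_re else "Subject: ") + rest


if __name__ == "__main__":
    print(normalize_subject("Subject: Re: Fw: Re[2]: Real Subject"))
    print(normalize_subject("Subject: Fw: Real Subject"))
```

Like the recipe, this only rewrites a header that has at least one prefix,
and it works on a single physical line.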

I liked David's suggestion, but it will be somewhat expensive in
pathological cases. This should consume only one external process, no
matter what. (The Perl solution posted by Eli might be almost as good
-- test this if you really want to know :^)

The above is better.

  This still leaves leading whitespace of the actual Subject out of
the MATCH, which is something one ought to fix before putting this to
real use, but that shouldn't be too hard. (Can anyone spot the error?)

I'm not sure what you are asking. Are you talking about the case of
something like "Subject:                Real Subject"?

I fixed it temporarily by adding trailing whitespace to the sed script.

Probably not.

  There's also the case of Re>>> which should be covered as well, but
that, too, should be easy to add.

Fixed in mine.

There still is the malignant case of stuff like:

        Subject: Re[32]: FW: Re: Re[15]: Sv: Re[9]:
                Re: Fw: Real Subject

That won't be easy to fix with a sed script. I don't know awk well
enough to speculate on using that. I could tweak my perl script to
handle it, though.
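
Assuming the hard part of that case is the folded continuation line (sed sees
it as two separate lines), one way to sketch a fix is to unfold the header
first and only then strip the prefix run. This is an illustration in Python,
not the perl script mentioned above; unfold, PREFIX_RUN and normalize_folded
are hypothetical names.

```python
import re

# Same prefix alternatives as the recipe, applied to the header value only.
# PREFIX_RUN, unfold and normalize_folded are illustrative names.
PREFIX_RUN = re.compile(
    r"^(?:(?:re|fw|sv|betr|antw)(?: ?\[[0-9]+\])?(?:[-:]|>+)\s*)+",
    re.IGNORECASE,
)


def unfold(header):
    """Join continuation lines (a newline followed by spaces or tabs)."""
    return re.sub(r"\r?\n[ \t]+", " ", header)


def normalize_folded(header):
    """Unfold a Subject header, then reduce the prefix run to one 'Re: '."""
    flat = unfold(header)
    name, _, value = flat.partition(":")
    value = value.strip()
    m = PREFIX_RUN.match(value)
    if m is None:
        return name + ": " + value
    # Keep a single "Re: " only if at least one stripped prefix was "re".
    has_re = re.search(r"\bre\b", m.group(0), re.IGNORECASE) is not None
    return name + ": " + ("Re: " if has_re else "") + value[m.end():]


if __name__ == "__main__":
    folded = ("Subject: Re[32]: FW: Re: Re[15]: Sv: Re[9]:\n"
              "        Re: Fw: Real Subject")
    print(normalize_folded(folded))
```

The unfolding step is what a line-oriented sed script cannot do cheaply; once
the header is one logical line, the same prefix pattern handles the rest.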

Oh, and as a public answer to the question you asked about sizes:

    SIZE    RSS     COMMAND         (version)
     984    372     procmail        3.10
    1488    592     perl            5.004
    1644    768     procmail        3.11pre7 with perlembed

How would sed score here?

        SIZE    RSS     COMMAND         (version)
         916    308     sed -e p        GNU 2.05

Unlike the other programs, sed requires some command line arguments
before it is willing to sit and wait for input. 'p' is the shortest
valid sed script I could think of. :^)

Elijah
------
Please do not CC me when replying to the list.  It is not my responsibility to
prove to you my mail is not spam, if mail to you bounces it will not be resent.
