procmail

Re: questions about ^TO_ and MATCH

1999-12-03 23:05:29
At 09:09 PM 12/3/99 -0500, Nancy this-address-is-valid McGough wrote:

[snip]
To: "Framers (E-mail)" <framers(_at_)FrameUsers(_dot_)com>
[snip; please pardon any wrapping damage in the following]
:0:
* ^TO_\/(procmail|pine-info|vim|copyediting-l|techwr-l|cygwin|framers|veg-nyc|faq-maintainers|pine-alpha)
in-l-$MATCH

And discovered that a msg with the above To header was put into a
folder named:

in-l-Framers

which means that ^TO_ is looking at a comment (the part inside
the quotation marks). Shouldn't it just be looking at the part in
the angle brackets?

I'll pass on detailed comments on ^TO_ (others here can do it better,
and I never use it) but note that it's just a shorthand (macro), and
also that email addresses do not necessarily contain angle brackets.
        To: Framers
might even be valid if it's an internal address, and some mailers
(notably elm) still emit the perfectly legal form:
        To: user@domain.com (personal name)

Something like:
    * ^TO_[^<]*<\/[^>]+
(untested, hence prone to typos) will extract what's between angle
brackets if you're sure there's exactly one pair.  You can of course
also restrict the match to your long string of ORed matches as well.
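
As an untested sketch of that combination (the three list names below are
only a shortened stand-in for the full alternation), anchoring on the
opening angle bracket keeps the match out of the quoted comment:

    :0:
    * ^TO_[^<]*<\/(procmail|framers|cygwin)
    in-l-$MATCH

With the To: header quoted above, MATCH should then be taken from
"framers" inside the angle brackets rather than "Framers" in the comment.
This assumes the list address is written in angle brackets at all.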

[snip]

I'm thinking about adding an @ sign at the end of the condition
so it would look like this:

:0:
* ^TO_\/(procmail|pine-info|vim|copyediting-l|techwr-l|cygwin|framers|veg-nyc|faq-maintainers|pine-alpha)@
in-l-$MATCH

so that it's more likely that it will actually match on an email
address rather than a comment. But I don't really want my file
names to end with an @, so I'm wondering what's the best way to
truncate the last character of $MATCH, and whether that defeats
some of the speed savings I'm getting by processing all my mailing
lists with this one recipe.

To retain up to, but not including, the first @ character:
    :0
    * MATCH ?? ()\/[^@]+
    { }

If I remember correctly, older versions of procmail may clear
MATCH too early while processing that condition, so you may
have to write instead:
    DUMMY = $MATCH
    :0
    * DUMMY ?? ()\/[^@]+
    { }
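
Putting the pieces together, the whole thing might look something like
the following. This is an untested sketch, and the list names are again a
shortened stand-in for the real alternation:

    :0
    * ^TO_\/(procmail|framers|cygwin)@
    {
        # MATCH is now something like "framers@";
        # keep only the part before the first @
        :0
        * MATCH ?? ()\/[^@]+
        { }

        # deliver with a local lockfile, as in the original recipe
        :0:
        in-l-$MATCH
    }

Nothing in this block starts an external program, so the extra condition
should cost very little. On an older procmail the inner recipe may need
the DUMMY variant shown above.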

[another snip]

So I'd like to convert the MATCH to all lower case
and I'm wondering 1) what's the best way to do this and 2) does
this defeat a lot of the speed up?

That was very recently posted here; I'd like to give credit but
unfortunately I've deleted my copy.  It was something like:
    # where VAR contains the mixed-case string
    :0D
    * VAR ?? [A-Z]
    { VAR=` (something which properly invokes tr goes here) ` }
This only runs the tr process when there is an upper case character
needing conversion to lower case.
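
One common way to fill in that placeholder is to echo the variable
through tr. This is an assumption, not necessarily what the original
post used, and it is untested:

    :0 D
    * VAR ?? [A-Z]
    {
        # only reached when VAR contains an upper case letter
        VAR=`echo "$VAR" | tr A-Z a-z`
    }

With VAR set to "Framers", this should leave VAR as "framers". A VAR that
is already all lower case never reaches the tr call.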

Probably I will separate the
troublesome lists into a separate recipe so that all the nicely
behaved lists (like the procmail list :-) are handled by the
efficient recipe.

My impression is that the cost of starting up a process generally
dwarfs any CPU time spent on header matches within procmail (remember,
the headers stay in memory; the mail is not re-read).

Anyway, why do you think the procmail list is "nicely behaved?" :-)

Thanks for any tips about all this,
Nancy

Hope that helps at least some,
Stan
