procmail

Regular expression syntax and testing (was Re: Invalid message-ids)

1997-08-12 02:58:00
On Mon, 11 Aug 1997 10:38:34 -0700, lists(_at_)professional(_dot_)org
(Professional Software Engineering - Lists account) wrote:
At 11:02 AM 8/11/97 -0400, Stan Ryckman wrote:
Timothy J Luoma <luomat+procmail(_at_)luomat(_dot_)peak(_dot_)org> wrote:
:0:
* ! ^Message-ID:\ \<\>
This would act basically as Era said,

Silly me, I only tested \<\> against "..", which didn't match. It
would appear that they do match against "<>". See other messages in
this thread. (The manual is apparently not strictly true to reality
here. Can somebody [Philip? :^] grok the source?)

Should anyone wonder if a ruleset would work, don't forget that you can put
the ruleset into a standalone recipe file, and redirect a message into it:
procmail bogusid.rc < sometestmessage
<...>
:0:
Test_NotSPAM

If something improbable happened (such as running this in a directory
you didn't have write permission to), you could end up falling off the
end of your bogusid.rc and delivering to your main mail spool. Not good,
especially if you happened to be testing with random strings in no
particular format instead of valid mail messages.
  I often do this just to test how Procmail matches some regular
expression; 

 $ procmail ./.prc
Hello world
^D
 <... procmail VERBOSE log output to stderr ...>

and have a wrapper set up which, among other things, sets MAILDIR and
DEFAULT to somewhere reasonably far away from my real mailbox. 
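
A wrapper of the kind described above might look roughly like the
following. This is only a sketch, not the actual wrapper: the function
name sandbox_procmail, the scratch directory layout and the file name
"fallthrough" are all made up for illustration. It relies on procmail
accepting variable assignments on the command line ahead of the rcfile,
and it makes the rcfile path absolute so that changing MAILDIR cannot
affect where the rcfile is looked up.

```shell
#!/bin/sh
# Hypothetical wrapper: run a test rcfile with delivery pointed at a
# throwaway directory instead of the real mailbox.
sandbox_procmail() {
    rcfile=$1
    # Make the rcfile path absolute so it does not depend on the
    # directory procmail ends up working in.
    case $rcfile in
        /*) ;;
        *)  rcfile=$PWD/$rcfile ;;
    esac
    # Scratch directory; anything that falls off the end of the rcfile
    # lands in $scratch/fallthrough rather than the mail spool.
    scratch=$(mktemp -d "${TMPDIR:-/tmp}/pmtest.XXXXXX") || return 1
    # Command-line assignments are evaluated before the rcfile is read,
    # so the rcfile itself can still override them.
    procmail MAILDIR="$scratch" DEFAULT="$scratch/fallthrough" \
        SHELL=/bin/sh VERBOSE=on "$rcfile"
    ret=$?
    echo "scratch directory: $scratch" >&2
    return $ret
}

# Usage (reads the test message on stdin):
#   sandbox_procmail ./test.rc < sometestmessage
```

Since the rcfile is read after the assignments, a test rcfile that sets
MAILDIR or DEFAULT itself will win over the wrapper.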

I haven't done it, but I presume you could use formail to extract a couple
of your mailboxes to redirect a bunch of standard mail into the test recipe
in this fashion, which would allow you to verify that none of your regular
mail (as defined by mail you've already received) would have been
misdirected by the recipe.  After extracting all your regular mail, you'd

Certainly -- you could even do this: 

 $ cat > test.rc
MAILDIR=/tmp
DEFAULT=/dev/null
SHELL=/bin/sh           # I got bitten by this a few times too many. [*]

:0
* ! ^Message-Id: <[^>]+>
|
^D
 $ grep -hi ^Message-id Mail/* | procmail test.rc

and if you get output, you have a match (forgetting for a moment that
technically, in principle, the data of the Message-Id: field [okay,
record, in this context -- I agree with David here] could be on the
next line or whatever).
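
The formail idea from the quoted text could be sketched along these
lines, which also sidesteps the continuation-line caveat because whole
messages are fed through. It is untested against real mailboxes; the
function name replay_mbox and the paths in the usage line are
placeholders, and it assumes the rcfile sets DEFAULT somewhere harmless
as test.rc above does.

```shell
#!/bin/sh
# Hypothetical helper: run a test rcfile once for every message stored
# in an mbox-format mailbox.
replay_mbox() {
    mbox=$1
    rcfile=$2
    # formail -s splits the mailbox on stdin into separate messages and
    # starts the given command (here procmail with the rcfile) for each.
    formail -s procmail "$rcfile" < "$mbox"
}

# Usage:
#   replay_mbox "$HOME/Mail/inbox" "$PWD/test.rc"
```

Passing the rcfile as an absolute path avoids any question about which
directory a relative name would be resolved against.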

this particular case, these escaped characters ARE special to Procmail.

I guess a bit of historical perspective helps. Presumably the original
regular expression engines only regarded ^ . * [ ] $ (and \ of course
for the escape mechanism) as special, and when the syntax was
extended, the extensions were backslashed because you wanted
reasonable backwards compatibility. (GNU grep, for example, wants
backslashes even on + and ? as well as the truly "extended" { } ( ) |.)
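
To make the backslashing concrete, here is a small illustration with
grep, comparing basic syntax (plain grep) with extended syntax
(grep -E). The sample strings are arbitrary. The last command uses \+,
which as far as I know is a GNU extension to basic syntax and not
something every grep promises.

```shell
#!/bin/sh
# Basic syntax: an unescaped + is an ordinary character.
printf 'aa\na+\n' | grep 'a+'            # prints: a+

# Extended syntax: + means "one or more of the preceding".
printf 'aa\nb\n' | grep -E 'a+'          # prints: aa

# Grouping and repetition counts are backslashed in basic syntax ...
printf 'abab\n' | grep '\(ab\)\{2\}'     # prints: abab

# ... and written bare in extended syntax.
printf 'abab\n' | grep -E '(ab){2}'      # prints: abab

# GNU grep also accepts a backslashed + in basic syntax.
printf 'aa\n' | grep 'a\+'
```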

Check the man pages.

Yes.

/* era */

[*] For some stupid reason, tcsh is the only allowed login shell here.
    Yeah, you bet I have exec /usr/local/bin/bash in my .login, but
    Procmail doesn't see that. (Sheesh, you can't even have /bin/sh!)
-- 
Defin-i-t-e-ly. Sep-a-r-a-te. Gram-m-a-r.  <http://www.iki.fi/~era/>
 * Enjoy receiving spam? Register at <http://www.iki.fi/~era/spam.html>
