procmail

Re: maintaining a list that procmail can check

2001-07-06 03:55:49

I started to write this and then got tied up in some other projects I need to get done (you know, WORK), so this isn't as detailed and rambling as I might otherwise like to provide. Hopefully, my mumblings might prove useful.

At 17:29 2001-07-05 -0500, David W. Tamkin wrote:

So I'm looking for the setup with the most net efficiency for procmail to

(1) check whether a name is listed

grep '^name$' file

The regexp anchors (^ and $) ensure that the name is the ONLY thing on the matching line, so an entry that merely contains the name doesn't count as a hit.

(2) add a name to the list

echo name >> file

and
(3) remove a name from the list

ugly:
grep -v '^name$' file > file.tmp; mv file.tmp file

(if using sed, the process would be similar)
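For what it's worth, the three flat-file operations above can be wrapped up as shell functions. This is only a sketch: the function names and the LIST path are made up for illustration, and it uses grep -F -x (fixed string, whole line) in place of the ^...$ anchors so that names containing regexp characters such as "." aren't misread.

```shell
#!/bin/sh
# Hypothetical helpers; LIST is an assumed path holding one name per line.
LIST=${LIST:-./names.list}

# Succeeds (exit 0) only if "$1" is an entire line of the list.
# -F: fixed string, -x: whole-line match, -q: no output.
list_has() {
    [ -f "$LIST" ] && grep -Fxq -e "$1" "$LIST"
}

# Appends a name to the end of the list.
list_add() {
    printf '%s\n' "$1" >> "$LIST"
}

# Writes the list minus the name to a temp file, then replaces the original.
# grep -v exits non-zero when it prints nothing (the list became empty),
# so the mv is deliberately not chained with &&.
list_rm() {
    grep -Fxv -e "$1" "$LIST" > "$LIST.tmp"
    mv "$LIST.tmp" "$LIST"
}
```

None of this avoids the cost of spawning grep each time; it just makes the one-name-per-line approach a bit safer to call.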

I figure that a flat file with one name per line is not so good; procmail
would have to call grep every time and call an editor to remove a name,
though adding a name should have little overhead by just calling cat to
append (but then again that also needs a shell).

Life sucks.

but adding would require calling an editor, and removing would require not
only calling an editor but giving special handling to the bottommost entry
because removing it would mean taking the pipe symbol off the entry above it.

If you could always ensure that the last entry is something that would always resolve as false, then you could delete the other lines at will without having to special case the removal of anything - and anything added would always have an '|\' tacked onto the end. Coincidentally, the always-false expression would serve as the insert-before-anchor for the insertion point for when you're adding new lines.
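A rough sketch of that sentinel idea, assuming the alternation lives in its own file with one "name|\" entry per line. The ALT path, the function names, and the sentinel string are all placeholders of mine; the sentinel just needs to be something you are sure will never match real mail.

```shell
#!/bin/sh
# ALT and SENTINEL are hypothetical names used only for this sketch.
ALT=${ALT:-./names.alt}
SENTINEL='no-such-name-sentinel'   # assumed never to match real mail

# Create the file with only the sentinel line if it does not exist yet.
alt_init() {
    [ -f "$ALT" ] || printf '%s\n' "$SENTINEL" > "$ALT"
}

# Insert "name|\" immediately before the sentinel line.
# (awk -v interprets backslash escapes, so this assumes plain names.)
alt_add() {
    awk -v name="$1" -v stop="$SENTINEL" '
        $0 == stop { print name "|\\" }
        { print }
    ' "$ALT" > "$ALT.tmp" && mv "$ALT.tmp" "$ALT"
}

# Delete the "name|\" line. No special case is needed for the last real
# entry, because the sentinel always closes the alternation.
alt_rm() {
    grep -Fxv -e "$1|\\" "$ALT" > "$ALT.tmp"
    mv "$ALT.tmp" "$ALT"
}
```

Every real entry carries the trailing '|\', so adding and removing are both plain line operations.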

even when there is a /bin/test, it's a shell script that calls the shell's
`test' built-in].

Why not just write a small c program and call that?

someprogram ADD name
someprogram TEST name
someprogram RM name

Or so. TEST would return an exit status, making it usable directly in place of, say, grep (if you're matching a non-regexp text), and the other two do just what they're supposed to.

Arguably, ADD and TEST functionality could be combined (if your process runs like "if it isn't there, we'll be adding it") -- simply return a status as to whether it was there to begin with or not (and if so, there is obviously no need to append a line to the file). Then you're only invoking the external process once. You might do something similar for the RM process, though the logic flows differently there (it's more efficient, if it isn't a burden, to read the whole file into memory and write it back out only if you have a match and need to update it).
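To illustrate just the return-status logic described above, here is a sketch in shell rather than C. The function names and LIST path are hypothetical, and since it still spawns grep it shows the control flow only, not the efficiency gain a compiled tool would give.

```shell
#!/bin/sh
LIST=${LIST:-./names.list}   # hypothetical path, one name per line

# Combined TEST + ADD.
# Returns 0 if the name was already listed, 1 if it had to be added.
check_or_add() {
    if [ -f "$LIST" ] && grep -Fxq -e "$1" "$LIST"; then
        return 0
    fi
    printf '%s\n' "$1" >> "$LIST"
    return 1
}

# RM that only rewrites the file when there is something to remove.
# Returns 0 if the name was present (and is now gone), 1 otherwise.
rm_if_present() {
    [ -f "$LIST" ] && grep -Fxq -e "$1" "$LIST" || return 1
    grep -Fxv -e "$1" "$LIST" > "$LIST.tmp"
    mv "$LIST.tmp" "$LIST"
}
```

The caller can branch on the exit status the same way it would on grep's.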

That gives you ONE program you call, which should have an overhead not much different than calling rm or ln, and certainly less clutter if you have hundreds or thousands of entries...

I use something similar for handling a simple email list management process. That program actually parses the message body and checks in another file for a password and for the related filename for the address list. Still, it isn't a process I figured would be handled all that elegantly via "roughing it", and writing a few lines of code to do a task isn't giving in.

Similarly, I have "megagrep", an AVL-tree-based multiple header grepping tool (sans regexps) which takes a HUGE list of items, loads them into memory (as an AVL tree), and then is called on each inbound message, where it checks the message headers, breaking them down into component parts and searching the AVL tree for matches. If I were still using grep for this purpose (which at one point, I was), I couldn't hope to achieve the results in a similar time span or memory expense (grep would REALLY pig out on memory with the options which were necessary).
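This is not megagrep, but the general shape (load the list into memory once, split the headers into pieces, look each piece up) can be roughed out with awk called from the shell. Note the substitutions: an awk associative array stands in for the AVL tree, the tokenizing rule is a guess at "component parts", and the function name is made up.

```shell
#!/bin/sh
# headers_match LISTFILE < message
# Exit 0 if any token found in the message headers is an entry in LISTFILE.
# (awk -v interprets backslash escapes, so this assumes an ordinary path.)
headers_match() {
    awk -v list="$1" '
        BEGIN {
            # Load every non-empty list line into an in-memory lookup table.
            while ((getline line < list) > 0)
                if (line != "") want[tolower(line)] = 1
        }
        /^$/ { exit }    # the first blank line ends the headers
        {
            # Split the header line into address-like tokens and look each up.
            n = split(tolower($0), tok, /[^a-z0-9@._+-]+/)
            for (i = 1; i <= n; i++)
                if (tok[i] in want) { found = 1; exit }
        }
        END { exit !found }
    ' 
}
```

Matching is by exact, case-insensitive token, so there are no regexps involved and the list size mostly affects load time rather than per-header lookup time.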

Anyway, if you want it to be efficient, perhaps writing a tool to support your script is the way to go...

---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail