Re: "Atomic" updates to the LOGFILE...

2002-07-26 10:25:32
On 26 Jul, Philip Guenther wrote:
| Michael J Wise <mjwise(_at_)kapu(_dot_)net> writes:
| ...
| <description of kludge to atomically add entries to the procmail log>
| ...
| >We'd like something a bit more economical in terms of processor
| >performance, as we are trying to run some scripts based on the logfiles
| >to report the 'Performance' of our SpamTraps.
| 
| What are you extracting from the logs that you need to guarantee to
| be uninterwoven?  Procmail line buffers the logs**, so unless you are
| matching up items logged on different lines, you don't really need
| atomic writes.
| 
| [...]
| 
| >Any ideas?
| >Or are we just behind the times and in need of updating our software?
| 
| Can you explicitly (extract and) log the data you need for your analysis
| in a single line?  If so, that should provide all the guarantee you need.
| 
| [...]
| 
| ** Well, kinda.  It flushes its buffer if the last character logged is
| a newline or if you log zero characters.  The 'abstract' logged when it
| delivers a message is logged as three separate lines.

Philip's suggestion to log a single line and Bart Schaefer's idea to
leave concatenation to a separate, less frequent process are probably
better than what follows.  But if Michael wants everything in one
reasonably human-readable logfile, maybe this can work.

I assign a LOGPFX variable that includes the PID ($$) for each message.
Every LOG="..." then becomes LOG="$LOGPFX ...".  It appears that all of
Michael's log writes go through LOG="...", except for the LOGABSTRACT.
He could prepend a unique prefix to each log write and create a custom
LOGABSTRACT that does the same; then it wouldn't matter if log lines
from different messages are interspersed.  If it's important to keep
error messages and "procmail: ..." log entries in order, this won't
work perfectly, since procmail writes those itself without the prefix.
But that doesn't seem to be an issue here.
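
A minimal sketch of that idea, for illustration only.  The variable
names NL, L_FROM and L_SUBJ and the layout of the one-line abstract are
assumptions, not anything Michael's setup is known to use:

```procmail
# Per-message prefix; $$ expands to the PID of this procmail process.
LOGPFX="$$"

# LOG does not append a newline, so keep a literal one handy.
NL="
"

# Turn off the built-in three-line abstract.
LOGABSTRACT=no

# Pick up the fields for a one-line replacement abstract.
# (Add a tab inside the brackets if your headers may use one.)
:0
* ^From:[ ]*\/[^ ].*
{ L_FROM="$MATCH" }

:0
* ^Subject:[ ]*\/[^ ].*
{ L_SUBJ="$MATCH" }

LOG="$LOGPFX From: $L_FROM Subject: $L_SUBJ$NL"
```

Every other log write would follow the same pattern, e.g.
LOG="$LOGPFX some message$NL".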

If, as Bart remarked, PIDs roll over and are not suitably unique, then
the date and time could be extracted from the envelope.  Something like:

# wsplus: one or more blanks; the brackets are meant to hold a space and a tab.
wsplus='[       ]+'
MONTHS='(J(an|u[ln])|Feb|Ma[ry]|A(pr|ug)|Sep|Oct|Nov|Dec)'
MONTH2NUM="Jan:01:Feb:02:Mar:03:Apr:04:May:05:Jun:06:Jul:07:Aug:08:Sep:09\
:Oct:10:Nov:11:Dec:12"
E_STAMP="$MONTHS${wsplus}([ 0][1-9]|[12][0-9]|3[01])${wsplus}\
([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]"

:0
* $ ^^From .*\/$E_STAMP
{
  E_STAMP = "$MATCH"
  :0
  * E_STAMP ?? ()\/([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]
  { E_TIME = $MATCH }
  :0
  * $ E_STAMP ?? ()\/$MONTHS
  * $ MONTH2NUM ?? $MATCH:\/[0-9]+
  { E_MON = $MATCH }
  :0
  * $ E_STAMP ?? ()$MONTHS${wsplus}\/([ 0][1-9]|[12][0-9]|3[01])
  { E_DAY = "$MATCH" }

  LOGPFX = "$E_MON$E_DAY $E_TIME $$"
}
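
If the From_ line carries no stamp in this format, the recipe never
fires and LOGPFX is left unset.  A small guard, offered as one
assumption about how you might want it to degrade, placed above the
recipe:

```procmail
# Default to the bare PID; the recipe overrides this when the stamp parses.
LOGPFX = "$$"
```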

LOGPFX will then look something like "0725 17:16:11 24002", which
should be suitably unique, and will sort "properly" too. Depending on
how often logs are parsed and statistics tabulated, the month and day
stuff might not even be necessary.
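
One wrinkle: on the 1st through 9th the envelope stamp pads the day
with a space ("Jul  5"), and the [ 0][1-9] alternative captures that
space, so the prefix would come out as "07 5 ..." and gain an extra
field.  A sketch of a fix, assuming it is placed inside the block just
before the LOGPFX assignment:

```procmail
  # If E_DAY is a space followed by a digit, swap the space for a zero.
  :0
  * E_DAY ?? ^^ \/[1-9]
  { E_DAY = "0$MATCH" }
```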

Prepending this to each log write, and doing his own "custom"
LOGABSTRACT, might fit Michael's bill and allow him to do away with the
TRAP kludge.  Of course, this all depends on my reading Philip's note on
line buffering to mean that independent log writes can't clobber one
another (I've never seen them do so).  Otherwise ... never mind.

-- 
Reply to list please, or append "8" to "procmail" in address if you must.
Spammers' unrelenting address harvesting forces me to this...reluctantly.


_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail