procmail

Re: Matching repeating lines?

1997-02-10 17:06:57

I receive log information from a large number of computers.  

Syslog on some machines is broken, and instead of adding a note to the 
effect of "previous message repeated 271 times", it will include all 271 
separate log messages.

I don't need to see all of these.

What's the most efficient way of filtering lines that are repeated more 
than <n> times?  If I were trying to match specific text, this would be 
trivial, but as I'm trying to match *any* repeated text I'm not sure how 
to proceed.

-- Lars

        This is obviously not a procmail question.
        However, I'll give you a hint in awk:

        #! /usr/local/bin/gawk -f
                ## rmsyslogdupes.awk
                ## by James T. Dennis (jim(_at_)starshine(_dot_)org)
                ## 
                ## remove dupe messages in syslog files

        # keep the original line, then blank the three timestamp fields
        # (month, day, time) so only the rest of the line is compared:
        { orig = $0; $1 = ""; $2 = ""; $3 = "" }

        # skip the line if what's left of $0 matches l (lastline);
        # NR > 1 keeps the first line from being compared to an unset l
        NR > 1 && l == $0 { next }

        # otherwise set new l and print the line with its timestamp intact
        { l = $0; print orig }

        This script is not tested -- just something off the cuff
        -- however the principle should work.
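
The original question asked about lines repeated more than <n> times. A minimal sketch of that variant follows, assuming standard syslog lines whose first three fields are month, day and time. The function name `collapse_repeats` and the wording of the summary line are illustrative and not from this thread.

```sh
# collapse_repeats N: read syslog-style lines on stdin. For each run of
# consecutive lines whose text after the timestamp is identical, print the
# first N lines and replace the rest of the run with a one-line summary.
# (The name and the summary wording are assumptions made for this sketch.)
collapse_repeats() {
    awk -v n="${1:-1}" '
    BEGIN { n += 0 }                  # force the threshold to be numeric

    # Emit a summary for the lines suppressed in the run that just ended.
    function report() {
        if (count > n)
            print "    (previous message repeated " (count - n) " more times)"
    }

    {
        orig = $0                     # keep the full line for printing
        $1 = ""; $2 = ""; $3 = ""     # blank month, day, time
        if (NR > 1 && $0 == last) {
            count++                   # same message as the previous line
        } else {
            report()                  # a new message ends the previous run
            last = $0
            count = 1
        }
        if (count <= n)
            print orig
    }

    END { report() }                  # the final run may need a summary too
    '
}
```

With the function defined in the current shell, something like `collapse_repeats 3 < logfile` (where `logfile` stands for whatever file holds the collected messages) would keep at most three copies of each consecutive message. Like the script above, it only catches repeats that are adjacent.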

--
Jim Dennis,                     info(_at_)mail(_dot_)starshine(_dot_)org
Proprietor,                     consulting(_at_)mail(_dot_)starshine(_dot_)org
Starshine Technical Services    http://www.starshine.org
