procmail

Re: Matching repeating lines?

1997-02-10 14:32:01
On Feb 10,  1:48pm -0500, Lars Kellogg-Stedman <lars(_at_)bu(_dot_)edu> wrote:
:What's the most efficient way of filtering lines that are repeated more
:than <n> times?

If you're willing to settle for n=1, try the "uniq" command (but don't

That would work, except that the lines aren't identical -- each one has a 
timestamp.  Uniq won't prune these lines out of the message because 
they're not really identical.

For anyone following this thread:  a syslog entry looks something like:

Feb  9 10:09:53 hostname <some sort of message text>

Modern versions of uniq can skip leading fields or characters for just
this reason.

See the -n (fields) and +m (characters) flags in the Sun version, the -f
and -s flags in the SGI version, etc.

The SGI man page summarizes:

         The -f and -s options specify skipping an initial portion of
         each line in the comparison:

         -f fields
                 The first fields fields together with any blanks
                 preceding them are ignored for each input line.  A
                 field is defined as a string of non-space, non-tab
                 characters separated by tabs and spaces from its
                 neighbors.

         -s chars
                 The first chars characters (columns, in multibyte
                 environments) are ignored.  Fields are skipped before
                 characters.

    NOTES
         The -n and +m options, although still recognized, are now
         obsolete and may not be supported in future releases.  The -n
         option is equivalent to -f fields with fields set to n.  The +m
         option is equivalent to -s chars with chars set to m.
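
As a rough sketch of how the -f option applies to the syslog format shown
earlier: the timestamp occupies the first three fields (month, day, time),
so skipping three fields leaves only the hostname and message text to be
compared. The sample lines and the sample_log helper below are made up for
illustration, and this assumes a uniq that supports -f as described in the
man page excerpt above.

```shell
# Hypothetical sample data standing in for the syslog lines in a message body.
sample_log() {
    printf '%s\n' \
        'Feb  9 10:09:53 hostname disk quota exceeded' \
        'Feb  9 10:09:58 hostname disk quota exceeded' \
        'Feb  9 10:10:04 hostname disk quota exceeded' \
        'Feb  9 10:11:30 hostname connection refused'
}

# Skip the first three fields (month, day, time) so that only the
# hostname and message text take part in the comparison.
# Each run of adjacent matching lines collapses to its first line.
sample_log | uniq -f 3
```

Note that uniq only collapses adjacent lines, so repeats that are
interleaved with other messages would survive. Adding -c prefixes each
surviving line with its repeat count, which could be a starting point for
the "more than <n> times" case.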


== Bob
