:What's the most efficient way of filtering lines that are repeated more
:than <n> times?
:If you're willing to settle for n=1, try the "uniq" command (but don't
That would work, except that the lines aren't identical -- each one has a
timestamp. Uniq won't prune these lines out of the message because
they're not really identical.
For anyone following this thread: a syslog entry looks something like:
Feb 9 10:09:53 hostname <some sort of message text>
I want to prune out lines with duplicate messages -- even if they have
different timestamps. I suppose something like this would be possible in
perl, but I'm not at all familiar with perl. Any hints?
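You don't necessarily need perl for this -- awk can do it in one pass by
keying each line on everything after the timestamp and printing a line
only the first time its key shows up. A rough sketch (the sample entries
are made up, and "fields 4 through NF" assumes the exact format shown
above; point it at your real log file instead of the printf):

```shell
# Hypothetical sample entries; in practice you'd feed awk the real log,
# e.g.  awk '...' /var/log/messages
printf '%s\n' \
  'Feb  9 10:09:53 host sshd: accepted' \
  'Feb  9 10:15:00 host sshd: accepted' \
  'Feb  9 10:20:11 host cron: job ran' |
awk '{ key = ""; for (i = 4; i <= NF; i++) key = key " " $i }  # message = fields 4..NF
     !seen[key]++'   # print a line only the first time its message appears
```

Run on the sample above it prints the first and third lines: the second
line is dropped because its message matches the first, even though the
timestamps differ.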
Using standard shell tools, I suppose I could run through the file line
by line, using a temp file (let's call it TMPFILE) and for each line:
(1) Grab the message portion of the entry
(2) Check to see if it's in TMPFILE
(3) If not, output the line and add the message to TMPFILE
(4) Continue with next line
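For what it's worth, the four steps above can be sketched as a shell
function (the name dedupe_syslog is made up, the timestamp is assumed to
be exactly the first three fields, and it assumes mktemp/awk/grep are
available):

```shell
# Read syslog lines on stdin; print each line whose message portion
# (everything after the timestamp) hasn't been seen before.
dedupe_syslog() {
    tmpfile=$(mktemp) || return 1
    while IFS= read -r line; do
        # (1) Grab the message portion: blank out the timestamp,
        #     i.e. the first three whitespace-separated fields.
        msg=$(printf '%s\n' "$line" | awk '{$1=$2=$3=""; print}')
        # (2) Check whether that message is already in the temp file.
        if ! grep -Fxq -- "$msg" "$tmpfile"; then
            # (3) New message: output the line and remember the message.
            printf '%s\n' "$line"
            printf '%s\n' "$msg" >> "$tmpfile"
        fi
        # (4) Continue with the next line.
    done
    rm -f "$tmpfile"
}
```

Usage would be something like "dedupe_syslog < syslog.mail". It's still
the same ugly O(lines x messages) grep-per-line approach, just spelled
out -- which is a good argument for doing it in one awk or perl pass
instead.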
But this is ugly. Someone suggested running something like this on the
log files themselves, before they're mailed out, but I'm hesitant to edit
the originals (I *might* want that information someday), and furthermore
I'm not the only person receiving the log output.
-- Lars
---
Lars Kellogg-Stedman * lars(_at_)bu(_dot_)edu * (617)353-8277
Office of Information Technology, Boston University