nmh-workers

Re: Sort and delete duplicate messages

2020-05-04 03:55:51
Hi,

Ken wrote:
I know that 'sortm -textfield Subject' will sort messages according
to the subject field. Having run that command, is there a way to
then delete the first duplicate of each message in the list such
that if 1 and 2 are duplicates and 6 and 7 are duplicates you would
delete messages 2 and 7 leaving 1 and 6?

I want to say you could do something with piping the output of scan
into "uniq -d -f <num>".  Might require a custom scan format, but that
seems relatively simple.

Hm, a quick test:

% scan -format '%(msg) %{subject}' | uniq -d -f 1

suggests that it prints the first one, not later ones, so that isn't
exactly what you want.  Might be a good starting point, though?  You
could probably do something with uniq -c and pipe that to an awk
script that did what you wanted.

awk's probably easiest, after deciding what counts as an equivalent
subject field.

    $ ls
    1  2  3  4
    $ sed -n l *
    subject: foo bar$
    subject: foo$
     bar$
    subject: xyzzy $
    subject: fo=?utf-8?Q?=6f?= bar$
    $
    $ scan -width 0 -format '%(decode{subject}):%{subject}:%(putlit{subject}):' +.
    foo bar:foo bar: foo bar:
    foo bar:foo bar: foo
     bar:
    xyzzy:xyzzy: xyzzy:
    foo bar:fo=?utf-8?Q?=6f?= bar: fo=?utf-8?Q?=6f?= bar:
    $
    $ scan -width 0 -format '%(msg) %(decode{subject})' +.
    1 foo bar
    2 foo bar
    3 xyzzy
    4 foo bar
    $
    $ scan -width 0 -format '%(msg) %(decode{subject})' +. |
    > awk '{m=$1; sub(/[^ ]* /, "", $0)} NR>1 && $0==l {print m} {l=$0}'
    2
    $
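
That awk only compares each line with the one before it, so it
relies on the folder already being sorted by subject; message 4
above is missed because 3 sits between it and 2.  A minimal sketch
of a variant that remembers every subject seen so far, and so
doesn't need the sort, might look like the following.  dup_msgs is
a made-up name, not an nmh command, and it assumes the same
"msgnum subject" input that the scan format above produces.

```shell
# dup_msgs: hypothetical helper, not part of nmh.
# Reads "msgnum subject" lines on stdin and prints the message number
# of every line whose subject has already appeared earlier in the input.
dup_msgs() {
    awk '
        # Save the message number, then strip it and the following
        # space so that $0 holds only the subject text.
        { m = $1; sub(/[^ ]* /, "", $0) }
        # seen[$0]++ is 0 (false) the first time a subject appears
        # and non-zero for every later occurrence.
        seen[$0]++ { print m }
    '
}
```

Fed the four messages above with

    scan -width 0 -format '%(msg) %(decode{subject})' +. | dup_msgs

it would print 2 and 4.  It's worth eyeballing that list before
handing it to rmm, since whether two decoded subjects really mean
"the same message" is still your call.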

-- 
Cheers, Ralph.
