nmh-workers

Re: [Nmh-workers] MH-W intro/help request

2014-12-02 12:11:59
Ken Hornstein writes:

What I use now is "mark -list -sequence unseen", which returns compressed
lists of messages (i.e. "1-5,7-8" instead of "1 2 3 4 5 7 8").  Parsing
this to intersect it with my pick output is relatively fast, though it
is of course inelegant compared to getting *just* the list you want.  It
is also surprisingly slow (like 1/7th of a second to get this vs. other
MH programs which run 10x or more faster).  I don't really understand why
it is so slow, since it is a near character-for-character copy of the
one line that the .mh_sequences file has in it.  If it wasn't so very
slow compared to the other MH programs, I probably would not have even
brought it up for now.
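
For illustration, the parse-and-intersect step described above could be
sketched roughly as follows.  This is only a sketch, not the script actually
in use: the function names parse_sequence and intersect_with_pick are made up
for the example, and it assumes the pick output is a whitespace-separated
list of plain message numbers.

```python
def parse_sequence(text):
    """Expand a compressed list such as 'unseen: 1-5 7-8' into a set of ints.

    Accepts either spaces or commas between ranges, with or without the
    leading 'name:' prefix (hypothetical helper, for illustration only).
    """
    # Drop an optional "name:" prefix such as "unseen:".
    _, _, body = text.rpartition(":")
    numbers = set()
    for token in body.replace(",", " ").split():
        first, _, last = token.partition("-")
        # A bare number like "7" is treated as the range 7-7.
        numbers.update(range(int(first), int(last or first) + 1))
    return numbers


def intersect_with_pick(mark_output, pick_output):
    """Return the picked message numbers that are also in the sequence."""
    members = parse_sequence(mark_output)
    return [n for n in map(int, pick_output.split()) if n in members]
```

Called with the mark output and a pick result, intersect_with_pick() keeps
the order pick produced and drops everything not in the sequence.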

You know, I just took a look at it; it should not be slow, actually,
unless you're running into things like lock contention for the sequence
file.  It does the same things every other MH program does (it calls
folder_read()) but then it does very little after that.  Could you
do a system call trace and try to figure out what's taking so long?

OK, I hadn't really tried to debug it before, but after this comment
I ran some tests.  What I found is that it is not just mark, but
it seems related to the large folder I was testing it on, and some
other MH programs get slow when run under exactly the same circumstances.

Specifically, I was testing it on a very large folder of approx 100K
messages.  Both the "mark ..." and a "show N" invocation take more
than 1/10th of a second on average, even for extremely short
outputs.

So:
        time show 99948 > /dev/null
                real    0m0.120s
                user    0m0.040s
                sys     0m0.068s
        time mark -list -sequence unseen > /dev/null
                real    0m0.123s
                user    0m0.060s
                sys     0m0.060s

The actual output of mark, just to show how tiny it is:
        unseen: 99898-99907 99912-99947 99949-99959

The show command is for a minimally small email.

Using "cat" on the same file that "show" needs more than 1/10th of a
second to display is absurdly fast by comparison:
        time cat Mail/inbox-old/99948 > /dev/null
                real    0m0.002s
                user    0m0.000s
                sys     0m0.000s

"mark" run on a small folder is quite fast.


But it does seem weird that it would take so long for these small
operations on a large folder.  Is the "folder_read()" troublesome here?
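
One rough way to test that guess, without touching nmh itself, is to time a
bare scan of the folder's directory and compare it with the numbers above.
The sketch below assumes (and this is only an assumption, not something
confirmed in this thread) that the cost is dominated by enumerating every
entry in the directory; count_message_files and time_scan are made-up names.

```python
import os
import time


def count_message_files(folder):
    """Count directory entries whose names consist only of digits.

    MH message files are named by number, so this approximates the set of
    entries a full folder scan would have to look at.
    """
    return sum(1 for entry in os.scandir(folder) if entry.name.isdigit())


def time_scan(folder):
    """Return (message_count, elapsed_seconds) for one scan of folder."""
    start = time.perf_counter()
    count = count_message_files(folder)
    return count, time.perf_counter() - start
```

Running time_scan(os.path.expanduser("~/Mail/inbox-old")) a few times and
comparing the elapsed time with the 0.12s figures above would show whether
the directory scan alone accounts for most of the delay.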


--
    Erich Stefan Boleyn     <erich(_at_)uruk(_dot_)org>     http://www.uruk.org/
"Reality is truly stranger than fiction; Probably why fiction is so popular"

_______________________________________________
Nmh-workers mailing list
Nmh-workers(_at_)nongnu(_dot_)org
https://lists.nongnu.org/mailman/listinfo/nmh-workers
