nmh-workers

Re: [Nmh-workers] Pessimal Optimizations.

2012-12-11 20:04:32
> Ultimately, though, mmap() is just a micro-optimization in the context
> of a *822/MIME parser.  The effort that would be expended on mmap() I/O
> would be better spent on writing a bullet-proof parser. No matter what,
> you will end up copying data to and from user space on the way to its
> eventual display.  The solution here is to write a good one-pass MIME
> parser that can collect the structure of the message as it's read in to
> memory.  In many cases, once you hit the body you really can just parse
> as you go.
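
To make "parse as you go" concrete, here is a minimal sketch in Python
(not nmh code).  It assumes the standard library's incremental
email.parser.BytesFeedParser as a stand-in for a one-pass parser; the
function name parse_structure is made up for this example.  The message
is fed in as chunks arrive, and the MIME structure is available once
the input ends.

```python
from email import policy
from email.parser import BytesFeedParser


def parse_structure(chunks):
    """Feed a message chunk by chunk and return its MIME content types.

    `chunks` is any iterable of bytes, e.g. successive read() results.
    This is an illustrative stand-in, not how nmh parses messages.
    """
    parser = BytesFeedParser(policy=policy.default)
    for chunk in chunks:
        # The parser consumes input incrementally, so the caller never
        # has to hold the whole raw message in one buffer.
        parser.feed(chunk)
    message = parser.close()
    # walk() visits the top-level message and then each nested part.
    return [part.get_content_type() for part in message.walk()]
```

The chunk boundaries do not have to line up with header or boundary
lines; the parser buffers partial lines internally.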

So, let me speak to that, since I essentially did that with replyfilter.

replyfilter is called inside of mhl as a filter; it takes the body of the
message on standard input and outputs something that it thinks is suitable
to go in a reply message on standard output.  For various dumb reasons it
only gets the body of the message (the headers it needs to parse the message
are on the command line).

So, think about the data flow for a second. mhl reads the body of the
message in (using m_getfld()!) and then writes it to a pipe. replyfilter
reads it from the pipe and then processes it.  So we're talking about
a stdio read, a copy from the stdio buffer to the mhl input buffer, then
a write() that copies from the mhl input buffer to the kernel, and then
a copy from the kernel to replyfilter's input (not sure if that's
using stdio there).  So that's at least 4 copies.  I tested replyfilter
with files in the tens of megabytes and I didn't notice any delay (the
bulk of each message was attachments that replyfilter didn't output, but
it still had to read the whole message).  So yeah, that would have bogged
down 20 years ago, but not today.
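
For anyone who wants to get a feel for the cost of the pipe hop, here
is a rough sketch in Python.  It is not mhl or replyfilter; the name
pump_through_pipe and the 64 KB chunk size are assumptions made for
this example.  One thread plays the writer (user buffer to kernel), the
caller plays the reader (kernel to user buffer), and the function
returns how many bytes made it through.

```python
import os
import threading


def pump_through_pipe(body: bytes, chunk_size: int = 65536) -> int:
    """Push `body` through a pipe and return the number of bytes read back."""
    read_fd, write_fd = os.pipe()

    def writer() -> None:
        # Stand-in for the writing side: each os.write() copies from our
        # buffer into the kernel's pipe buffer.
        view = memoryview(body)
        offset = 0
        try:
            while offset < len(view):
                offset += os.write(write_fd, view[offset:offset + chunk_size])
        finally:
            # Closing the write end is what gives the reader EOF.
            os.close(write_fd)

    thread = threading.Thread(target=writer)
    thread.start()

    total = 0
    try:
        while True:
            # Stand-in for the reading side: each os.read() copies from
            # the kernel's pipe buffer into a fresh user-space buffer.
            chunk = os.read(read_fd, chunk_size)
            if not chunk:
                break
            total += len(chunk)
    finally:
        os.close(read_fd)
    thread.join()
    return total
```

Wrapping a call such as pump_through_pipe(bytes(32 * 1024 * 1024)) in
time.perf_counter() readings gives a rough number for a body in the
tens of megabytes; the result will vary by machine.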

> Someone mentioned FreeBSD (I think) and cp using mmap() for copies.

That was Paul Vixie, and I was curious so I took a look at that.  Their
cp _only_ does that for files less than 8 megabytes.  The comment in
that code also says, "This is really a minor hack, but it wins some
CPU back".  It also notes that some filesystems don't implement mmap(),
so you'd need traditional code as well.
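
The shape of that strategy, as I read it, looks roughly like the sketch
below.  This is Python written for illustration, not the FreeBSD
source; copy_file, MMAP_LIMIT and the chunk size are names and values
assumed here, with only the 8 megabyte cutoff taken from what is
described above.  Note that the plain read/write loop has to exist
either way.

```python
import mmap
import os

MMAP_LIMIT = 8 * 1024 * 1024  # cutoff described above; the name is made up


def copy_file(src_path, dst_path, mmap_limit=MMAP_LIMIT, chunk_size=65536):
    """Copy src_path to dst_path and return which path was used."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        size = os.fstat(src.fileno()).st_size
        mapped = None
        # Empty files cannot be mapped, and large ones skip mmap entirely.
        if 0 < size < mmap_limit:
            try:
                mapped = mmap.mmap(src.fileno(), 0, access=mmap.ACCESS_READ)
            except (OSError, ValueError):
                # Some filesystems do not support mmap(); fall back below.
                mapped = None
        if mapped is not None:
            with mapped:
                dst.write(mapped)
            return "mmap"
        # Traditional path: read into a user buffer, then write it out.
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
        return "read"
```

The return value is only there so a caller can see which branch ran.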

So ... can we all agree this isn't worth the trouble and forget it?

--Ken

_______________________________________________
Nmh-workers mailing list
Nmh-workers(_at_)nongnu(_dot_)org
https://lists.nongnu.org/mailman/listinfo/nmh-workers
