ietf-smtp

Re: Mail Data termination

2011-08-21 05:24:05
On 2011-08-20 23:18:15 -0700, Murray S. Kucherawy wrote:
From: Paul Smith [mailto:paul(_at_)pscs(_dot_)co(_dot_)uk]
Sent: Saturday, August 20, 2011 12:41 PM

I'm with Hector on this. I really don't like the idea of a sender
keeping a connection open 'just in case'.

If it keeps it open for 10 seconds 'just in case', and no new mail
arrives so it has to close the connection, what has that achieved other
than extra load on the receiver? Why wouldn't it then say 'well, another
message may arrive in the next few seconds, so I'll keep it open just a
bit longer', and so on.

I don't see how keeping a connection open creates extra load in a
modern operating system.  An open file descriptor is just an entry in
a kernel table.

Not quite - there is also at least the per-connection data in the MTA -
this may be only a small data structure or a whole process.

I've never seen an idle connection cause distress to a running system.

We've run into the connection limit a number of times on our MXs, but
that's because I've set that too conservatively (there was still plenty
of free RAM when the limit was reached).

Even for an application that creates a complete subprocess to handle
that idle connection, the process would eventually get paged or
swapped out in favour of more active processes.

We are talking about a few seconds here: Postfix has a default limit of
2 seconds, and 5 seconds also seems to be a rather popular limit (see below).
I think if your MTA processes are being swapped out within 2-5 seconds
of idle time you should probably invest in more RAM (or reduce the
connection limit).

There are also clients (mostly bots, I think) which don't
disconnect at all and just hang around until the server disconnects
them. These might be swapped out, and they could be a problem because
there are potentially a lot of them.


I'm really surprised there's so much sudden consternation about this
feature given how many MTAs have it, and the fact that it's been
around for well over a decade.  Somehow, in that time, the sky hasn't
fallen.

The consternation is because Hector misread his logfiles (reading 5
minutes instead of 5 seconds) and then didn't want to back down.

I just computed a few statistics over the log files of the last week of
our lowest-numbered MX:

The average session duration is 6.7 seconds; the average delay between the
reply code to the last command and QUIT is 2.2 seconds. So delayed QUIT does
increase the average session duration by almost 50% (2.2 seconds on top of
the remaining 4.5).
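
The "almost 50%" figure can be checked from the two averages above. This is only a quick sketch, and it assumes the 6.7 second session average already includes the wait before QUIT:

```python
# Averages quoted above, in seconds.
avg_session = 6.7      # average session duration
avg_quit_delay = 2.2   # average delay between the last reply and QUIT

# Assumption: the session average includes the QUIT delay, so a session
# with an immediate QUIT would last avg_session - avg_quit_delay.
baseline = avg_session - avg_quit_delay
increase = avg_quit_delay / baseline

print(f"baseline {baseline:.1f} s, increase {increase:.0%}")
```

If the session average did not include the QUIT delay, the ratio would be 2.2 / 6.7, or roughly a third, so the assumption matters.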

More interesting is the distribution:

After a DATA command, most clients (86%) send a QUIT after 0 or 1
seconds (our log files only have a time resolution of 1 second, so these
are probably the "send QUIT immediately" cases). If there is a peak at 2
seconds (the Postfix default), it is too small to be noticed. There is a small
peak at 5 seconds (6%), an even smaller one at 30 seconds (2.5%) and a
really tiny one at 60 seconds (0.3%). The

After a RSET it's a bit different: only 76% send a QUIT after 0 or 1
seconds, 10% after 5 seconds, and there seem to be small peaks of
decreasing size every 20 seconds.

If any other command was the last one, the only peak was at 0 seconds.
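
A distribution like the one described above could be computed roughly as follows. This is a minimal sketch, not the script actually used: the `(last_command, reply_ts, quit_ts)` record layout and the function name `quit_delay_distribution` are hypothetical, and a real MTA log would need its own parser to produce such records.

```python
from collections import Counter, defaultdict


def quit_delay_distribution(sessions):
    """Return {last_command: {delay_seconds: share_of_sessions}}.

    `sessions` is an iterable of (last_command, reply_ts, quit_ts) tuples
    with timestamps in whole seconds. This record layout is hypothetical.
    """
    counts = defaultdict(Counter)
    for last_command, reply_ts, quit_ts in sessions:
        # Delay between the reply to the last command and the QUIT.
        counts[last_command.upper()][quit_ts - reply_ts] += 1

    shares = {}
    for command, histogram in counts.items():
        total = sum(histogram.values())
        shares[command] = {
            delay: n / total for delay, n in sorted(histogram.items())
        }
    return shares


# Made-up sample records, only to show the shape of the result.
sample = [
    ("DATA", 100, 100),
    ("DATA", 200, 201),
    ("DATA", 300, 305),
    ("DATA", 400, 400),
    ("RSET", 500, 505),
    ("RSET", 600, 600),
]

for command, dist in quit_delay_distribution(sample).items():
    print(command, dist)
```

With a log resolution of 1 second, delays of 0 and 1 would both count as "immediate", so those two buckets are best read together.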

I haven't checked how long connections stay open which aren't correctly
terminated with QUIT (I probably should, there are a lot of them).

        hp

-- 
   _  | Peter J. Holzer    | So one could also translate Web 2.0 as
|_|_) | Sysadmin WSR       | "network of small minds".
| |   | hjp(_at_)hjp(_dot_)at         | 
__/   | http://www.hjp.at/ |  -- Oliver Cromm in desd

