I'm not sure if this message reached the lists... I'm not a subscriber
of ietf-smtp; maybe I should be?
My comment is that I had a look at "SMTP Service Extension
for Checkpoint/Restart", RFC 1845, which is Experimental. My guess is
that this RFC would need a review and might be moved to the standards
track...
I would like comments on the way I thought about it, and the way it is
presented. What do you think of the 2 methods?
In RFC 1845:
The server sends CHECKPOINT in response to the EHLO command, then the
client adds TRANSID with a transaction ID to the MAIL FROM, which the
server answers with OK or with a number of bytes to resume from.
In what I envisaged:
The server sends RESUME in response to the EHLO command, then the client,
before DATA, sends the RESUME command (with or without a session ID), which
the server answers with a SESSION reply carrying a session ID and a number
of bytes to resume from (or 0 to start from the beginning again).
I feel that in the latter system, letting the server give the session ID to
the client ensures that the session ID is unique on the server, and that the
client learns about the session ID only if it has tried before to send
the message.
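To make the second method concrete, here is a rough sketch of the state a server might keep for it. Everything in it is an assumption for illustration: the class name ResumeTable, its method names, and the use of random hex strings as session IDs are not taken from RFC 1845 or from any existing implementation.

```python
import secrets


class ResumeTable:
    """Hypothetical server-side state for the proposed RESUME extension."""

    def __init__(self):
        # session ID -> number of message bytes already stored
        self._offsets = {}

    def resume(self, session_id=None):
        """Handle RESUME; return (session_id, offset) for the SESSION reply."""
        if session_id is None or session_id not in self._offsets:
            # First attempt, or an ID the server does not know:
            # issue a fresh, unused ID and start from byte 0.
            session_id = secrets.token_hex(8)
            while session_id in self._offsets:
                session_id = secrets.token_hex(8)
            self._offsets[session_id] = 0
        return session_id, self._offsets[session_id]

    def record(self, session_id, nbytes):
        """Note that nbytes more of the message arrived before a drop."""
        self._offsets[session_id] += nbytes

    def complete(self, session_id):
        """Forget the session once the whole message has been accepted."""
        self._offsets.pop(session_id, None)
```

Because the server picks the ID itself, it can check for collisions before handing it out, and a client only ever holds an ID that the server issued during an earlier attempt.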
So let me know where to go from there...
Cheers
franck(_at_)sopac(_dot_)org
--- Begin Message ---
At 8:09 PM -0400 5/8/01, Valdis(_dot_)Kletnieks(_at_)vt(_dot_)edu wrote:
> Am taking further followups to ietf-smtp(_at_)imc(_dot_)org where it belongs...
Seems reasonable, but I'm not subscribed, so I won't forward to that
list. Please feel free to forward any extracts you think relevant.
> Is there any evidence that it is common enough to have a network outage severe
> enough that the TCP connection gets broken, but that within 5 minutes it's
> resumable?
Well, I do some contracting in Papua New Guinea. The whole country is
fed by a 4M satellite circuit, which is usually right up on the 100%
utilisation mark. Large email messages take a long time to get
through.
Now, being in tropical latitudes, there are often rainstorms up here.
BIG rainstorms. Rainstorms so big that occasionally even the 17m
satellite dishes can no longer receive traffic.
In the wet season, one sometimes gets 10 or so outages of 2-5 minutes
each over a period of a few hours. Most TCP sessions will drop if
they're trying to push traffic unsuccessfully for over two minutes.
(You can sometimes get similar outages with a smaller window over
long-haul microwave networks around sunrise/sunset.)
Given that Franck is in Fiji, I expect he sees similar issues.
> Well.. I still think that the *proper* solution is to fix the TCP
> infrastructure so that there isn't a NEED for a RESUME....
No fibre across the sea floor so physical infrastructure can't be
fixed short term, and realistically I think that in the face of no
other information TCP is doing the right thing in dropping the
connection.
Having an upstream mail server listed in the MX records with parameters
tuned to the link would help, except that most implementations only try
higher MXs if the connection fails, not if the session fails, and
will keep reconnecting to the lower MX'd server. This is usually
solved with access lists preventing the outside from connecting
directly across the satellite, forcing connectivity through the server
tuned (larger TCP disconnect timers, among others) to forward
correctly over the link.
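The retry behaviour described above can be sketched roughly as follows. This is only an illustration of the behaviour as stated, not code from any particular mail server; the function name next_mx, the failure labels "connect" and "session", and the example hostnames are all made up.

```python
def next_mx(mx_hosts, current, failure):
    """Pick the host for the next delivery attempt.

    mx_hosts: list of (preference, hostname); lowest preference is tried first.
    current:  hostname used for the attempt that just failed.
    failure:  "connect" if the TCP connection could not be opened,
              "session" if it opened but dropped mid-transfer.
    """
    ordered = [host for _, host in sorted(mx_hosts)]
    if failure == "session":
        # A mid-session drop does not move the sender on to a higher MX,
        # so the next attempt starts again at the most preferred host.
        return ordered[0]
    # Connection failure: fall through to the next host in preference order.
    i = ordered.index(current)
    return ordered[i + 1] if i + 1 < len(ordered) else None


# Hypothetical setup: the directly reachable server has the lower MX,
# the tuned upstream relay has the higher one.
mx = [(20, "relay.example"), (10, "direct.example")]
print(next_mx(mx, "direct.example", "session"))
print(next_mx(mx, "direct.example", "connect"))
```

With an access list blocking direct connections from outside, every attempt at the lower MX becomes a connection failure, so senders do fall through to the tuned relay.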
--
Andrew Rutherford andrewr(_at_)iagu(_dot_)net http://www.iagu.net/ +61 414 313 767
Iagu Networks PO Box 256 Rundle Mall SA 5000 Australia
--- End Message ---