Hi,
We have been working on bugs in fetchmail 5.9.0/5.9.8 over the last few weeks
and we can now report some results.
We are currently running fetchmail-5.9.0 and forwarding to QMAIL on HP-UX
11.0 with IPv4.
The bugs we are working on are:
1 - File descriptor leak on connect timeout.
2 - Fetchmail hanging on SSL_connect.
The goal of this email is to explain how we tracked down our problems and to
propose a fix for them.
The two attached diffs (driver.c and socket.c) contain all the changes. They
apply to fetchmail-5.9.8.
1 - File descriptor leak on connect timeout.
a) Description
While using fetchmail 5.9.0 in our environment, we found a file descriptor
leak on connect timeout. We reproduced the same problem with fetchmail 5.9.8.
To reproduce it, we modified the SockOpen function in socket.c that is used
for an IPv4 connection. We put an infinite loop just after the connect() call
to force a connect timeout. The following diff shows exactly what we did.
--------------------------------------------------------------------------
*** socket.c.5.9.8.orig Mon Mar 11 16:47:11 2002
--- socket.c.5.9.8.test Tue Mar 12 18:08:43 2002
***************
*** 397,402 ****
--- 397,403 ----
errno = olderr;
return -1;
}
+ for(;;);
#ifndef HAVE_INET_ATON
}
#else
***************
*** 436,443 ****
}
ad.sin_port = htons(clientPort);
memcpy(&ad.sin_addr, *pptr, sizeof(struct in_addr));
! if (connect(sock, (struct sockaddr *) &ad, sizeof(ad)) == 0)
! break; /* success */
fm_close(sock); /* don't use SockClose, no traffic yet */
memset(&ad, 0, sizeof(ad));
ad.sin_family = AF_INET;
--- 437,446 ----
}
ad.sin_port = htons(clientPort);
memcpy(&ad.sin_addr, *pptr, sizeof(struct in_addr));
! if (connect(sock, (struct sockaddr *) &ad, sizeof(ad)) == 0) {
! for(;;);
! break; /* success */
! }
fm_close(sock); /* don't use SockClose, no traffic yet */
memset(&ad, 0, sizeof(ad));
ad.sin_family = AF_INET;
---------------------------------------------------------------------------
b) What the mailing list proposed
This bug has already been reported and a fix has been proposed (Eric S.
Raymond, November 08 2001). The goal of that fix is to block signals during
the critical region around the connect(2) call in SockOpen() and UnixOpen().
The SockOpen() concerned is the one used for an IPv6 connection, i.e. the one
built with "INET6_ENABLE".
c) Our fix
In the case of an IPv4 connection, SockOpen() is a different function and the
proposed fix doesn't handle the leak there. In the SockOpen() function used
for an IPv4 connection, the signals are not blocked. If a connect timeout
happens, the timeout handler jumps back to the setjmp() point and the socket
is never closed because its descriptor hasn't been saved.
To fix this problem, we propose:
- To save the descriptor of the socket that is opened just before the
connect() call in SockOpen() for an IPv4 connection. By saving it in a global
variable, we can assign it to the main socket variable when control returns
to setjmp() after a connect timeout. The socket opened before the connect()
call is then closed.
- Not to block the signals in the SockOpen() (IPv6) and UnixOpen() functions,
and to apply the same fix there. If all the signals were blocked, fetchmail
would hang on a connect timeout.
2 - Fetchmail hanging on SSL_connect.
a) Description
With 5.9.0, we had a problem using fetchmail in POP3-over-SSL mode with a
particular server. We found that fetchmail sometimes hung in the SSL_connect
function for this server. Even if this is a server problem, fetchmail
shouldn't hang. We reproduced the same problem with fetchmail 5.9.8.
To reproduce it, we put an infinite loop before the SSL_connect function call
in socket.c. The following diff shows exactly what we did:
-------------------------------------------------------
*** socket.c.5.9.8.orig Mon Mar 11 16:47:11 2002
--- socket.c.5.9.8.test Tue Mar 12 18:47:22 2002
***************
*** 925,931 ****
}
SSL_set_fd(_ssl_context[sock], sock);
!
if(SSL_connect(_ssl_context[sock]) == -1) {
ERR_print_errors_fp(stderr);
return(-1);
--- 925,932 ----
}
SSL_set_fd(_ssl_context[sock], sock);
!
! for(;;);
if(SSL_connect(_ssl_context[sock]) == -1) {
ERR_print_errors_fp(stderr);
return(-1);
-------------------------------------------------------
b) Our fix
In fetchmail 5.9.8, set_timeout(0) is called before SSL_Open() in socket.c,
which cancels the timeout. If fetchmail then hangs in SSL_connect() (i.e. in
SSL_Open()), nothing can interrupt it. To fix this problem, we propose to call
set_timeout(0) after the SSL_Open() call, so that the timeout stays active and
handles the case where fetchmail hangs in SSL_Open(). To avoid any file
descriptor leak, the socket opened with SockOpen() before SSL_Open() is closed
in the same way as in 1) File descriptor leak on connect timeout.
I have attached the two diffs that fixed bugs 1) and 2) for us.
Regards,
Sylvain Benoist.
Attachments:
diff_driver.5.9.8.c
diff_socket.5.9.8.c