I seem to be having a problem with my sockets. Below, you will see some code which forks a server and a client. The server opens a TCP socket, and the client connects to it and then closes it. Sleeps are used to coordinate the timing. After the client-side close(), the server tries to write() to its own end of the TCP connection. According to the write(2) man page, this should give me a SIGPIPE and an EPIPE errno. However, I don’t see this. From the server’s point of view, the write to a local, closed socket succeeds, and absent the EPIPE I can’t see how the server should be detecting that the client has closed the socket.
In the gap between the client closing its end and the server attempting to write, a call to netstat will show that the connection is in a CLOSE_WAIT/FIN_WAIT2 state, so the server end should definitely be able to reject the write.
For reference, I’m on Debian Squeeze, uname -r is 2.6.39-bpo.2-amd64.
What’s going on here?
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/socket.h>
#include <sys/select.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <errno.h>
#include <string.h>
#include <stdlib.h>
#include <fcntl.h>
#include <netdb.h>
#define SERVER_ADDRESS "127.0.0.7"
#define SERVER_PORT 4777
#define myfail_if( test, msg ) do { if((test)){ fprintf(stderr, msg "\n"); exit(1); } } while (0)
#define myfail_unless( test, msg ) myfail_if( !(test), msg )
int connect_client( char *addr, int actual_port )
{
int client_fd;
struct addrinfo hint;
struct addrinfo *ailist, *aip;
memset( &hint, '\0', sizeof( struct addrinfo ) );
hint.ai_socktype = SOCK_STREAM;
myfail_if( getaddrinfo( addr, NULL, &hint, &ailist ) != 0, "getaddrinfo failed." );
int connected = 0;
for( aip = ailist; aip; aip = aip->ai_next ) {
((struct sockaddr_in *)aip->ai_addr)->sin_port = htons( actual_port );
client_fd = socket( aip->ai_family, aip->ai_socktype, aip->ai_protocol );
if( client_fd == -1) { continue; }
if( connect( client_fd, aip->ai_addr, aip->ai_addrlen) == 0 ) {
connected = 1;
break;
}
close( client_fd );
}
freeaddrinfo( ailist );
myfail_unless( connected, "Didn't connect." );
return client_fd;
}
void client(){
sleep(1);
int client_fd = connect_client( SERVER_ADDRESS, SERVER_PORT );
printf("Client closing its fd... ");
myfail_unless( 0 == close( client_fd ), "close failed" );
fprintf(stdout, "Client exiting.\n");
exit(0);
}
int init_server( struct sockaddr * saddr, socklen_t saddr_len )
{
int sock_fd;
sock_fd = socket( saddr->sa_family, SOCK_STREAM, 0 );
if ( sock_fd < 0 ){
return sock_fd;
}
myfail_unless( bind( sock_fd, saddr, saddr_len ) == 0, "Failed to bind." );
return sock_fd;
}
int start_server( const char * addr, int port )
{
struct addrinfo *ailist, *aip;
struct addrinfo hint;
int sock_fd;
memset( &hint, '\0', sizeof( struct addrinfo ) );
hint.ai_socktype = SOCK_STREAM;
myfail_if( getaddrinfo( addr, NULL, &hint, &ailist ) != 0, "getaddrinfo failed." );
for( aip = ailist; aip; aip = aip->ai_next ){
((struct sockaddr_in *)aip->ai_addr)->sin_port = htons( port );
sock_fd = init_server( aip->ai_addr, aip->ai_addrlen );
if ( sock_fd >= 0 ){ /* 0 is a valid descriptor; only negative values mean failure */
break;
}
}
freeaddrinfo( ailist ); /* free the head of the list, not the loop cursor */
myfail_unless( listen( sock_fd, 2 ) == 0, "Failed to listen" );
return sock_fd;
}
int server_accept( int server_fd )
{
printf("Accepting\n");
int client_fd = accept( server_fd, NULL, NULL );
myfail_unless( client_fd >= 0, "Failed to accept" );
return client_fd;
}
void server() {
int server_fd = start_server(SERVER_ADDRESS, SERVER_PORT);
int client_fd = server_accept( server_fd );
printf("Server sleeping\n");
sleep(60);
printf( "Errno before: %s\n", strerror( errno ) );
printf( "Write result: %d\n", (int) write( client_fd, "123", 3 ) ); /* write() returns ssize_t */
printf( "Errno after: %s\n", strerror( errno ) );
close( client_fd );
}
int main(void){
pid_t clientpid;
pid_t serverpid;
clientpid = fork();
if ( clientpid == 0 ) {
client();
} else {
serverpid = fork();
if ( serverpid == 0 ) {
server();
}
else {
int clientstatus;
int serverstatus;
waitpid( clientpid, &clientstatus, 0 );
waitpid( serverpid, &serverstatus, 0 );
printf( "Client status is %d, server status is %d\n",
clientstatus, serverstatus );
}
}
return 0;
}
This is what the Linux man page says about write and EPIPE:
When Linux is using a pipe or a socketpair, it can and will check the reading end of the pair, as these two programs would demonstrate:
Linux is able to do so, because the kernel has innate knowledge about the other end of the pipe or connected pair. However, when using connect, the state about the socket is maintained by the protocol stack. Your test demonstrates this behavior, but below is a program that does it all in a single thread, similar to the two tests above:
If you run the above program, you will get output similar to this:
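To try the comparison yourself, here is a minimal single-process sketch. It is not the original listing: the helper names tcp_pair and send_twice_after_peer_close, the 100 ms pauses, and the use of MSG_NOSIGNAL (a Linux flag that returns the error without raising SIGPIPE) are all my own assumptions.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: build a connected TCP pair over loopback.
 * fds[0] is the accepted (server) end, fds[1] the connecting (client) end. */
static int tcp_pair(int fds[2])
{
    struct sockaddr_in sa;
    socklen_t len = sizeof sa;
    int lfd = socket(AF_INET, SOCK_STREAM, 0);

    if (lfd < 0)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);   /* port 0: kernel picks one */
    fds[0] = fds[1] = -1;
    if (bind(lfd, (struct sockaddr *)&sa, sizeof sa) == 0 &&
        listen(lfd, 1) == 0 &&
        getsockname(lfd, (struct sockaddr *)&sa, &len) == 0 &&
        (fds[1] = socket(AF_INET, SOCK_STREAM, 0)) >= 0 &&
        connect(fds[1], (struct sockaddr *)&sa, sizeof sa) == 0)
        fds[0] = accept(lfd, NULL, NULL);
    close(lfd);
    if (fds[0] < 0) {
        if (fds[1] >= 0)
            close(fds[1]);
        return -1;
    }
    return 0;
}

/* Close the peer end, then send one byte twice on the remaining end.
 * err[i] is 0 if send number i succeeded, otherwise its errno. */
static void send_twice_after_peer_close(int fd, int peer, int err[2])
{
    close(peer);
    poll(NULL, 0, 100);                 /* give the FIN time to arrive */
    for (int i = 0; i < 2; i++) {
        err[i] = send(fd, "x", 1, MSG_NOSIGNAL) < 0 ? errno : 0;
        poll(NULL, 0, 100);             /* give any RST time to come back */
    }
    close(fd);
}
```

Called on the two ends of socketpair(AF_UNIX, SOCK_STREAM, 0, fds), the first send already fails with EPIPE. Called on a pair from tcp_pair, the first send succeeds and only the second one fails, which on Linux I would expect to be EPIPE.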
This shows it took one write for the sockets to transition to the CLOSED states. To find out why this occurred, a TCP dump of the transaction can be useful:
The first three lines represent the 3-way handshake. The fourth line is the FIN packet the client sends to the server, and the fifth line is the ACK from the server, acknowledging receipt. The sixth line is the server trying to send 1 byte of data to the client with the PUSH flag set. The final line is the client RESET packet, which causes the TCP state for the connection to be freed, and is why the third netstat command did not result in any output in the test above.
So, the server doesn’t know the client will reset the connection until after it tries to send some data to it. The reset happens because the client called close, instead of something else.
The server cannot know for certain what system call the client has actually issued; it can only follow the TCP state. For example, we could replace the close call with a call to shutdown instead.
The difference between shutdown and close is that shutdown only governs the state of the connection, while close also governs the state of the file descriptor that represents the socket. A shutdown will not close a socket.
The output will be different with the shutdown change:
The TCP dump will also show something different:
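The shutdown variant can be sketched in a few lines as well. This is an illustration under my own assumptions, not the original program: tcp_pair and half_close_roundtrip are hypothetical names, and the 100 ms pause is arbitrary. The client half-closes with shutdown(SHUT_WR), keeps its descriptor open, and can therefore still receive what the server sends afterwards.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: build a connected TCP pair over loopback.
 * fds[0] is the accepted (server) end, fds[1] the connecting (client) end. */
static int tcp_pair(int fds[2])
{
    struct sockaddr_in sa;
    socklen_t len = sizeof sa;
    int lfd = socket(AF_INET, SOCK_STREAM, 0);

    if (lfd < 0)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);   /* port 0: kernel picks one */
    fds[0] = fds[1] = -1;
    if (bind(lfd, (struct sockaddr *)&sa, sizeof sa) == 0 &&
        listen(lfd, 1) == 0 &&
        getsockname(lfd, (struct sockaddr *)&sa, &len) == 0 &&
        (fds[1] = socket(AF_INET, SOCK_STREAM, 0)) >= 0 &&
        connect(fds[1], (struct sockaddr *)&sa, sizeof sa) == 0)
        fds[0] = accept(lfd, NULL, NULL);
    close(lfd);
    if (fds[0] < 0) {
        if (fds[1] >= 0)
            close(fds[1]);
        return -1;
    }
    return 0;
}

/* The client sends a FIN via shutdown(SHUT_WR) but keeps its socket open.
 * The server then sends 3 bytes. Returns how many bytes the client
 * received, or -1 if setup or the server's send failed. */
static int half_close_roundtrip(void)
{
    int fds[2];
    char buf[8];
    ssize_t sent, got;

    if (tcp_pair(fds) < 0)
        return -1;
    shutdown(fds[1], SHUT_WR);          /* FIN goes out, descriptor stays valid */
    poll(NULL, 0, 100);                 /* give the FIN time to arrive */
    sent = send(fds[0], "123", 3, MSG_NOSIGNAL);
    got = (sent == 3) ? recv(fds[1], buf, sizeof buf, 0) : -1;
    close(fds[0]);
    close(fds[1]);
    return (int)got;
}
```

Here the server's send succeeds and the client reads the 3 bytes, because the FIN only announced that the client has nothing more to send.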
Notice the reset at the end comes 5 seconds after the last ACK packet. This reset is due to the program shutting down without properly closing the sockets. It is the ACK packet from the client to the server before the reset that is different than before. This is the indication that the client did not use close. In TCP, the FIN indication is really an indication that there is no more data to be sent. But since a TCP connection is bi-directional, the server that receives the FIN assumes the client can still receive data. In the case above, the client in fact does accept the data.
Whether the client uses close or SHUT_WR to issue a FIN, in either case you can detect the arrival of the FIN by polling on the server socket for a readable event. If after calling read the result is 0, then you know the FIN has arrived, and you can do what you wish with that information.
Now, it is trivially true that if the server issues SHUT_WR with shutdown before it tries to do a write, it will in fact get the EPIPE error.
If, instead, you want the client to indicate an immediate reset to the server, you can force that to happen on most TCP stacks by enabling the linger option, with a linger timeout of 0 prior to calling close.
With the above change, the output of the program becomes:
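For the linger case, a minimal sketch of what the client side might look like follows. The names tcp_pair and send_errno_after_abortive_close are hypothetical, and the 100 ms pause and MSG_NOSIGNAL flag are assumptions of mine rather than part of the original test.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: build a connected TCP pair over loopback.
 * fds[0] is the accepted (server) end, fds[1] the connecting (client) end. */
static int tcp_pair(int fds[2])
{
    struct sockaddr_in sa;
    socklen_t len = sizeof sa;
    int lfd = socket(AF_INET, SOCK_STREAM, 0);

    if (lfd < 0)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);   /* port 0: kernel picks one */
    fds[0] = fds[1] = -1;
    if (bind(lfd, (struct sockaddr *)&sa, sizeof sa) == 0 &&
        listen(lfd, 1) == 0 &&
        getsockname(lfd, (struct sockaddr *)&sa, &len) == 0 &&
        (fds[1] = socket(AF_INET, SOCK_STREAM, 0)) >= 0 &&
        connect(fds[1], (struct sockaddr *)&sa, sizeof sa) == 0)
        fds[0] = accept(lfd, NULL, NULL);
    close(lfd);
    if (fds[0] < 0) {
        if (fds[1] >= 0)
            close(fds[1]);
        return -1;
    }
    return 0;
}

/* The client enables SO_LINGER with a zero timeout and closes, which makes
 * the stack send a RST instead of a FIN. Returns 0 if the server's first
 * send still succeeded, its errno if it failed, or -1 on setup failure. */
static int send_errno_after_abortive_close(void)
{
    int fds[2], err;
    struct linger lg;

    if (tcp_pair(fds) < 0)
        return -1;
    lg.l_onoff = 1;                     /* linger enabled ... */
    lg.l_linger = 0;                    /* ... with a zero timeout */
    if (setsockopt(fds[1], SOL_SOCKET, SO_LINGER, &lg, sizeof lg) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);                      /* abortive close: RST, not FIN */
    poll(NULL, 0, 100);                 /* give the RST time to arrive */
    err = send(fds[0], "x", 1, MSG_NOSIGNAL) < 0 ? errno : 0;
    close(fds[0]);
    return err;
}
```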
The send gets an immediate error in this case, but it is not EPIPE, it is ECONNRESET. The TCP dump reflects this as well:
The RESET packet comes right after the 3-way handshake completes. However, using this option has its dangers. If the other end has unread data in the socket buffer when the RESET arrives, that data will be purged, causing the data to be lost. Forcing a RESET to be sent is usually used in request/response style protocols. The sender of the request can know there can be no data lost when it receives the entire response to its request. Then, it is safe for the request sender to force a RESET to be sent on the connection.
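Coming back to the gentler option of detecting the FIN through a readable event, the check might be sketched like this. The function name peer_sent_fin is hypothetical, and using recv with MSG_PEEK instead of a plain read is my own choice so that the probe does not consume application data.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Wait up to timeout_ms for fd to become readable, then peek at it.
 * Returns 1 if the peer has sent a FIN (end of file),
 *         0 if no end of file was seen (timeout, or data is pending),
 *        -1 on error. */
static int peer_sent_fin(int fd, int timeout_ms)
{
    struct pollfd p;
    char c;
    ssize_t n;
    int r;

    p.fd = fd;
    p.events = POLLIN;
    p.revents = 0;
    r = poll(&p, 1, timeout_ms);
    if (r <= 0)
        return r;                       /* 0: timed out, -1: poll failed */

    /* MSG_PEEK leaves any data in the buffer for the real reader. */
    n = recv(fd, &c, 1, MSG_PEEK | MSG_DONTWAIT);
    if (n == 0)
        return 1;                       /* orderly shutdown by the peer */
    if (n > 0)
        return 0;                       /* data first; a FIN may be queued behind it */
    return (errno == EAGAIN || errno == EWOULDBLOCK) ? 0 : -1;
}
```

A server could call this before a write and skip the write when it returns 1. The check is inherently racy, since the FIN can still arrive between the check and the write.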