Is there a way to make Netty close a connection with a TCP FIN rather than a TCP RST when unread input data remains in the TCP receive buffer?
The use case is:
- Client (written in C) sends data packets containing many fields.
- Server reads packets, encounters an error on an early field, throws an exception.
- Exception handler catches the exception, writes an error message, and adds a close-on-write callback to the write future.
The problem is:
Data remaining in the receive buffer causes Linux (or Java) to send a TCP segment with the RST flag set when the connection is closed. This prevents the client from reading the error message: by the time it gets around to reading, the socket has already been reset and the read fails with an error.
With a straight Java socket, I believe the solution would be to call socket.shutdownOutput() before closing. Is there an equivalent function in Netty or way around this?
If I simply continue reading from the socket, it may not be enough to avoid the RST since there may or may not be data in the buffer exactly when close is called.
For reference: http://cs.baylor.edu/~donahoo/practical/CSockets/TCPRST.pdf
UPDATE:
Another reference and description of the problem: http://docs.oracle.com/javase/1.5.0/docs/guide/net/articles/connection_release.html
Calling shutdownOutput() should help with a more orderly closing of the connection (by sending a FIN), but if the client is still sending data then RST messages will be sent regardless (see answer from EJP). A shutdownOutput() equivalent may be available in Netty 4+.
Solutions are either to read all data from the client (but you can never be sure when the client will fully stop sending, especially in the case of a malicious client), or to simply wait before closing the connection after sending the response (see answer from irreputable).
Can you try this: after server writes the error message, wait for 500ms, then close(). See if the client can receive the error message now.
I’m guessing that the packets in the server receive buffer have not been ACK-ed, due to TCP delayed acknowledgement. If close() is called now, the proper response for these packets is RST. But if shutdownOutput() is invoked, it’s a graceful close process; the packets are ACK-ed first.
EDIT: another attempt after learning more about the matter:
The application protocol allows the server to respond at any time, even while the client request is still being streamed. Therefore the client should, assuming blocking mode, have a separate thread reading from the server. As soon as the client reads a response from the server, it needs to interrupt the writing thread to stop further writing to the server. This can be done by simply calling close() on the socket.
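The client side described above could be sketched in plain Java roughly as follows. This is only an illustration under assumptions: the class and method names (EarlyResponseClient, sendUntilResponse) are hypothetical, the client in the question is actually written in C, and the sketch assumes the server ends its response with a FIN so the reader thread can simply read until EOF.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;

// Hypothetical helper, not part of any library.
final class EarlyResponseClient {

    /**
     * Streams up to maxChunks copies of chunk to the server while a second
     * thread reads the server's response. When the reader sees EOF or an
     * error it closes the socket, which makes any further write fail and so
     * stops the writer. Returns whatever bytes the server sent.
     */
    static byte[] sendUntilResponse(Socket socket, byte[] chunk, int maxChunks)
            throws InterruptedException {
        ByteArrayOutputStream response = new ByteArrayOutputStream();

        Thread reader = new Thread(() -> {
            try {
                InputStream in = socket.getInputStream();
                byte[] buf = new byte[4096];
                int n;
                // Assumption: the response is terminated by FIN (EOF).
                while ((n = in.read(buf)) >= 0) {
                    response.write(buf, 0, n);
                }
            } catch (IOException ignored) {
                // Connection reset or socket closed: nothing more to read.
            } finally {
                try {
                    socket.close(); // stops the writer below
                } catch (IOException ignored) {
                    // Nothing useful to do if close itself fails.
                }
            }
        });
        reader.start();

        try {
            OutputStream out = socket.getOutputStream();
            for (int i = 0; i < maxChunks; i++) {
                out.write(chunk);
            }
            out.flush();
        } catch (IOException stopped) {
            // Expected once the reader has closed the socket.
        }

        reader.join(); // join() makes the reader's writes to 'response' visible
        return response.toByteArray();
    }
}
```

If the writer finishes before any response arrives, the call still blocks in join() until the server closes its side, so a real client would probably want a read timeout as well.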
On the server side, if the response is written before all request data are read, and close() is called afterwards, most likely RST will be sent to client. Apparently most TCP stacks send RST to the other end if close() is called when the receive buffer isn’t empty. Even if the TCP stack doesn’t do that, very likely more data will arrive immediately after close(), triggering RST anyway.
When that happens, the client will very likely fail to read the server response, hence the problem.
So the server can’t call close() immediately after writing the response; it needs to wait until the client has received the response. How does the server know that?
First, how does the client know that it has received the full response? That is, how is the response terminated? If the response is terminated by TCP FIN, the server must send FIN after the response by calling shutdownOutput(). If the response is self-terminated, e.g. by an HTTP Content-Length header, the server need not call shutdownOutput().
After the client receives the full response, per protocol, it should promptly stop sending data to the server. This is done by crudely severing the connection; the protocol does not define a more elegant way. Either FIN or RST is fine.
So the server, after writing the response, should keep reading from the client, till EOF or error. Then it can close() the socket.
However, there should be a timeout for this step, to account for malicious/broken clients and network problems. Several seconds should be sufficient to complete the step in most cases.
Also, the server may not want to read from the client, since it isn’t free. The server can simply wait past the timeout, then close().
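A minimal sketch of the "respond, half-close, keep reading until EOF, error, or timeout, then close" sequence, assuming a blocking java.net.Socket rather than a Netty channel. The class and method names (LingeringClose, respondAndClose) and the buffer size are hypothetical and chosen only for illustration; the timeout value is left to the caller.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;

// Hypothetical helper, not part of Netty or the JDK.
final class LingeringClose {

    /**
     * Writes the response, sends FIN via shutdownOutput(), then discards
     * incoming data until the client closes, an error occurs, or
     * timeoutMillis has elapsed. Only then is the socket closed.
     * Returns the number of request bytes that were read and discarded.
     */
    static long respondAndClose(Socket socket, byte[] response, int timeoutMillis)
            throws IOException {
        long discarded = 0;
        try {
            OutputStream out = socket.getOutputStream();
            out.write(response);
            out.flush();
            socket.shutdownOutput(); // graceful half-close: FIN follows the response

            long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
            InputStream in = socket.getInputStream();
            byte[] scratch = new byte[8192];
            while (true) {
                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMs <= 0) {
                    break; // overall timeout: give up on slow or malicious clients
                }
                socket.setSoTimeout((int) remainingMs); // bound each read by the time left
                int n;
                try {
                    n = in.read(scratch);
                } catch (IOException e) {
                    break; // read timeout or connection error
                }
                if (n < 0) {
                    break; // EOF: the client has closed its side
                }
                discarded += n;
            }
        } finally {
            socket.close();
        }
        return discarded;
    }
}
```

A call such as respondAndClose(socket, errorBytes, 3000) would match the "several seconds" suggested above. The cheaper variant that does not read at all would replace the loop with a plain wait until the deadline.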