I have been doing socket programming for many years, but I have never had a missed message using TCP until now. I have a Java server and a C client, both on localhost. They send short messages back and forth as strings, with some delays in between. In one particular case, a message never arrives on the client side. It is reproducible, but oddly machine-dependent.
To give some more details, I can debug the server side and see the send followed by the flush. I can attach to the client and step through the select calls (in a loop), but the message simply never shows up. Has anyone experienced this, and is there an explanation other than a coding error?
In other words, if you have a connected socket and do a write on one side and a read on the other, what can happen in the middle to cause something like this?
One other detail: I've used tcpdump on the loopback interface and can see the missed message in the capture, so it is actually being sent.
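Finally, after sniffing some more, I found the problem. Occasionally (but rarely) the server sent two messages before the client did a read, so a single read returned both of them, but my code only handled the first. This is why it seemed as though the second message never arrived: it had been received, but it was buried in the receive buffer. TCP is a byte stream and does not preserve message boundaries, so one read can return more than one message, or only part of one.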
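For anyone who hits the same thing, here is a minimal sketch of one way to avoid it on the C side: keep the bytes from each read in a buffer and pull out every complete message, not just the first. It assumes each message ends with a newline, which may not match your protocol; struct framer, framer_feed, framer_next and FRAMER_CAP are names I made up for the example, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAMER_CAP 4096  /* hypothetical buffer size, pick one that fits your messages */

/* Accumulates bytes received from the socket until complete messages can be extracted. */
struct framer {
    char   buf[FRAMER_CAP];
    size_t len;  /* number of bytes currently buffered */
};

/* Append n bytes just returned by read()/recv().
   Returns 0 on success, -1 if the bytes do not fit in the buffer. */
static int framer_feed(struct framer *f, const char *data, size_t n)
{
    assert(f->len <= FRAMER_CAP);
    if (n > FRAMER_CAP - f->len)
        return -1;
    memcpy(f->buf + f->len, data, n);
    f->len += n;
    return 0;
}

/* Pop one complete '\n'-terminated message into out (NUL-terminated, newline dropped).
   Returns 1 if a message was extracted, 0 if no complete message is buffered yet,
   -1 if out is too small to hold the message. */
static int framer_next(struct framer *f, char *out, size_t out_cap)
{
    char *nl = memchr(f->buf, '\n', f->len);
    if (nl == NULL)
        return 0;  /* only a partial message so far; wait for more bytes */

    size_t msg_len = (size_t)(nl - f->buf);
    if (msg_len + 1 > out_cap)
        return -1;

    memcpy(out, f->buf, msg_len);
    out[msg_len] = '\0';

    /* Shift any remaining bytes (later messages) to the front of the buffer. */
    size_t consumed = msg_len + 1;
    memmove(f->buf, f->buf + consumed, f->len - consumed);
    f->len -= consumed;
    return 1;
}
```

In the select loop, the idea is to call framer_feed with whatever each read returned and then call framer_next repeatedly until it returns 0, handling each message as it comes out. That way a second message that arrived in the same read gets processed immediately instead of sitting in the buffer. A length prefix instead of a newline would work just as well; the delimiter is only the simplest thing to show.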