I currently work on a multithreaded application where I receive data using the following (simplified) code:
    private void BeginReceiveCallback(IAsyncResult ar)
    {
        bytesReceived = this.Socket.EndReceive(ar);
        byte[] receivedData = new byte[bytesReceived];
        Array.Copy(buffer, receivedData, bytesReceived);
        long protocolLength = BitConverter.ToInt64(receivedData, 0);
        string protocol = Encoding.ASCII.GetString(receivedData, 8, (int)protocolLength);
        IList<object> sentObjects =
            ParseObjectsFromNetworkStream(receivedData, 8 + protocolLength);
        InvokeDataReceived(protocol, sentObjects);
    }
I’m finding that receivedData contains not only the expected data but also a lot more. I suspect the extra bytes are data sent afterwards that has been mixed in with the earlier data in the stream.
My question is: what data can I expect to be stored in this buffer? Can it contain data from two different send operations on the client side? If so, I suppose I will have to come up with a protocol that can differentiate between the data ‘messages’ sent by the client. A simple approach would be to start and end each message with a specific (unique) byte. Is there a common approach to separating messages? Furthermore, I guess this means that a single receive call might not be enough to get all the data from the client, which means I’ll have to loop until the end byte is found?
TCP/IP socket connections consist of two independent streams: one incoming and one outgoing.
This is one of the key concepts of TCP/IP that is often missed. From the perspective of the application, TCP/IP does not operate on packets; it operates on streams!
There is no method to send a packet. The API simply does not exist. When you send data, you just place those bytes in the outgoing stream. They are then read from the incoming stream on the other side.
As an example, one side can send 5 bytes and then send another 5 bytes. The receiving side can receive two batches of 5 bytes, or one byte at a time, or all 10 in a single read…
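Because a single read may return fewer bytes than were asked for, code that needs exactly N bytes has to loop. Here is a minimal sketch of that loop, assuming a blocking Stream (such as a NetworkStream wrapped around the socket); the names StreamHelpers and ReadExact are made up for illustration and are not part of the original code:

```csharp
using System;
using System.IO;

static class StreamHelpers
{
    // Hypothetical helper: reads exactly 'count' bytes from the stream.
    // Stream.Read may return fewer bytes than requested, so keep reading
    // until the requested amount has arrived.
    public static byte[] ReadExact(Stream stream, int count)
    {
        byte[] result = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(result, offset, count - offset);
            if (read == 0)
            {
                // A return value of 0 means the stream has ended.
                throw new EndOfStreamException(
                    "Stream ended before all requested bytes arrived.");
            }
            offset += read;
        }
        return result;
    }
}
```

The same idea applies to the asynchronous BeginReceive/EndReceive pattern: the byte count returned by EndReceive says how much arrived this time, not how much the other side sent in one call.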
To split the incoming stream of bytes into messages, you need message framing. One of two solutions is commonly used. The one you suggested is the delimiter solution, where SOT/EOT bytes are used to designate message boundaries. Another one (which I prefer) is the length prefix solution, where the length of the message is prefixed to the message itself.
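To make the length prefix idea concrete, here is a rough sketch (not the sample code from my blog). It assumes a 4-byte little-endian length followed by the payload; the class name LengthPrefixFramer, its methods, and the 1 MiB size limit are all assumptions chosen for illustration:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class: buffers received bytes and extracts complete
// messages framed as [4-byte little-endian length][payload].
class LengthPrefixFramer
{
    // Assumed upper bound, so a corrupt or hostile prefix cannot
    // make the receiver buffer an unbounded amount of data.
    private const int MaxMessageSize = 1 << 20;

    private readonly List<byte> pending = new List<byte>();

    // Sender side: prepend the payload length to the payload.
    public static byte[] Frame(byte[] payload)
    {
        int n = payload.Length;
        byte[] framed = new byte[4 + n];
        framed[0] = (byte)n;
        framed[1] = (byte)(n >> 8);
        framed[2] = (byte)(n >> 16);
        framed[3] = (byte)(n >> 24);
        Buffer.BlockCopy(payload, 0, framed, 4, n);
        return framed;
    }

    // Receiver side: feed whatever one receive call returned and get
    // back every message completed so far (possibly none, possibly several).
    public List<byte[]> Append(byte[] data, int count)
    {
        for (int i = 0; i < count; i++)
        {
            pending.Add(data[i]);
        }

        var messages = new List<byte[]>();
        while (pending.Count >= 4)
        {
            int length = pending[0] | (pending[1] << 8)
                       | (pending[2] << 16) | (pending[3] << 24);
            if (length < 0 || length > MaxMessageSize)
            {
                throw new InvalidOperationException("Invalid message length.");
            }
            if (pending.Count < 4 + length)
            {
                break; // Payload incomplete; wait for more data.
            }
            messages.Add(pending.GetRange(4, length).ToArray());
            pending.RemoveRange(0, 4 + length);
        }
        return messages;
    }
}
```

In a receive callback like the one in the question, the idea would be to call Append(buffer, bytesReceived) on one framer per connection and parse only the messages it returns. Leftover bytes stay buffered until the next receive completes them.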
A more thorough discussion is on my blog, along with sample code for length prefixing.