I’m writing a C program composed of one dispatcher thread and N worker threads, whose responsibilities are described below:
dispatcher thread:
listen on a TCP port;
call epoll_wait() repeatedly on the listening socket;
when a connection is established, accept it and pass the new file descriptor (i.e. what accept() returns) to one of the N worker threads;
worker thread:
upon each new connection, read repeatedly until no more data is received;
pass all the received data to the decode function, which decodes it into a message structure (i.e. an RTSP message);
What I wonder is this: if the data a worker thread reads is incomplete, should I cache it? That is, should I maintain a global list that holds the unused data (i.e. received but not yet forming a full message, so not used yet) for each connection?
If you use one worker per socket, I guess there is no problem: you just block until you have the whole message. I’m assuming this is not your case.
If you use a worker for handling several sockets in a non-blocking manner, you could use this approach:
Start by reading the data into a buffer of a pre-determined size. (Try to match the buffer size to the maximum possible message length; this will save you copies.)
Determine the total message length (from the header of your protocol) and calculate how many more bytes you need to read to finish the whole message. At this point you may also have already read “too much”, in which case you should allocate another buffer for the “next” message; if you want to be more generic, you could keep n such buffers (based on the minimal message length and the size of the buffer you read into).
You could also choose to always read only the header and continue from there (this makes sure you never read too much), but it is more wasteful (you need two reads per message).
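For RTSP specifically, the header block ends with an empty line (CRLF CRLF) and the body size, if any, is given by the Content-Length header. The following is only a minimal sketch of the "determine the total message length" step under that assumption; the function name rtsp_message_length is made up for illustration, and it ignores overflow and malformed headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Hypothetical helper: returns the total size (headers + body) of the
 * first message in buf, or 0 if the header block is not complete yet.
 * The caller compares the result with the number of bytes buffered. */
size_t rtsp_message_length(const char *buf, size_t len)
{
    assert(buf != NULL);

    /* Find the blank line that terminates the headers. */
    size_t hdr_end = 0;
    for (size_t i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0) {
            hdr_end = i + 4;
            break;
        }
    }
    if (hdr_end == 0)
        return 0;   /* headers incomplete: keep the bytes and read more */

    /* Look for a Content-Length header at the start of a line. */
    static const char key[] = "Content-Length:";
    const size_t klen = sizeof key - 1;
    size_t body = 0;
    for (size_t i = 0; i + klen <= hdr_end; i++) {
        if ((i == 0 || buf[i - 1] == '\n') &&
            strncasecmp(buf + i, key, klen) == 0) {
            size_t j = i + klen;
            while (j < hdr_end && (buf[j] == ' ' || buf[j] == '\t'))
                j++;
            while (j < hdr_end && buf[j] >= '0' && buf[j] <= '9')
                body = body * 10 + (size_t)(buf[j++] - '0');
            break;
        }
    }
    return hdr_end + body;
}
```

If the returned total is larger than the number of bytes buffered, the difference is how much is still missing; if it is smaller, the excess bytes already belong to the next message.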
If the message is fully read, process it; otherwise, keep the buffer and the number of bytes still to read for this message, and loop again through the sockets (your epoll set).
On your next handling of the same socket, check whether you currently have a partial message and, if so, continue reading into the same buffer from the position where you stopped last time. You need to read the next x bytes here, and you must be prepared to receive fewer than you expect.
Here you could also add an optimization: read everything that is available on this socket in one shot (not only the x bytes left for the current message), which saves you some system calls. If you do that, you’ll need to use vectored reads (readv() or similar).
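As a rough illustration of that optimization, a single readv() call can fill the remainder of the current message buffer first and let any excess spill into a second buffer reserved for the next message. This is only a sketch: read_split and struct split_read are hypothetical names, and error handling is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>   /* readv, struct iovec */
#include <unistd.h>

/* How the bytes of one readv() call were distributed. */
struct split_read {
    size_t into_cur;    /* bytes that completed (part of) the current message */
    size_t into_next;   /* excess bytes that belong to the next message */
};

/* Hypothetical helper: cur_tail points just past the bytes already
 * buffered for the current message, need is how many are still missing.
 * Returns what readv() returned (0 on EOF, -1 with errno set on error,
 * which may be EAGAIN on a non-blocking socket). */
ssize_t read_split(int fd, char *cur_tail, size_t need,
                   char *next, size_t next_cap, struct split_read *out)
{
    assert(need > 0 && next_cap > 0);

    struct iovec iov[2] = {
        { .iov_base = cur_tail, .iov_len = need },
        { .iov_base = next,     .iov_len = next_cap },
    };
    ssize_t n = readv(fd, iov, 2);
    if (n < 0)
        return -1;

    /* readv fills iov[0] completely before touching iov[1]. */
    out->into_cur  = (size_t)n < need ? (size_t)n : need;
    out->into_next = (size_t)n - out->into_cur;
    return n;
}
```

When into_cur equals need, the current message is complete and the next buffer already holds into_next bytes of the following one.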
If you go without the optimization stuff, it is pretty simple to handle.
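For the simple version, here is a minimal sketch of the per-connection state, assuming one fixed-size buffer per socket kept alongside the file descriptor (rather than in a global list). The names struct conn, conn_fill and conn_consume, and the 4096-byte size, are invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define CONN_BUF_SIZE 4096   /* assumed maximum message size */

struct conn {
    int    fd;
    size_t used;                  /* bytes buffered but not yet decoded */
    char   buf[CONN_BUF_SIZE];
};

int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

void conn_init(struct conn *c, int fd)
{
    c->fd = fd;
    c->used = 0;
}

/* Reads whatever is available, appending after the bytes kept from the
 * previous call. Returns 1 when the socket has no more data for now or
 * the buffer is full, 0 when the peer closed, -1 on error. */
int conn_fill(struct conn *c)
{
    assert(c->used <= sizeof c->buf);
    while (c->used < sizeof c->buf) {
        ssize_t n = read(c->fd, c->buf + c->used, sizeof c->buf - c->used);
        if (n > 0) {
            c->used += (size_t)n;
            continue;
        }
        if (n == 0)
            return 0;
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return 1;             /* partial data stays in c->buf */
        return -1;
    }
    return 1;
}

/* Drops the first n bytes (one decoded message) and moves any bytes of
 * the next message to the front of the buffer. */
void conn_consume(struct conn *c, size_t n)
{
    assert(n <= c->used);
    memmove(c->buf, c->buf + n, c->used - n);
    c->used -= n;
}
```

After each conn_fill(), the worker would check whether buf holds a complete message, decode it, and call conn_consume() with its length; anything left over simply waits for the next epoll event on that socket.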