I am developing my own protocol over UDP (under Linux) for a cache application (similar to memcached) which only executes INSERT/READ/UPDATE/DELETE operations on an object and I am not sure which design would be the best:
- Send one request per packet. (client prepares the request and sends it to the server immediately)
- Send multiple requests per packet. (client enqueues the requests in a packet and when it is full (close to the MTU size) sends it to the server)
The size of a request (i.e. the record data) can be anywhere from 32 bytes to 1400 bytes. I don’t know what the average will be, since it depends entirely on the user’s application.
- If I choose a single request per packet, I will have to manage a lot of small packets and the kernel will be interrupted many times. This will slow things down, since registers must be saved and restored on every switch between user space and kernel space. There will also be overhead in data transmission: if the user’s application sends many 32-byte requests (the IP and UDP headers add about 28 bytes per packet), network traffic will nearly double and transmission speed will suffer. However, high network traffic does not necessarily imply low performance, since the NIC has its own processor and does not stall the CPU. An additional network card can be installed in case of a network bottleneck.
  The big advantage of a single request per packet is that the server and client will be so simple that I will execute fewer instructions and gain speed. At the same time I will have fewer bugs and the project will be finished earlier.
- If I use multiple requests per packet, I will have fewer but bigger packets, and therefore more data can be transmitted over the network. The number of system calls will go down, but the added complexity of the server will require more memory and more instructions, so it is unclear whether this would actually run faster. The CPU may turn out to be the bottleneck, but which is cheaper to add: a CPU or a network card?
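To make the "multiple requests per packet" option concrete, here is a minimal sketch of one possible framing, not a recommendation for the final wire format. It assumes each request is prefixed with a 2-byte length and that a datagram may carry at most 1472 bytes (a 1500-byte Ethernet MTU minus 20 bytes of IPv4 header and 8 bytes of UDP header). The names `pack_requests`, `unpack_datagram` and `MAX_PAYLOAD` are hypothetical.

```python
import struct

# Assumption: 1500-byte Ethernet MTU - 20 (IPv4 header) - 8 (UDP header).
MAX_PAYLOAD = 1472
# Assumption: every request is preceded by a 2-byte big-endian length.
LEN_PREFIX = struct.Struct("!H")


def pack_requests(requests, max_payload=MAX_PAYLOAD):
    """Greedily pack length-prefixed requests into as few datagrams as fit."""
    datagrams, current = [], bytearray()
    for req in requests:
        framed = LEN_PREFIX.pack(len(req)) + req
        if len(framed) > max_payload:
            raise ValueError("request does not fit in one datagram")
        if len(current) + len(framed) > max_payload:
            # The current datagram is full: close it and start a new one.
            datagrams.append(bytes(current))
            current = bytearray()
        current += framed
    if current:
        datagrams.append(bytes(current))
    return datagrams


def unpack_datagram(data):
    """Split one received datagram back into its individual requests."""
    requests, offset = [], 0
    while offset < len(data):
        (length,) = LEN_PREFIX.unpack_from(data, offset)
        offset += LEN_PREFIX.size
        if offset + length > len(data):
            raise ValueError("truncated request")
        requests.append(data[offset:offset + length])
        offset += length
    return requests
```

The extra server-side work in this sketch is a single loop over the datagram, so the added complexity is mostly in deciding when the client should stop waiting and send a partially filled packet.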
The application should handle a heavy load, around 100,000 requests per second on the latest CPUs. I am not sure which way to go. I am leaning towards ‘single request per packet’, but before I rewrite all the code I have already written for handling multiple requests, I would like to ask for recommendations.
Thanks in advance.
What do you care about more: latency or bandwidth?
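The trade-off behind that question can be sketched in code: a batching client has to decide how long a request may wait for company before it is sent. The following is only an illustration under assumed parameters (a 1472-byte size limit and a 1 ms maximum delay); the `Batcher` class and its methods are hypothetical names, and the actual sending is left to the caller.

```python
import time


class Batcher:
    """Queues requests; a batch goes out when it is full or too old."""

    def __init__(self, max_bytes=1472, max_delay=0.001, clock=time.monotonic):
        self.max_bytes = max_bytes    # size limit: favours bandwidth
        self.max_delay = max_delay    # age limit in seconds: bounds added latency
        self.clock = clock
        self.pending = []
        self.size = 0
        self.oldest = None            # time the first pending request arrived

    def add(self, request):
        """Queue a request; return the previous batch if this one did not fit."""
        flushed = None
        if self.pending and self.size + len(request) > self.max_bytes:
            flushed = self.flush()
        if not self.pending:
            self.oldest = self.clock()
        self.pending.append(request)
        self.size += len(request)
        return flushed

    def due(self):
        """True when the oldest pending request has waited max_delay or longer."""
        return bool(self.pending) and self.clock() - self.oldest >= self.max_delay

    def flush(self):
        """Hand the pending requests to the caller and start an empty batch."""
        batch, self.pending, self.size, self.oldest = self.pending, [], 0, None
        return batch
```

Setting `max_delay` to zero degenerates to one request per packet (lowest latency), while a larger value lets packets fill up (better bandwidth use) at the cost of making each request wait.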
NOTE: The network, not the CPU, will likely be your major bottleneck in either case, unless you are running over an extremely fast network. And even if you are, the INSERT/READ/UPDATE/DELETE operations in the database will likely cost more CPU and I/O than the CPU work needed to handle the packets.
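A rough back-of-the-envelope check of the network load, using the figures from the question (100,000 requests per second, 32 to 1400 bytes per request, about 28 bytes of IP and UDP headers per packet). This is only an estimate: it ignores Ethernet framing, any per-request framing inside a batch, and the replies. The function name is hypothetical.

```python
def wire_mbit_per_s(request_bytes, requests_per_s,
                    requests_per_packet=1, header_bytes=28):
    """Estimate the one-way traffic in Mbit/s for a given request mix."""
    packets_per_s = requests_per_s / requests_per_packet
    bytes_per_s = requests_per_s * request_bytes + packets_per_s * header_bytes
    return bytes_per_s * 8 / 1e6


# 32-byte requests, one per packet
print(wire_mbit_per_s(32, 100_000))
# 32-byte requests, 40 batched per packet (an arbitrary batch size)
print(wire_mbit_per_s(32, 100_000, requests_per_packet=40))
# 1400-byte requests, one per packet
print(wire_mbit_per_s(1400, 100_000))
```

Under these assumptions the small requests come to roughly 48 Mbit/s unbatched and about 26 Mbit/s batched, while 1400-byte requests need about 1142 Mbit/s, which is more than a gigabit link can carry regardless of how the requests are packed.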