I am working on a LAN based solution with a “server” that has to control a number of “players”
My protocol of choice is UDP because its easy, I do not need connections, my traffic consists only of short commands from time to time and I want to use a mix of broadcast messages for syncing and single target messages for player individual commands.
Multicast TCP would be an alternative, but its more complicated, not exactly suited for the task and often not well supported by hardware.
Unfortunately I am running into a strange problem:
The first datagram which is sent to a specific ip using “sendto” is lost.
Any datagram sent short time afterwards to the same ip is received.
But if i wait some time (a few minutes) the first “sendto” is lost again.
Broadcast datagrams always work.
Local sends (to the same computer) always work.
I presume the operating system or the router/switch has some translation table from IP to MAC addresses which gets forgotten when not being used for some minutes and that unfortunately causes datagrams to be lost.
I could observe that behaviour with different router/switch hardware, so my suspect is the windows networking layer.
I know that UDP is by definition “unreliable” but I cannot believe that this goes so far that even if the physical connection is working and everything is well defined packets can get lost. Then it would be literally worthless.
Technically I am opening an UDP Socket,
bind it to a port and INADRR_ANY.
Then I am using “sendto” and “recvfrom”.
I never do a connect – I dont want to because I have several players. As far as I know UDP should work without connect.
My current workaround is that I regularly send dummy datagrams to all specific player ips – that solves the problem but its somehow “unsatisfying”
Question: Does anybody know that problem? Where does it come from? How can I solve it?
Edit:
I boiled it down to the following test program:
int _tmain(int argc, _TCHAR* argv[])
{
WSADATA wsaData;
WSAStartup(MAKEWORD(2, 2), &wsaData);
SOCKET Sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
SOCKADDR_IN Local = {0};
Local.sin_family = AF_INET;
Local.sin_addr.S_un.S_addr = htonl(INADDR_ANY);
Local.sin_port = htons(1234);
bind(Sock, (SOCKADDR*)&Local, sizeof(Local));
printf("Press any key to send...\n");
int Ret, i = 0;
char Buf[4096];
SOCKADDR_IN Remote = {0};
Remote.sin_family = AF_INET;
Remote.sin_addr.S_un.S_addr = inet_addr("192.168.1.12"); // Replace this with a valid LAN IP which is not the hosts one
Remote.sin_port = htons(1235);
while(true) {
_getch();
sprintf(Buf, "ping %d", ++i);
printf("Multiple sending \"%s\"\n", Buf);
// Ret = connect(Sock, (SOCKADDR*)&Remote, sizeof(Remote));
// if (Ret == SOCKET_ERROR) printf("Connect Error!\n", Buf);
Ret = sendto(Sock, Buf, strlen(Buf), 0, (SOCKADDR*)&Remote, sizeof(Remote));
if (Ret != strlen(Buf)) printf("Send Error!\n", Buf);
Ret = sendto(Sock, Buf, strlen(Buf), 0, (SOCKADDR*)&Remote, sizeof(Remote));
if (Ret != strlen(Buf)) printf("Send Error!\n", Buf);
Ret = sendto(Sock, Buf, strlen(Buf), 0, (SOCKADDR*)&Remote, sizeof(Remote));
if (Ret != strlen(Buf)) printf("Send Error!\n", Buf);
}
return 0;
The Program opens an UDP Socket, and sends 3 datagrams in a row on every keystroke to a specific IP.
Run that whith wireshark observing your UDP traffic, press a key, wait a while and press a key again.
You do not need a receiver on the remote IP, makes no difference, except you wont get the black marked “not reachable” packets.
This is what you get:

As you can see the first sending initiated a ARP search for the IP. While that search was pending the first 2 of the 3 successive sends were lost.
The second keystroke (after the IP search was complete) properly sent 3 messages.
You may now repeat sending messages and it will work until you wait (about a minute until the adress translation gets lost again) then you will see dropouts again.
That means: There is no send buffer when sending UDP messages and there are ARP requests pending! All messages get lost except the last one.
Also “sendto” does not block until it successfully delivered, and there is no error return!
Well, that surprises me and makes me a little bit sad, because it means that I have to live with my current workaround or implement an ACK system that only sends one message at a time and then waits for reply – which would not be easy any more and imply many difficulties.
I’m posting this long after it’s been answered by others, but it’s directly related.
Winsock drops UDP packets if there’s no ARP entry for the destination address (or the gateway for the destination).
Thus it’s quite likely some of the first UDP packet gets dropped as at that time there’s no ARP entry – and unlike most other operating systems, winsock only queues 1 packet while the the ARP request completes.
This is documented here:
The same behavior can be observed on Mac OS X and FreeBSD: