This is an old problem I abandoned a while ago because I could find no solution and it only affected one server (so I just moved my service somewhere else). Another user has a problem with symptoms identical to mine:
- C# .NET synchronous TCP server
- a TcpClient object is assigned by blocking on a TcpListener with the AcceptTcpClient method
- once there’s a TcpClient object, I pass it to a thread that invokes the client’s GetStream method to create a NetworkStream
- the thread loops on this NetworkStream, calling networkStream.Read(someBuffer, 0, 4096) on each iteration
- right now client and server are located on the same network, with no congestion to speak of
- my server has plenty of memory to spare
- if I load my server software onto another machine, the problem goes away
- the kicker: traffic from a Linux box on the same network gets through fine and on time
The tcpClient.AcceptTcpClient() method blocks for around a minute at a time, so the server ends up reading one huge block of bytes a short while later. What it should do instead is call networkStream.Read() on small blocks of bytes as often as they are sent (the client sends them every 5 s, not once a minute).
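To pin down which call is actually stalling, the loop described above can be wrapped in timing code. The following is only a minimal sketch under assumptions: the class and method names (ReadTimingProbe, ServeOneClient) are made up for illustration and are not from the original server, and it handles a single client on the calling thread rather than handing it to a worker thread.

```csharp
using System;
using System.Diagnostics;
using System.Net.Sockets;

// Hypothetical diagnostic harness; the names here are illustrative only.
public static class ReadTimingProbe
{
    // Accepts one client on an already-started listener and logs how long
    // each blocking call takes. Returns the total number of bytes read
    // before the client closed the connection.
    public static int ServeOneClient(TcpListener listener)
    {
        Stopwatch clock = Stopwatch.StartNew();
        using (TcpClient client = listener.AcceptTcpClient())
        {
            Console.WriteLine("accept returned after {0} ms", clock.ElapsedMilliseconds);

            NetworkStream stream = client.GetStream();
            byte[] buffer = new byte[4096];
            int total = 0;

            while (true)
            {
                // Reset + Start instead of Restart, which needs .NET 4.
                clock.Reset();
                clock.Start();

                int n = stream.Read(buffer, 0, buffer.Length);
                if (n == 0)
                {
                    break; // Read returns 0 when the peer has closed the connection
                }

                total += n;
                Console.WriteLine("{0:HH:mm:ss.fff} read {1} bytes after blocking {2} ms",
                    DateTime.Now, n, clock.ElapsedMilliseconds);
            }

            return total;
        }
    }
}
```

Called as ReadTimingProbe.ServeOneClient(listener) after listener.Start(), the log timestamps can be lined up against the packet analyzer's capture times to see whether the delay sits in the accept or in the reads.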
Previous comments to the other user suggest subpar networking or connectivity issues might be to blame, which at first seems reasonable. But this isn’t actually the case.
I went one step further and installed packet analyzers at both the client and server. Results:
- the instant the client sends a packet, it shows up on the server’s analyzer
- network latency and connectivity are NOT the problem
- the packets/frames are arriving at the server at the correct time
- somewhere between the network interface card that my analyzer is monitoring and my application something is causing this delay
- the .NET runtime is the only thing between my application and the network interface
- some kind of socket error in .NET is the cause of this huge latency
Environment:
- in my specific case I’m using an Intel PRO/1000 MT network card, and .NET
- Windows Server 2003 R2 Standard Edition, SP2
- .NET Frameworks installed: 2.0 SP2, 3.0 SP2, 3.5 SP1, 4 Client Profile, 4 Extended
If anyone has any advice I would very much like to know what it is.
This may be due to one of the following.
That model of network card has been observed to have trouble with TCP offload in other situations. You can disable this in the device driver configuration.
If it is a problem with offloaded handling of segmentation, then you may find it only occurs on certain network routes, which may explain the difference you observed between your Linux client and your Windows client.
Example: http://forums.novell.com/novell-product-support-forums/netware/nw-other/communications/187741-offload-tcp-segmentation-intel-pro-1000-mt.html
Path MTU is supposed to be automatically discovered, but if an intervening router is dropping all ICMP traffic (including “needs fragmentation”) then you may see hanging connections. In your case the connection succeeds eventually, so I don’t think this is your problem, but it is worth checking. (You can also reduce the MTU and alter the MTU discovery algorithm if necessary, but you should probably leave this alone unless this is your issue and you can’t fix the router.)
If the Windows machine is in a domain it may be attempting and failing to set up an IPSec relationship. This will depend on the configuration of both the client and the server. Normally this would fail quickly, but if you are blocking some IPSec traffic, you may see it failing slowly. Look for IKE and IPSec traffic in your network analyser.
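One rough way to check the path MTU from code is to send an ICMP echo with the Don't Fragment flag set and a payload sized to fill a given MTU. This is a sketch only: PathMtuProbe and its methods are hypothetical names, the 28-byte overhead assumes an IPv4 header without options plus the 8-byte ICMP echo header, and the TTL and timeout values are arbitrary.

```csharp
using System;
using System.Net.NetworkInformation;

// Hypothetical helper; names and default values are illustrative only.
public static class PathMtuProbe
{
    // Assumes IPv4 without options: 20-byte IP header + 8-byte ICMP echo header.
    private const int HeaderOverhead = 28;

    // Largest echo payload that fits in one unfragmented packet of the given MTU.
    public static int PayloadForMtu(int mtu)
    {
        return mtu - HeaderOverhead;
    }

    // Sends a single echo request sized for the given MTU with Don't Fragment set
    // and returns the reply status.
    public static IPStatus Probe(string host, int mtu)
    {
        byte[] payload = new byte[PayloadForMtu(mtu)];
        PingOptions options = new PingOptions(64, true); // ttl = 64, dontFragment = true

        using (Ping ping = new Ping())
        {
            PingReply reply = ping.Send(host, 2000, payload, options); // 2000 ms timeout
            return reply.Status;
        }
    }
}
```

If PathMtuProbe.Probe(host, 1500) returns IPStatus.Success, full-size frames are getting through. IPStatus.PacketTooBig suggests something on the path has a smaller MTU and is reporting it. A timeout at 1500 combined with success at a smaller size would be consistent with ICMP being dropped somewhere, though a timeout on its own can have other causes.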