I have a .Net 2.0 application written in C# that monitors other Windows XP computers on a LAN. On some systems, after a long uptime (40 to 120 days), the .Net Ping starts to fail, while ping from the Windows command prompt still succeeds.
Once this failure has occurred, it appears that all .Net Pings fail. A separate .Net application using similar code also fails.
Here is a sample of the code:
    internal static bool canPingHost(string host)
    {
        bool success = false;
        const int PING_TIMEOUT_MS = 1000;
        try
        {
            using (Ping p = new Ping())
            {
                PingReply pr = p.Send(host, PING_TIMEOUT_MS);
                if (pr.Status == IPStatus.Success)
                {
                    success = true;
                }
            }
        }
        catch
        {
        }
        return success;
    }
Key points about the setup for this issue:
- All PCs are plugged in to the same unmanaged switch
- All other PCs can use the same .Net Ping to talk to the problem system.
- Windows ping works correctly on the problem system.
- Any .Net 2.0 application tried on the problem system fails.
- Database operations to and from the problem system also work (TCP connection)
- Stopping and starting the application does not fix the issue on the problem system.
Once the system had failed, I ran another application that produces further debugging information:
    static string doping(IPAddress IP)
    {
        int PING_TIMEOUT_MS = 3000;
        string rv = IP.ToString();
        using (Ping p = new Ping())
        {
            bool success = false;
            PingReply pr = null;
            try
            {
                pr = p.Send(IP, PING_TIMEOUT_MS);
                success = pr.Status == IPStatus.Success;
            }
            catch (Exception ex)
            {
                rv = rv + " [ " + ex.Message + " ] ";
            }
            if (pr != null)
            {
                if (success)
                {
                    rv = rv + " yes " + pr.RoundtripTime.ToString();
                }
                else
                {
                    rv = rv + " no " + pr.Status.ToString();
                }
            }
            else
            {
                rv = rv + " no (fail) ";
            }
        }
        return rv;
    }
The output from the program is 192.168.0.2 no 1450.
PingReply.Status returns 1450, which is not a value defined in the IPStatus enum.
After restarting the problem computer, .Net Ping starts to work correctly again.
It looks like there is a resource problem of some description. I’m not sure which resource it could be.
I have read about issues with asynchronous Pings and .Net 2.0. This is a synchronous ping and as far as I can tell it is not affected.
I’m looking for:
- Prevention of the problem in the first place
- Suggestions to debug the remote system once it fails (production system, Windows XP SP3, no developer tools installed)
- Monitoring resources to determine which one is failing.
Caveats:
- Rebooting the problem system on a regular basis is not currently an option.
- Upgrading to the latest version of the .Net Framework is not currently an option.
- Changing the software to no longer use .Net Ping is an option but I would still like to know what is going on.
There was another application running on the same system that was consuming vast quantities (> 100 MB) of Pool Nonpaged Bytes. The usage built up over a period of several weeks.
This was determined by looking at the Performance counters for the entire system. The one that stood out was Pool Nonpaged Bytes: it was a very large number, and it was all concentrated in one application.
Restarting the other application caused the nonpaged pool usage to return to a much lower value. Ping then completed successfully. The lack of nonpaged pool memory appeared to cause the Ping failure.
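The 1450 status from the debug output fits this explanation: 1450 is the Win32 error code ERROR_NO_SYSTEM_RESOURCES ("Insufficient system resources exist to complete the requested service"), which suggests the raw Win32 error was passed through as the IPStatus value. Below is a minimal sketch of how the debug application could report such values more readably. The class name PingStatusDecoder and the method Describe are hypothetical, introduced only for illustration, and the message text comes from the operating system, so it is only meaningful on Windows.

```csharp
using System;
using System.ComponentModel;
using System.Net.NetworkInformation;

// Hypothetical helper, not part of the original application.
static class PingStatusDecoder
{
    // Returns the enum name for known IPStatus values. For anything else,
    // treats the number as a Win32 error code (an assumption) and appends
    // the system's message for that code.
    internal static string Describe(IPStatus status)
    {
        if (Enum.IsDefined(typeof(IPStatus), status))
        {
            return status.ToString();
        }

        int raw = (int)status;
        return raw + " (" + new Win32Exception(raw).Message + ")";
    }
}
```

In the doping method above, pr.Status.ToString() could be replaced with PingStatusDecoder.Describe(pr.Status) to get the system's description next to the number.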
Information on Pool Nonpaged Bytes can be found on the page Pushing the Limits of Windows: Paged and Nonpaged Pool. When the system runs out of nonpaged pool, many resource allocations can be denied, leading to an exceedingly unstable system.
Very unusual issue. The resolution had nothing directly to do with the Ping code.
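For monitoring without developer tools, the same counters that Performance Monitor shows can be read from a small .Net 2.0 console program. This is a rough sketch, assuming the standard English counter names ("Memory" and "Process" categories, "Pool Nonpaged Bytes" counter) and that the account running it is allowed to read performance counters. The class name NonpagedPoolReport is made up for this example.

```csharp
using System;
using System.Diagnostics;

// Hypothetical diagnostic program: prints nonpaged pool usage for the
// whole system and for each running process.
static class NonpagedPoolReport
{
    static void Main()
    {
        // System-wide nonpaged pool usage.
        using (PerformanceCounter total =
            new PerformanceCounter("Memory", "Pool Nonpaged Bytes"))
        {
            Console.WriteLine("System total: {0:N0} bytes", total.NextValue());
        }

        // Per-process usage, to find which application holds the pool.
        PerformanceCounterCategory processes =
            new PerformanceCounterCategory("Process");
        foreach (string instance in processes.GetInstanceNames())
        {
            if (instance == "_Total")
            {
                continue; // aggregate instance, not a real process
            }

            try
            {
                using (PerformanceCounter perProcess = new PerformanceCounter(
                    "Process", "Pool Nonpaged Bytes", instance, true))
                {
                    Console.WriteLine("{0}: {1:N0} bytes",
                        instance, perProcess.NextValue());
                }
            }
            catch (InvalidOperationException)
            {
                // The process exited between enumeration and reading.
            }
        }
    }
}
```

Running this on a schedule and logging the output would show which process grows over the weeks before Ping starts to fail.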