I have an Azure web role that calls an external WCF-based SOAP web service (port 80) for various bits of data. The response from this service is highly erratic, and I routinely get the following error:
    There was no endpoint listening at http://www.myexternalservice.com/service.svc that could accept the message. This is often caused by an incorrect address or SOAP action.
To isolate the problem, I created a simple console app that calls this service repeatedly at 1-second intervals and logs every response:
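    // stopwatch is a System.Diagnostics.Stopwatch declared outside this block
    using (var svc = new MyExternalService())
    {
        stopwatch.Restart(); // reset first so each call is timed on its own
        var response = svc.CallService();
        stopwatch.Stop();
        Log(response, stopwatch.ElapsedMilliseconds);
    }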
If I RDP to one of my Azure web instances and run this app, it takes 10 to 20 attempts before it gets a valid response from the external service. These first attempts always fail with the above error. After this “warm up” period it runs fine. If I stop the app and then immediately restart it, it has to go through the same “warm up” period again.
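To put a number on the “warm up” period, the failed attempts can be counted until the first success. This is only a minimal sketch, not the logger I actually ran: `WarmUpProbe` and `CountFailuresBeforeSuccess` are hypothetical names, and the `attempt` delegate is assumed to wrap the service call and return false when it throws.

```csharp
using System;
using System.Threading;

static class WarmUpProbe
{
    // Calls `attempt` until it returns true or maxAttempts is reached.
    // Returns the number of failed attempts before the first success,
    // or -1 if every attempt failed.
    public static int CountFailuresBeforeSuccess(Func<bool> attempt, int maxAttempts, TimeSpan delay)
    {
        for (int failures = 0; failures < maxAttempts; failures++)
        {
            if (attempt())
            {
                return failures;
            }
            Thread.Sleep(delay); // wait before the next attempt
        }
        return -1;
    }
}
```

With the test app above, `attempt` would create the client, call `CallService()` inside a try/catch, and return false on the "no endpoint listening" exception. A delay of `TimeSpan.FromSeconds(1)` matches the interval I used.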
However, if I run this same app from any other machine, I receive valid responses immediately. I have run this logger app on servers in multiple (non-Azure) data centers, desktops on different networks, etc. These test runs are always very stable.
I am not sure why this service would behave this way in the Azure environment. Unfortunately, in the short term I am forced to call this service, but my users cannot tolerate this inconsistency.
A capture of network traffic on the Azure server shows a large number of SynReTransmits at 10-second intervals during the same period in which I experience the connection errors. Once the “warm up” is complete, the SynReTransmits no longer occur.
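A SYN retransmit means the initial SYN was never answered, so the TCP handshake itself is not completing. One way to take WCF out of the picture is to time a bare TCP connection to the same host and port. This is only a sketch under that assumption: `TcpConnectProbe` and `Probe` are hypothetical names, and it uses nothing beyond `TcpClient` and `Stopwatch`.

```csharp
using System;
using System.Diagnostics;
using System.Net.Sockets;

static class TcpConnectProbe
{
    // Opens and immediately closes a plain TCP connection, then reports
    // whether the connect succeeded and how long the attempt took.
    public static string Probe(string host, int port)
    {
        var stopwatch = Stopwatch.StartNew();
        string outcome;
        try
        {
            using (var client = new TcpClient())
            {
                client.Connect(host, port); // blocks until connected or failed
            }
            outcome = "connected";
        }
        catch (SocketException ex)
        {
            outcome = "failed: " + ex.SocketErrorCode;
        }
        stopwatch.Stop();
        return string.Format("{0}:{1} {2} after {3} ms",
            host, port, outcome, stopwatch.ElapsedMilliseconds);
    }
}
```

Calling `TcpConnectProbe.Probe("www.myexternalservice.com", 80)` in the same 1-second loop should show whether the slow or failed attempts already occur at connect time, before any SOAP message is sent.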
We found a solution to this problem, although I am not completely happy with it. After exhausting all other courses of action, we changed the load balancer from Layer 4 to Layer 7 load balancing. This fixed the problem of lost requests, but I am not sure why it made a difference.