We’re having a strange CPU issue with WCF services hosted on IIS 6.0 on Windows Server 2003 SP2 x64. In some of our environments, seemingly when an app pool starts up, the svchost.exe running iissvcs spins up to 100% CPU and stays there indefinitely (3+ days observed).
Investigation tidbits:
- It doesn’t seem to be tied to any specific app pool (we’re hosting 5+ WCF services, each in its own app pool), nor does it happen every time.
- The w3wp.exe process is started, but uses almost no memory or CPU; it looks like very early initialization.
- It seems strange that it’s actually svchost.exe and not w3wp.exe that’s at 100%, so I suspect a configuration issue. It never seems to touch our code.
- We have 3 environments that have the issue and 1 that doesn’t. Unfortunately the environments are installed manually, so they’re not entirely identical.
- An environment only seems to become ‘infected’ by one of our two branches. But once the issue has appeared for the first time, it remains even if the other branch is deployed.
- The process seems to be doing no I/O, neither disk nor network.
- Anti-virus software has been disabled on one of the servers.
I’ve spent countless hours searching for the issue, but I haven’t been able to find any related issues on SO / Google.
Investigating dumps with WinDbg shows the following stack trace for the active thread:
00000000000afaa8 0000000077d6e4a6 ntdll!NtReadFile+0xa
00000000000afab0 000007ff7fefe89e kernel32!ReadFile+0x1e0
00000000000afb50 000007ff7fefe7cd advapi32!ScGetPipeInput+0x3e
00000000000afbc0 000007ff7fee4ec9 advapi32!ScDispatcherLoop+0xa0
00000000000afca0 0000000100002b29 advapi32!StartServiceCtrlDispatcherW+0x119
00000000000aff10 00000001000029be svchost!wmainCRTStartup+0x18a
00000000000aff50 0000000077d596ac svchost!wmainCRTStartup+0xe
00000000000aff80 0000000000000000 kernel32!BaseProcessStart+0x29
However, dumping a working svchost.exe gives the same result.
Stack traces of all threads:
- faulted svchost.exe: http://pastebin.com/wQwYykbd
- working svchost.exe: http://pastebin.com/Fw1UnuRE
I wanted to include the traces here, but since they’re rather long, I’ve put them on pastebin for the time being.
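Since the active thread looks identical in both dumps, one way to narrow things down is to compare the full thread listings and see which symbols only appear in the faulted process. The following is a rough sketch, assuming each frame line has the same three-column layout as the trace above (stack pointer, return address, `module!function+0xoffset`). The function names `parse_frames` and `symbols_only_in` are made up for illustration, and symbols containing spaces or `+` (for example some C++ templates or operators) are not handled.

```python
import re
from collections import Counter

# One frame line as shown in the question:
# <16 hex digits> <16 hex digits> module!function+0xoffset
FRAME_RE = re.compile(
    r"^\s*([0-9a-fA-F]{16})\s+([0-9a-fA-F]{16})\s+"
    r"([^!\s]+)!([^\s+]+)(?:\+0x[0-9a-fA-F]+)?\s*$"
)


def parse_frames(text):
    """Return 'module!function' for every line that looks like a stack frame."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            # Drop addresses and offsets so frames compare across processes.
            frames.append(f"{match.group(3)}!{match.group(4)}")
    return frames


def symbols_only_in(faulted_text, working_text):
    """Symbols that occur in the faulted dump but never in the working one."""
    faulted = Counter(parse_frames(faulted_text))
    working = set(parse_frames(working_text))
    return sorted(sym for sym in faulted if sym not in working)
```

To use it, the two pastebin listings could be saved locally (say as `faulted.txt` and `working.txt`, names chosen here only for the example), read with `open(...).read()`, and passed to `symbols_only_in`. Any symbol it reports is only a starting point for a closer look in WinDbg, not proof of a fault.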
Any insight into what could be causing this – or ways to investigate further – is most welcome.
A couple of points:
This thread is encountering an exception while it is trying to allocate memory:
Threads 13, 16, and 17 have the same problem. This looks like heap corruption, but without the dump file it is difficult to verify. Somewhere, the app's dynamic memory management code has a bug.
There have been several bugs in iisw3adm, so it would be best to patch the machines to the most current level. Otherwise, the next step is to check for heap corruption.