I’ve hit a dead end with one of the clients using my software. Out of about 40 copies of our product sold (an application written in VB.NET 2005 on .NET 2.0), about 2 become non-responsive, with one core of the dual-core CPU stuck at 100% (the program uses only one core).
The most logical guess is an infinite loop causing this behavior, but there are thousands of lines of code with many, many loops. That is all the information I’ve got; now, how do you suggest I approach debugging this problem?
EDIT:
Basically, the software is responsible for calculating the amount of credit spent on other devices, such as PCs. It is a cybercafe management program and it fails intermittently, i.e. it is subtracting credit when it fails. It does other things in the background too, like checking whether it is time to create a database backup.
EDIT:
Solved. It was the most unlikely problem. The Access Database Engine, which I use as the DBMS, is actually the part of my application that is problematic. It has difficulty working with a row (JUST ONE FRIGGIN ROW) in one of the tables. I can’t delete it or add a record related to that row in any other table; even MS Access 2007 causes the CPU to go up to 100% when I try to work with that row!
A simple “Compact and Repair” command fixed everything. I guess I’ll issue that command every time my application starts up. That would prevent this from happening again.
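For anyone wanting to do the same, here is a rough sketch of a compact-on-startup step. It is written in Python for brevity rather than VB.NET, and it is not the code I actually shipped. The names `jet_compact` and `compact_on_startup` are made up for illustration, the Jet 4.0 provider string assumes an .mdb file, and the COM call needs pywin32 on Windows. The database must not be open in your application while this runs.

```python
import os
import shutil


def jet_compact(src_path, dst_path):
    """Compact src_path into dst_path through the JRO COM object (Windows only)."""
    import win32com.client  # imported lazily; requires the pywin32 package

    # Assumption: a Jet 4.0 .mdb file. Other file formats need another provider.
    provider = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source="
    engine = win32com.client.Dispatch("JRO.JetEngine")
    engine.CompactDatabase(provider + src_path, provider + dst_path)


def compact_on_startup(db_path, compact=jet_compact):
    """Compact db_path in place and keep the old file as db_path + '.bak'."""
    tmp_path = db_path + ".compact.tmp"
    bak_path = db_path + ".bak"

    # Clear leftovers from an earlier failed run; the engine is expected
    # to refuse an already existing destination file.
    if os.path.exists(tmp_path):
        os.remove(tmp_path)

    compact(db_path, tmp_path)

    # Only swap files once compaction has produced a non-empty result.
    if not os.path.exists(tmp_path) or os.path.getsize(tmp_path) == 0:
        raise RuntimeError("compaction produced no output; original left untouched")

    shutil.copy2(db_path, bak_path)  # keep the old file in case the repair dropped data
    os.replace(tmp_path, db_path)    # move the compacted file into place
    return bak_path
```

The compaction goes to a temporary file first, so a failed run leaves the original database untouched. The `compact` parameter is there only so the file-swapping logic can be tried out without the COM object.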
Thanks to WinDbg I could find where the problem was. I recommend everyone to learn how to use it ’cause it’s a real time saver.
Install WinDbg (the Windows debugger) on the target machine. Start the debugger and attach it to the suspicious process, let the program continue running, and wait until the problem happens. When it does, break into the debugger and enter the following command at the debugger command line:
!runaway
This will show which of your threads are consuming the most CPU time. Then take several call stacks from the thread that is consuming most of your CPU resources.
Here is an example:
User Mode Time
Thread Time
0:1074 0 days 0:00:21.637
11:137c 0 days 0:00:02.792
4:12c8 0 days 0:00:00.530
9:1374 0 days 0:00:00.046
15:13d0 0 days 0:00:00.000
14:1204 0 days 0:00:00.000
13:154c 0 days 0:00:00.000
12:144c 0 days 0:00:00.000
10:1378 0 days 0:00:00.000
8:1340 0 days 0:00:00.000
7:12f0 0 days 0:00:00.000
6:12d4 0 days 0:00:00.000
5:12d0 0 days 0:00:00.000
3:12c4 0 days 0:00:00.000
2:12c0 0 days 0:00:00.000
1:12b4 0 days 0:00:00.000
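If you collect this output from several machines, or want to compare two snapshots taken a few seconds apart, a small script can do the reading for you. This is only a sketch in Python: `parse_runaway` and `fastest_growing` are names invented here, and the parser assumes rows shaped like the ones above. Since the times are cumulative, the thread whose time grows between two snapshots is the one spinning right now, which is not necessarily the one at the top of the list.

```python
import re

# One "!runaway" row looks like: "11:137c 0 days 0:00:02.792"
ROW = re.compile(
    r"^\s*(\d+):([0-9a-fA-F]+)\s+(\d+)\s+days\s+(\d+):(\d+):(\d+\.\d+)\s*$"
)


def parse_runaway(text):
    """Return (thread_index, thread_id_hex, seconds) tuples, busiest thread first."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # skip header lines such as "User Mode Time"
        index, tid, days, hours, minutes, seconds = match.groups()
        total = (int(days) * 86400 + int(hours) * 3600
                 + int(minutes) * 60 + float(seconds))
        rows.append((int(index), tid, total))
    return sorted(rows, key=lambda row: row[2], reverse=True)


def fastest_growing(before, after):
    """Compare two parsed snapshots; return (thread_index, seconds_gained)."""
    earlier = {index: secs for index, _tid, secs in before}
    gains = [(index, secs - earlier.get(index, 0.0)) for index, _tid, secs in after]
    return max(gains, key=lambda gain: gain[1])
```

To get the text out of the debugger, you could wrap the `!runaway` calls in `.logopen` and `.logclose` and feed the resulting log file to `parse_runaway`.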
Now assume we want a call stack for the second thread in the list, thread 11, so we first switch to thread 11. This can be done by entering ~11s.
eax=03fbb270 ebx=ffffffff ecx=00000002 edx=00000060 esi=00000000 edi=00000000
eip=77475e74 esp=0572f60c ebp=0572f67c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
ntdll!KiFastSystemCallRet:
77475e74 c3 ret
Now get a call stack for this thread by executing kp. In addition to the stack frames, kp prints the parameters of each function. Local variables can be printed with dv.
Alternatively, you can use Process Explorer from Sysinternals.
If all this is not possible, because it is a remote client machine, install userdump, which creates a dump file that can be sent to you for further analysis. You can create a batch file for the customer to invoke userdump with the correct parameters. Userdump is a tool from Microsoft, which can be downloaded from their web page.