I am trying to debug some work that processes large files. The code itself works, but there are sporadic errors reported from the .NET Runtime itself. For context, the processing here is a 1.5GB file (loaded into memory once only) being processed and released in a loop, deliberately to try to reproduce this otherwise unpredictable error.
My test fragment is basically:
    try {
        byte[] data = File.ReadAllBytes(path);
        for (int i = 0; i < 500; i++)
        {
            ProcessTheData(data); // deserialize and validate
            // force collection, for tidiness
            GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
            GC.WaitForPendingFinalizers();
        }
    } catch (Exception ex) {
        Console.WriteLine(ex.Message);
        // some more logging; StackTrace, recursive InnerException, etc
    }
(with some timing and other stuff thrown in)
The loop will run successfully for a non-deterministic number of iterations – no problems whatsoever; then the process will terminate abruptly. The exception handler is not hit. The test does involve a lot of memory use, but it saw-tooths very nicely during each iteration (there is no obvious memory leak, and I have plenty of headroom – 14GB unused primary memory at the worst point in the saw-tooth). The process is 64-bit.
The Windows error log contains 3 new entries, which (via exit code 80131506) suggest an Execution Engine error – a nasty little critter. A related answer suggests a GC error, with a “fix” to disable concurrent GC; however, this “fix” does not prevent the issue.
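For reference, on the .NET Framework concurrent GC is normally turned off in the application's configuration file. The fragment below is a minimal sketch that assumes a standard app.config (deployed as TestProgram.exe.config or similar); the question does not show how the setting was actually applied, so treat this as illustrative only.

```xml
<configuration>
  <runtime>
    <!-- Disables concurrent (background) GC for this process -->
    <gcConcurrent enabled="false"/>
  </runtime>
</configuration>
```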
Clarification: this low-level error does not hit the CurrentDomain.UnhandledException event.
Clarification: the GC.Collect is there only to monitor the saw-toothing memory, to check for memory leaks and to keep things predictable; removing it does not make the problem go away: it just makes it keep more memory between iterations, and makes the dmp files bigger ;p
By adding more console tracing, I have observed it faulting during each of:
- during deserialization (lots of allocations, etc)
- during GC (between a GC “approach” and a GC “complete”, using the GC notification API)
- during validation (just a foreach over some of the data) – curiously just after a GC “complete” during the validation
So lots of different scenarios.
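The GC “approach” / “complete” tracing mentioned above can be sketched roughly as follows. This is not the actual test code: the GcTrace class, the threshold values, and the log format are assumptions made for illustration. Note that GC.RegisterForFullGCNotification throws InvalidOperationException when concurrent GC is enabled, so the sketch assumes it has been disabled.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not from the original test): logs full-GC
// "approach" and "complete" notifications from a background thread,
// so the last trace line before a crash shows which phase was active.
static class GcTrace
{
    public static Thread Start()
    {
        // Thresholds must be between 1 and 99; 10 is an arbitrary choice.
        // Throws InvalidOperationException if concurrent GC is enabled.
        GC.RegisterForFullGCNotification(10, 10);

        var thread = new Thread(() =>
        {
            while (true)
            {
                // Blocks until a full GC is approaching (or the wait is cancelled).
                if (GC.WaitForFullGCApproach() != GCNotificationStatus.Succeeded)
                    break;
                Console.WriteLine("{0:O} GC approach", DateTime.UtcNow);

                // Blocks until that full GC has completed.
                if (GC.WaitForFullGCComplete() != GCNotificationStatus.Succeeded)
                    break;
                Console.WriteLine("{0:O} GC complete", DateTime.UtcNow);
            }
        });
        thread.IsBackground = true; // do not keep the process alive
        thread.Start();
        return thread;
    }
}
```

Calling GcTrace.Start() once before the loop would be enough; GC.CancelFullGCNotification() ends the wait loop when the test finishes.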
I can obtain crash-dump (dmp) files; how can I investigate this further, to see what the system is doing when it fails so spectacularly?
If you have memory dumps, I’d suggest using WinDbg to look at them, assuming that you’re not doing that already.
Try running the command !EEStack (mixed native and managed stack trace), and see if anything jumps out in the stack trace. In my test program, where I was purposefully corrupting the heap, this is where I found the stack trace on one of the occasions a FEEE happened.
Since this could be related to heap corruption from the garbage collector, I would try the !VerifyHeap command. At least you could make sure that the heap is intact (and your problem lies elsewhere), or discover that your issue might actually be with the GC or with some P/Invoke routines corrupting it.
If you find that the heap is corrupt, I might try to discover how much of the heap is corrupted, which you might be able to do via !HeapStat. That might just show the entire heap corrupt from a certain point, though.
It’s difficult to suggest any other methods to analyze this via WinDbg, since I have no real clue about what your code is doing or how it’s structured.
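Put together, a WinDbg session over one of the dumps might look roughly like the following. The .loadby line is an assumption: it presumes a .NET 4 process (clr.dll) and a matching SOS on the debugging machine; older runtimes load SOS relative to mscorwks instead.

```text
$$ Load SOS from the same directory as the CLR in the dump (assumes .NET 4)
.loadby sos clr
$$ Mixed native and managed stack trace for all threads
!EEStack
$$ Check the managed heap for corruption
!VerifyHeap
$$ Per-generation heap statistics
!HeapStat
```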
I suppose that if you find it is an issue with the heap, which would mean it could be GC weirdness, I would look at the CLR GC events in Event Tracing for Windows.
If the minidumps you’re getting aren’t cutting it and you’re using Windows 7/2008R2 or later, you can use Global Flags (gflags.exe) to attach a debugger when the process terminates without an exception, if you’re not getting a WER notification.
In the Silent Process Exit tab, enter the name of the executable, not the full path to it (i.e. TestProgram.exe). Use the following settings:
{path to debugging tools}\cdb.exe -server tcp:port=5005 -g -G -p %e
Then apply the settings.
When your test program crashes, cdb will attach and wait for you to connect to it. Start WinDbg, press Ctrl+R, and use the connection string:
tcp:port=5005,server=localhost
You might be able to skip remote debugging and instead use:
{path to debugging tools}\windbg.exe %e
However, the reason I suggested remote debugging is that WerFault.exe, which I believe is what reads the registry and launches the monitor process, will start the debugger in Session 0.
You can make Session 0 interactive and connect to the window station, but I can’t remember how that’s done. It’s also inconvenient, because you’d have to switch back and forth between sessions whenever you need to access any of the windows you already had open.