I’ve run into a situation where, according to a minidump, certain files are causing a stack overflow in a recursive-descent parser. Unfortunately I can’t get my hands on an example of a file that does this in order to reproduce the issue (the client has confidentiality concerns), which leaves me a bit hamstrung on diagnosing the real problem for the moment.
Clearly the parser needs some attention, but right now my top priority is to just keep the program running. As a stopgap measure, what can I do to keep this from bringing down the whole program?
My first choice would be to find some way to anticipate that I’m running out of room on the stack so that I can gracefully abort the parser before the overflow happens. Failing to parse the file is an acceptable option. The second choice would be to let it happen, catch the error and log it, then continue with the rest of the data.
The parsing is happening in a Parallel.ForEach() loop. I’m willing to swap that out for some other approach if that will help.
EDIT: What would be really killer is if I could just get the size of the current thread’s stack, and the position of the stack pointer. Is this possible?
EDIT 2: I finally managed to wring a sample file out of someone and trap the error in a debugger. It turns out it's not code that belongs to us at all: the exception is happening somewhere in HtmlAgilityPack. So it looks like I'm going to have to try a completely different tack.
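The stack has a 1 MB limit by default on the desktop CLR, but you can increase it for threads you create yourself by passing a maxStackSize value to the Thread constructor.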
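As a rough sketch of the larger-stack option: the `Thread(ThreadStart, int maxStackSize)` constructor is part of the framework, while `BigStackRunner`, `Run`, and the 16 MB figure in the usage note are names and values I made up for illustration. This assumes the parsing work can be moved onto a dedicated thread; `Parallel.ForEach` runs on thread-pool threads, whose stack size this does not change.

```csharp
using System;
using System.Threading;

static class BigStackRunner
{
    // Hypothetical helper: runs `work` on a dedicated thread with the
    // requested stack size, waits for it, and returns its result.
    public static T Run<T>(Func<T> work, int maxStackSizeBytes)
    {
        T result = default(T);
        Exception failure = null;

        var thread = new Thread(() =>
        {
            // Ordinary exceptions are captured and rethrown on the caller.
            // A real StackOverflowException cannot be caught here; it still
            // terminates the process.
            try { result = work(); }
            catch (Exception ex) { failure = ex; }
        }, maxStackSizeBytes);

        thread.Start();
        thread.Join();

        if (failure != null)
            throw new InvalidOperationException("Work on the large-stack thread failed.", failure);
        return result;
    }
}
```

Usage would look something like `var doc = BigStackRunner.Run(() => ParseFile(path), 16 * 1024 * 1024);`, where `ParseFile` stands in for whatever your parsing entry point is. This only buys headroom; a sufficiently nested input will still overflow.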
You can use continuation-passing style so that pending work lives on the heap instead of the stack.
In C# 5.0, there's an async mechanism provided by the compiler that automates this process. I haven't tried this with the latest build. As mentioned by Alex, there is no support for tail-call optimization in C#, and this might be a big enough reason to adopt F# for parsing problems. Here's some material on lexing and parsing with F#. YMMV, as demonstrated in this article.
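Full continuation-passing style is awkward to show briefly in C#, so here is a simpler, related technique instead: replacing recursion with an explicit, heap-allocated `Stack<T>`. This is not CPS proper, just another way to move the "what to do next" bookkeeping off the call stack. `Node`, `TreeWalker`, and `MaxDepth` are hypothetical names, and the tree shape is an assumption, not anything from your parser.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical tree node, standing in for whatever the parser produces.
sealed class Node
{
    public readonly List<Node> Children = new List<Node>();
}

static class TreeWalker
{
    // Computes the maximum nesting depth without recursion. Pending work is
    // kept in a Stack<T> on the heap, so the depth of the input does not
    // consume the thread's call stack.
    public static int MaxDepth(Node root)
    {
        if (root == null) return 0;

        int max = 0;
        var pending = new Stack<KeyValuePair<Node, int>>(); // node + its depth
        pending.Push(new KeyValuePair<Node, int>(root, 1));

        while (pending.Count > 0)
        {
            KeyValuePair<Node, int> item = pending.Pop();
            if (item.Value > max) max = item.Value;

            foreach (Node child in item.Key.Children)
                pending.Push(new KeyValuePair<Node, int>(child, item.Value + 1));
        }
        return max;
    }
}
```

The trade-off is that memory use now grows with the input on the heap, so a hostile file can still exhaust memory, just far later than it would exhaust a 1 MB stack.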
You’d also need graph cycle detection to make your program solid in the presence of bad inputs.
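One possible shape for that cycle check, assuming the structure being walked can be viewed as nodes with outgoing references: an iterative depth-first search that tracks which nodes are on the current path. `GraphNode`, `Next`, and `CycleCheck.HasCycle` are hypothetical names for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical node type with outgoing references.
sealed class GraphNode
{
    public readonly List<GraphNode> Next = new List<GraphNode>();
}

static class CycleCheck
{
    // Returns true if a cycle is reachable from `start`. Uses an explicit
    // stack so the check itself cannot overflow on deep inputs.
    public static bool HasCycle(GraphNode start)
    {
        var onPath = new HashSet<GraphNode>(); // nodes on the current DFS path
        var done = new HashSet<GraphNode>();   // nodes fully explored
        // Each frame is a node plus the index of its next child to visit.
        var stack = new Stack<KeyValuePair<GraphNode, int>>();

        onPath.Add(start);
        stack.Push(new KeyValuePair<GraphNode, int>(start, 0));

        while (stack.Count > 0)
        {
            KeyValuePair<GraphNode, int> frame = stack.Pop();
            GraphNode node = frame.Key;
            int i = frame.Value;

            if (i == node.Next.Count)
            {
                // All children visited: the node leaves the current path.
                onPath.Remove(node);
                done.Add(node);
                continue;
            }

            stack.Push(new KeyValuePair<GraphNode, int>(node, i + 1));
            GraphNode child = node.Next[i];

            if (onPath.Contains(child)) return true; // back edge: cycle
            if (done.Contains(child)) continue;      // already cleared

            onPath.Add(child);
            stack.Push(new KeyValuePair<GraphNode, int>(child, 0));
        }
        return false;
    }
}
```

Because `GraphNode` does not override `Equals`, the hash sets compare by reference, which is what you want when the same object can be reached twice.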
As a way to collect more info, you can thread an accumulator integer through the recursive calls to track how deep your call stack is. This will not directly translate into memory consumed by said call stack, but it gives you a general idea. For example, you could throw and catch your own exception when that number is greater than some user-configurable or predefined threshold.
At the call site, catch that exception, log the failure, and move on to the next file.
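A minimal sketch of that depth counter, using a toy grammar of nested parentheses in place of your real one. Everything here (`ParenParser`, `ParseDepthExceededException`, `ParserDemo.TryParse`, the limit values) is hypothetical; the point is only the `depth` parameter and the guard at the top of the recursive method.

```csharp
using System;

// Hypothetical exception thrown when nesting exceeds the configured limit.
sealed class ParseDepthExceededException : Exception
{
    public ParseDepthExceededException(int limit)
        : base("Nesting exceeded the configured limit of " + limit + ".") { }
}

// Toy recursive-descent parser for: group := '(' group* ')'
sealed class ParenParser
{
    private readonly string _text;
    private readonly int _maxDepth;
    private int _pos;

    public ParenParser(string text, int maxDepth)
    {
        _text = text;
        _maxDepth = maxDepth;
    }

    // Parses the whole input and returns the number of groups found.
    public int Parse()
    {
        int count = 0;
        while (_pos < _text.Length) count += ParseGroup(1);
        return count;
    }

    private int ParseGroup(int depth)
    {
        // The guard: abort cleanly long before the real stack runs out.
        if (depth > _maxDepth) throw new ParseDepthExceededException(_maxDepth);

        Expect('(');
        int count = 1;
        while (_pos < _text.Length && _text[_pos] == '(')
            count += ParseGroup(depth + 1); // pass the accumulator down
        Expect(')');
        return count;
    }

    private void Expect(char c)
    {
        if (_pos >= _text.Length || _text[_pos] != c)
            throw new FormatException("Expected '" + c + "' at position " + _pos + ".");
        _pos++;
    }
}

static class ParserDemo
{
    // Call site: a too-deep input is logged and reported as a failed parse
    // instead of taking the process down.
    public static bool TryParse(string text, int maxDepth, out int groups)
    {
        groups = 0;
        try
        {
            groups = new ParenParser(text, maxDepth).Parse();
            return true;
        }
        catch (ParseDepthExceededException ex)
        {
            Console.Error.WriteLine("Skipping input: " + ex.Message);
            return false;
        }
    }
}
```

The threshold has to be picked conservatively, since the stack space used per level depends on the method's locals and on how many other frames sit between recursive calls. This only works for code you own, so it would not help if the recursion is inside a third-party library.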
As requested, I’ll link you up to the fabulous blog by Eric Lippert on this very topic.