A part of an application I’m working on is a simple pthread-based server that communicates over a TCP/IP socket. I am writing it in C because it’s going to be running in a memory constrained environment. My question is: what should the program do if one of the threads encounters a malloc() that returns NULL? Possibilities I’ve come up with so far:
- No special handling. Let malloc() return NULL and let it be dereferenced so that the whole thing segfaults.
- Exit immediately on a failed malloc(), by calling abort() or exit(-1). Assume that the environment will clean everything up.
- Jump out of the main event loop and attempt to pthread_join() all the threads, then shut down.
The first option is obviously the easiest, but seems very wrong. The second one also seems wrong since I don’t know exactly what will happen. The third option seems tempting except for two issues: first, all of the threads need not be joined back to the main thread under normal circumstances and second, in order to complete the thread execution, most of the remaining threads will have to call malloc() again anyway.
What shall I do?
This is one of the reasons that space / rad-hard systems generally forbid dynamic memory allocation. When malloc() fails, it's extremely hard to ‘cure’ the failure. You do have some options:
- You don't have to use the system malloc() (at all, or as usual). You can wrap malloc() to do extra work on failures, such as notifying something else. This is helpful when using something like a watchdog. You can also use a full-blown garbage collector, though I don't recommend it. It's better to identify and fix leaks.
- You can allocate from a fixed pool, or use an alternative malloc() that won't oversell memory. If you have profiled your heap usage extensively (using a tool like Valgrind's massif or similar), you can reasonably size the pool.

However, what most of those suggestions boil down to is not trusting / using the system malloc() if failure is not an option.

In your case, I think the best thing you can do is make sure a watchdog is notified in the event that malloc() fails, so that your process (or the whole system) can be restarted. You don't want it looking ‘alive and running’ while in deadlock. This could be as simple as just unlinking a file.

Write very detailed logs. In what file / line / function did the failure happen?
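To make the wrapper, watchdog and logging ideas concrete, here is a minimal sketch. The names checked_malloc, CHECKED_MALLOC, report_alloc_failure, heartbeat_create and the path in HEARTBEAT_PATH are all made up for illustration, and it assumes some external watchdog restarts the service when the heartbeat file disappears. It also assumes that writing to stderr still works under memory pressure, which is likely but not guaranteed.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical path: assumes an external watchdog restarts the service
 * once this file no longer exists. */
#define HEARTBEAT_PATH "/tmp/server.alive"

/* Create the heartbeat file at startup. Returns 0 on success, -1 on error. */
int heartbeat_create(const char *path)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    return fclose(f) == 0 ? 0 : -1;
}

/* Log where the allocation failed, then notify the watchdog by unlinking
 * the heartbeat file. Returns the result of unlink(): 0 on success. */
int report_alloc_failure(const char *path, size_t size,
                         const char *file, int line, const char *func)
{
    fprintf(stderr, "%s:%d (%s): malloc(%zu) failed\n",
            file, line, func, size);
    return unlink(path);
}

/* malloc() wrapper: same contract as malloc(), plus logging and watchdog
 * notification on failure. The caller must still check for NULL. */
void *checked_malloc(size_t size, const char *file, int line,
                     const char *func)
{
    void *p = malloc(size);
    if (p == NULL && size != 0)
        report_alloc_failure(HEARTBEAT_PATH, size, file, line, func);
    return p;
}

/* Records the call site automatically. */
#define CHECKED_MALLOC(size) \
    checked_malloc((size), __FILE__, __LINE__, __func__)
```

A thread would call heartbeat_create(HEARTBEAT_PATH) once at startup and then use CHECKED_MALLOC(n) wherever it used malloc(n). The wrapper only reports the failure; what the thread does next with the NULL is still up to the caller.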
If malloc() fails when trying to get just a few KB, it's a good sign that your process really can't continue reliably anyway. If it fails grabbing a few hundred MB, you may be able to recover and keep going. By that token, whatever action you take should be based on just how much memory you were trying to get, and on whether calls to allocate a much smaller size still succeed.

The one thing you never want to do is just operate on NULL pointers and let it crash. It's just sloppy, provides no useful logging of where things went wrong, and gives the impression that your software is of low / unstable quality.
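The size-based decision described above could look roughly like this. It is only a sketch: the thresholds SMALL_REQUEST_LIMIT and PROBE_SIZE are arbitrary placeholders, and the names classify_failure, small_alloc_still_works and try_alloc are invented for the example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical thresholds; tune them for the real workload. */
#define SMALL_REQUEST_LIMIT (64u * 1024u) /* at or below: treat as fatal */
#define PROBE_SIZE          1024u         /* size of the follow-up probe */

enum alloc_verdict { ALLOC_FATAL, ALLOC_RECOVERABLE };

/* Pure policy, kept separate from the probe so it can be unit tested.
 * A failed small request is fatal; a failed large request is recoverable
 * only if a much smaller allocation still succeeds. */
enum alloc_verdict classify_failure(size_t requested, int small_alloc_ok)
{
    if (requested <= SMALL_REQUEST_LIMIT)
        return ALLOC_FATAL;
    return small_alloc_ok ? ALLOC_RECOVERABLE : ALLOC_FATAL;
}

/* Returns 1 if a small allocation still succeeds, 0 otherwise. */
int small_alloc_still_works(void)
{
    void *probe = malloc(PROBE_SIZE);
    if (probe == NULL)
        return 0;
    /* The volatile store keeps the compiler from optimizing the
     * malloc()/free() pair away. */
    *(volatile char *)probe = 0;
    free(probe);
    return 1;
}

/* Allocate `size` bytes. On failure, store the verdict in *verdict and
 * return NULL; on success, *verdict is left untouched. */
void *try_alloc(size_t size, enum alloc_verdict *verdict)
{
    void *p = malloc(size);
    if (p == NULL && size != 0)
        *verdict = classify_failure(size, small_alloc_still_works());
    return p;
}
```

On ALLOC_RECOVERABLE a thread might drop the current request and carry on; on ALLOC_FATAL it would log, notify the watchdog and shut down. Keep in mind that on systems that overcommit memory, malloc() can return a non-NULL pointer even when the memory is not really there, so this check only helps where malloc() fails honestly.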