Information about the application:
- Linux – 2.4.1 Kernel
- m68k based embedded application
- Single process multithreaded application
Our application installs a handler for SIGSEGV, segmentation_handler. The handler creates a log file, writes a plain string to it (such as “obtained stack frame”), and then uses backtrace and backtrace_symbols to write the full stack trace into the same file.
Problem: We do get a SIGSEGV (confirmed by the fact that the log file is created), but the file is empty (0 KB). Even the first string, which is a plain string, is missing from the file.
I want to understand in which scenarios this can happen. We could fix the crash if we had the stack trace, but we don’t have it, and the mechanism meant to capture it did not work either 🙁
void segmentation_handler(int signal_no) {
    char buffer[512]; .............
    InitLog(); // Create a log file
    printf("\n*** segmentation fault occurred ***\n");
    fflush(stdout);
    memset(buffer, 0, 512);
    size = backtrace (array, 50);
    strings = backtrace_symbols (array, size);
    sprintf(buffer, "Obtained %d stack frames.\n", size);
    Log(buffer); // Write the buffer into the file
    for (n = 0; n < size; n++) {
        sprintf(buffer, "%s\n", strings[n]);
        Log(buffer);
    }
    CloseLog();
}
Your segmentation handler is very naive and contains multiple errors. Here is a short list:
- You are calling printf() and several other functions that are not async-signal-safe. Consider: printf() uses a lock internally to synchronize calls on the same FILE stream from multiple threads. What if the segmentation fault occurred in the middle of a printf() call, while the lock was held? You would deadlock inside the segmentation handler…
- You are allocating memory (the call to backtrace_symbols()), but if the segmentation fault was caused by malloc arena corruption (a very likely cause of segmentation violations), you would double fault inside the segmentation handler.
- If multiple threads fault at the same time, the handler will open the file multiple times and overwrite the log.
Either of the first two would stop the handler after InitLog() has created the file but before the first Log() call, which is consistent with the empty file you see.
There are other problems, but these are the basics…
A video of my lecture on how to write proper fault handlers is available here: http://free-electrons.com/pub/video/2008/ols/ols2008-gilad-ben-yossef-fault-handlers.ogg