In a comment on the question “Automatically release mutex on crashes in Unix” back in 2010, jilles claimed:
glibc’s robust mutexes are so fast because glibc takes dangerous shortcuts. There is no guarantee that the mutex still exists when the kernel marks it as “will cause EOWNERDEAD”. If the mutex was destroyed and the memory replaced by a memory mapped file that happens to contain the last owning thread’s ID at the right place and the last owning thread terminates just after writing the lock word (but before fully removing the mutex from its list of owned mutexes), the file is corrupted. Solaris and will-be-FreeBSD9 robust mutexes are slower because they do not want to take this risk.
I can’t make any sense of the claim, since destroying a mutex is not legal unless it’s unlocked (and thus not in any thread’s robust list). I also can’t find any references searching for such a bug/issue. Was the claim simply erroneous?
I ask because this bears on the correctness of my own implementation, which is built on the same Linux robust-mutex primitive.
FreeBSD pthread developer David Xu describes the race here: http://lists.freebsd.org/pipermail/svn-src-user/2010-November/003668.html
I don’t think the munmap/mmap cycle is strictly required for the race. The piece of shared memory might be put to a different use as well. This is uncommon but valid.
As also mentioned in that message, more “fun” occurs if threads with different privileges access a common robust mutex. Because the node for the list of owned robust mutexes is in the mutex itself, a low-privilege thread may corrupt a high-privilege thread’s list. This could be exploited easily to make the high-privilege thread crash, and in rare cases it might allow the high-privilege thread’s memory to be corrupted. Apparently Linux’s robust mutexes are only designed for use by threads with the same privileges. This could have been avoided easily by making the robust list an array held entirely in the thread’s own memory instead of a linked list.