We hit a strange issue on one of our customers' servers, where Java fails with “Too many open files”.
Checking the descriptors via lsof produces a large list of “sock” descriptors with “can’t identify protocol”.
I suspect it happens because sockets are being left open for too long, but as our thread dump contains a lot of threads, I have no clear idea which one exactly is the culprit.
Is there any good method to detect exactly which threads open these sockets?
Thanks.
Not the threads per se.
One approach is to run the application using a profiler. This could well find the problem even if you cannot exactly reproduce the customer’s problem. (@SyBer reports that the YourKit profiler has specific support for finding socket leaks … see comment.)
A second approach is to tweak your test platform by using ulimit to REDUCE the number of open files allowed. This may make it easier to reproduce the “too many files open” scenario in your test environment.
Finally, I’d recommend “grepping” your codebase to find all places where socket objects are created. Then examine them all to make sure they correctly use try / finally blocks to ensure that the sockets are always closed.
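For reference, here is a minimal sketch of the try / finally pattern to look for when reviewing each place a socket is created. The class name SocketCloseSketch and the method sendAndClose are made up for illustration, and the single-byte write just stands in for whatever your code really does with the socket.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical example class; the names are illustrative only.
public class SocketCloseSketch {

    // Opens a socket, does some work, and always closes the socket,
    // even if the work throws. The socket is returned only so that the
    // caller can check isClosed() in this demo.
    static Socket sendAndClose(InetAddress host, int port) throws IOException {
        Socket socket = new Socket(host, port);
        try {
            // Placeholder for the real work done with the socket.
            socket.getOutputStream().write(1);
        } finally {
            // Runs on both normal and exceptional exit, releasing the descriptor.
            socket.close();
        }
        return socket;
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Throwaway local listener on an ephemeral port so the demo has
        // something to connect to.
        ServerSocket server = new ServerSocket(0, 50, loopback);
        try {
            Socket used = sendAndClose(loopback, server.getLocalPort());
            System.out.println("closed after use: " + used.isClosed());
        } finally {
            server.close();
        }
    }
}
```

On Java 7 and later, a try-with-resources statement gives the same guarantee with less code. Any place where a socket is created without one of these two patterns around it is a candidate for the leak.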