I’m looking for a profiler that can profile a Java 6 application running on a separate Linux box (with no window manager).
The application is a latency-sensitive, multithreaded server that typically responds to incoming network events in several hundred microseconds (less than 1 millisecond). I’m interested in learning about hot sections of code and contended locks; I’m less interested in memory usage patterns.
I’m not concerned about the profiling overhead during the profiling run; I expect there to be a performance hit.
YourKit is very good, except I would say that profilers in general are not very useful for examining sub-millisecond latency applications.
However, if you haven’t looked at your memory usage, then this is where I would start. How are you ensuring that you minimise object creation, cache misses, and context-switching overhead (from passing data between threads)? Is all your code warmed up, i.e. so that you are not hitting any interpreted code?
I suggest you timestamp the key execution path in your application with nanoTime(), recording the timing of each request at key stages to see where you are experiencing the most delay.
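As a rough illustration of that kind of timestamping, here is a minimal sketch. The class name StageTimer, its methods, and the stage names are all made up for this example; they are not from any library. It assumes one timer per worker thread, preallocated once and reused for each request so that marking a stage creates no objects.

```java
/**
 * Hypothetical helper (not a library class): records System.nanoTime()
 * at named stages of a single request. Not thread-safe; the assumption
 * is one instance per worker thread.
 */
final class StageTimer {
    private final String[] names;
    private final long[] stamps;
    private int count;

    StageTimer(int maxStages) {
        names = new String[maxStages];
        stamps = new long[maxStages];
    }

    /** Start timing a new request, reusing the preallocated arrays. */
    void reset() {
        count = 0;
    }

    /** Record the current time for a stage; extra marks beyond capacity are dropped. */
    void mark(String stage) {
        if (count < stamps.length) {
            names[count] = stage;
            stamps[count] = System.nanoTime();
            count++;
        }
    }

    int stageCount() {
        return count;
    }

    /** Nanoseconds between mark i-1 and mark i, valid for 1 <= i < stageCount(). */
    long elapsedNanos(int i) {
        return stamps[i] - stamps[i - 1];
    }

    /** Nanoseconds between the first and last mark, or 0 with fewer than two marks. */
    long totalNanos() {
        return count < 2 ? 0 : stamps[count - 1] - stamps[0];
    }

    /** Builds a per-stage breakdown; this allocates, so call it off the hot path. */
    String report() {
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i < count; i++) {
            sb.append(names[i - 1]).append(" -> ").append(names[i])
              .append(": ").append(elapsedNanos(i)).append(" ns\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        StageTimer t = new StageTimer(8);
        t.mark("received");   // illustrative stage names only
        t.mark("decoded");
        t.mark("handled");
        t.mark("sent");
        System.out.print(t.report());
    }
}
```

In a real server you would probably copy the per-stage deltas into a histogram or ring buffer rather than build strings, and only report on requests that exceed some latency threshold.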
BTW: It is possible to get a Java application with a sub-100-microsecond response time.
BTW2: It is possible to reduce the number of Full GCs during the day and have the system Full GC only at night. By increasing the eden size you might get to the point where you have no minor collections either.
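One way to check whether heap tuning is actually getting you there is to count collections around a workload using the standard java.lang.management beans. The sketch below is only an illustration: the class name GcCounter and the placeholder workload are invented for this example, and how the JVM names and splits its collectors varies by JVM and settings, so this just sums across all of them.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: totals GC activity reported by the JVM's collector beans. */
final class GcCounter {

    /** Total number of collections so far across all collectors (young and old). */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount(); // -1 means undefined for this collector
            if (c > 0) {
                total += c;
            }
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds across all collectors. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long ms = gc.getCollectionTime(); // -1 means undefined for this collector
            if (ms > 0) {
                total += ms;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long countBefore = totalCollections();
        long millisBefore = totalCollectionMillis();

        // Placeholder for the real workload being measured.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100000; i++) {
            sb.append(i);
        }

        System.out.println("collections during run: " + (totalCollections() - countBefore));
        System.out.println("GC time during run (ms): " + (totalCollectionMillis() - millisBefore));
        System.out.println("workload output length: " + sb.length());
    }
}
```

If the collection count stays flat over a trading day or a load test, the young generation is large enough for that workload; if it keeps climbing, either the eden size or the allocation rate still needs work.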