I have a Java program that runs on my Ubuntu 10.04 machine and, without any user interaction, repeatedly queries a MySQL database and then constructs img- and txt-files according to the data read from the DB. It makes tens of thousands of queries and creates tens of thousands of files.
After some hours of running, the available memory on my machine including swap space is totally used up. I haven’t started other programs and the processes running in the background don’t consume much memory and don’t really grow in consumption.
To find out what is allocating so much memory I wanted to analyse a heap dump, so I started the process with -Xms64m -Xmx128m -XX:+HeapDumpOnOutOfMemoryError.
To my surprise, the situation was the same as before: after some hours the program had allocated all of the swap, which is far beyond the given max of 128m.
Another run, monitored with VisualVM, showed that the heap allocation never exceeds the max of 128m; when the allocated memory approaches the max, a big part of it is released again (I assume by the garbage collector).
So it cannot be a problem of a steadily growing heap.
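The same observation can be made from inside the program, without VisualVM. The following is a minimal sketch, not part of my actual program: the class name `MemoryReport` and its helper methods are made up for illustration. It uses the standard `java.lang.management` API to print heap and non-heap usage as the JVM sees them. Note that -Xmx limits only the Java heap, and that the non-heap figure covers JVM-managed pools (such as the code cache and class metadata) only; native allocations made outside these pools do not show up here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper class, for illustration only.
class MemoryReport {

    /** Formats a byte count as whole mebibytes (rounded down). */
    static String toMiB(long bytes) {
        return (bytes / (1024 * 1024)) + " MiB";
    }

    /** Returns a one-line summary of heap and non-heap usage as reported by the JVM. */
    static String summary() {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = bean.getHeapMemoryUsage();
        MemoryUsage nonHeap = bean.getNonHeapMemoryUsage();
        return "heap used=" + toMiB(heap.getUsed())
                + " committed=" + toMiB(heap.getCommitted())
                + " | non-heap used=" + toMiB(nonHeap.getUsed())
                + " committed=" + toMiB(nonHeap.getCommitted());
    }

    public static void main(String[] args) {
        // In a long-running program this could be called once per task batch.
        System.out.println(summary());
    }
}
```

If these numbers stay flat while the memory used at OS level keeps growing, the growth is presumably happening outside the memory that the JVM accounts for in these pools.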
When the memory is all used up:
free shows the following:
             total       used       free     shared    buffers     cached
Mem:       2060180    2004860      55320          0        848    1042908
-/+ buffers/cache:     961104    1099076
Swap:      3227640    3227640          0
top shows the following:
USER      VIRT   RES    SHR    COMMAND
[my_id]   504m   171m   4520   java
[my_id]   371m   162m   4368   java
(by far the two “biggest” processes and the only java processes running)
My first question is:
- How can I find out on the OS level (e.g. with command line tools) what is allocating so much memory? top and htop haven’t helped me. In case of many, many tiny processes of the same type eating up the memory: is there a way to intelligently sum up similar processes? (I know that is probably off topic as it is a Linux/Ubuntu question, but my main problem may still be Java-related)
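To make the "sum up similar processes" idea concrete, here is a rough sketch of what I have in mind. It is an assumption about how this could be done, not a tool I have found: the class `RssByCommand` and its methods are hypothetical. It walks the numeric directories in /proc, reads the `Name` and `VmRSS` fields from each `status` file, and adds up the resident size per command name. Pages shared between processes are counted once per process, so the totals are an upper bound rather than an exact figure.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, for illustration only. Linux-specific.
class RssByCommand {

    /** Returns the value of a "Key:<whitespace>value" line in /proc/<pid>/status text, or null. */
    static String field(String statusText, String key) {
        for (String line : statusText.split("\n")) {
            if (line.startsWith(key + ":")) {
                return line.substring(key.length() + 1).trim();
            }
        }
        return null;
    }

    /** Adds one process's resident size (kB) to the running total for its command name. */
    static void accumulate(Map<String, Long> totals, String statusText) {
        String name = field(statusText, "Name");
        String rss = field(statusText, "VmRSS"); // e.g. "175104 kB"; absent for kernel threads
        if (name == null || rss == null) {
            return;
        }
        long kb = Long.parseLong(rss.split("\\s+")[0]);
        Long old = totals.get(name);
        totals.put(name, old == null ? kb : old + kb);
    }

    /** Reads a whole text file into a string. */
    static String readFile(File f) throws IOException {
        StringBuilder sb = new StringBuilder();
        BufferedReader in = new BufferedReader(new FileReader(f));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } finally {
            in.close();
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Long> totals = new TreeMap<String, Long>();
        File[] entries = new File("/proc").listFiles();
        if (entries == null) {
            System.err.println("/proc is not readable on this system");
            return;
        }
        for (File entry : entries) {
            if (!entry.getName().matches("\\d+")) {
                continue; // only numeric directory names are process IDs
            }
            try {
                accumulate(totals, readFile(new File(entry, "status")));
            } catch (IOException e) {
                // the process exited between listing and reading; skip it
            }
        }
        // Output is ordered by command name; pipe through "sort -n" to order by size.
        for (Map.Entry<String, Long> e : totals.entrySet()) {
            System.out.println(e.getValue() + " kB\t" + e.getKey());
        }
    }
}
```

Run as `java RssByCommand | sort -n`, the largest consumers would end up at the bottom of the list. This only covers resident memory, not what has already been swapped out.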
My old questions were:
- Why isn’t the memory consumption of my program given in the top output?
- How can I find out what is allocating so much memory?
- If the heap isn’t the problem, is the only “allocating factor” the stack? (The stack shouldn’t be a problem as there is no deep “method call depth”.)
- What about external resources such as DB connections?
As there was no activity after the day I asked the question (until March 23), and as I still couldn’t find the cause of the memory consumption, I “solved” the problem pragmatically.
The program causing the problem is basically a repetition of a “task” (i.e. querying a DB and then creating files). It is relatively easy to parameterize the program so that a certain subset of tasks is executed and not all of them.
So now I repeatedly run my program from a shell script, with each process executing only a set of tasks (parameterized through arguments). In the end, all tasks are executed, but since each process handles only a subset of the tasks, there are no memory issues any more.
For me that is a sufficient solution. If you have a similar problem and your program has a batch-like execution structure this may be a pragmatic approach.
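To illustrate the parameterization, here is a minimal sketch. It is not my actual code: the class `BatchRunner`, the argument layout `<from> <to> <totalTasks>`, and the placeholder `runTask` are all assumptions made for the example. Each invocation runs only the tasks whose index falls in the given half-open range.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical entry point, for illustration only.
class BatchRunner {

    /** Returns the task indices in [from, to), clamped to the range [0, totalTasks]. */
    static List<Integer> select(int from, int to, int totalTasks) {
        List<Integer> selected = new ArrayList<Integer>();
        int start = Math.max(0, from);
        int end = Math.min(to, totalTasks);
        for (int i = start; i < end; i++) {
            selected.add(i);
        }
        return selected;
    }

    /** Placeholder for one unit of work (query the DB, then write the files). */
    static void runTask(int index) {
        System.out.println("running task " + index);
    }

    public static void main(String[] args) {
        if (args.length != 3) {
            System.err.println("usage: java BatchRunner <from> <to> <totalTasks>");
            System.exit(1);
        }
        int from = Integer.parseInt(args[0]);
        int to = Integer.parseInt(args[1]);
        int total = Integer.parseInt(args[2]);
        for (int index : select(from, to, total)) {
            runTask(index);
        }
    }
}
```

The shell script then calls, for example, `java BatchRunner 0 1000 50000`, then `java BatchRunner 1000 2000 50000`, and so on (the numbers are made up). Because every batch runs in a fresh JVM process, whatever memory a batch accumulated is returned to the OS when that process exits.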
When I find the time, I will look into the new suggestions and hopefully identify the root cause (thanks for the help!).