I have an apparent memory leak in a hadoop program I’m running. Specifically I


Asked: June 15, 2026


I have an apparent memory leak in a Hadoop program I’m running. Specifically, I get the message:
ERROR GC overhead limit exceeded
followed later by the exception

attempt_201210041336_0765_m_0000000_1: Exception in thread "Thread for syncLogs" java.lang.OutOfMemoryError: GC overhead limit exceeded
attempt_201210041336_0765_m_0000000_1: at java.util.Vector.elements(Vector.java:292)
attempt_201210041336_0765_m_0000000_1: at org.apache.log4j.helpers.AppenderAttachableImpl.getAllAppenders(AppenderAttachableImpl.java:84)
attempt_201210041336_0765_m_0000000_1: at org.apache.log4j.Category.getAllAppenders(Category.java:415)
attempt_201210041336_0765_m_0000000_1: at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:256)
attempt_201210041336_0765_m_0000000_1: at org.apache.hadoop.mapred.Child$3.run(Child.java:157)

I’m running on what should be very small data sets in an initial trial, so I shouldn’t be hitting any memory limit. More to the point, I don’t want to change the Hadoop configuration; if the program can’t run with the current configuration, the program needs to be rewritten.

Can anyone help me figure out how to diagnose this issue? Is there a command-line argument to get a stack trace of memory usage? Is there any other way of tracking this issue down?

P.S. I typed the error message by hand because I can’t copy-paste from the system that has the issue. Please ignore any typos; they are my own fault.

Edit: an update. I ran the job a few more times; while I always get the
ERROR GC overhead limit exceeded
message, I don’t always get the stack trace for log4j. So the issue is probably not log4j; instead, log4j happened to fail because of the lack of memory caused by…something else?


1 Answer

Editorial Team · Answer 1 · 2026-06-15T11:54:05+00:00


“GC overhead limit exceeded” probably means that a lot of short-lived objects are being created, more than the GC can reclaim without spending more than 98% of the total time on collection. See this question on how to find the problematic classes and allocation spots with JProfiler.

Disclaimer: My company develops JProfiler.
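
If attaching a profiler to the task JVM is not practical, a rough first check can be done from inside the program with the standard `java.lang.management` beans. The sketch below is only an illustration: `GcProbe` and its method names are hypothetical helpers, not part of Hadoop or log4j, and the GC share it reports is cumulative since JVM start, so it approximates rather than reproduces the JVM's own overhead-limit check.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not part of Hadoop): reports heap use and cumulative GC time.
final class GcProbe {

    // Sum of the reported collection time over all collectors, in milliseconds.
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // Share of JVM uptime spent in GC since startup, capped at 1.0.
    public static double gcTimeFraction() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        if (uptime <= 0) {
            return 0.0;
        }
        return Math.min(1.0, (double) totalGcMillis() / uptime);
    }

    // One-line summary suitable for writing to the task log.
    public static String snapshot() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long mb = 1024L * 1024L;
        long max = heap.getMax(); // -1 when no maximum is defined
        String maxText = max < 0 ? "undefined" : (max / mb) + " MB";
        return String.format("heap used=%d MB, max=%s, gc time=%d ms (%.1f%% of uptime)",
                heap.getUsed() / mb, maxText, totalGcMillis(), gcTimeFraction() * 100.0);
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```

Logging `GcProbe.snapshot()` every few thousand records from the map method (an assumption about where the work happens) should show whether used heap climbs steadily, which would point to retained objects, or whether GC time dominates while the heap stays flat, which would point to allocation churn.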
