What is the difference between setting mapred.job.map.memory.mb and setting -Xmx in mapred.child.java.opts to control the maximum memory used by a Mapper or Reducer task? Which one takes precedence?
-Xmx specifies the maximum heap space of the allocated JVM. This is the space reserved for object allocation that is managed by the garbage collector. mapred.job.map.memory.mb, on the other hand, specifies the maximum virtual memory allowed for a Hadoop task subprocess. If you exceed the max heap size, the JVM throws an OutOfMemoryError.

The JVM may use more memory than the max heap size because it also needs space for class definitions (permgen space) and thread stacks. If the process uses more virtual memory than mapred.job.map.memory.mb, it is killed by Hadoop.

So neither takes precedence over the other (they measure different aspects of memory usage): -Xmx is a parameter to the JVM, while mapred.job.map.memory.mb is a hard upper bound on the virtual memory a task attempt can use, enforced by Hadoop.

Hope this is helpful; memory is complicated! I'm presently confused about why my JVM processes use several multiples of the max heap size in virtual memory, as described in my SO post.
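To make the relationship between the two settings concrete, here is a minimal sketch of a sanity check you could run on your own values. MemorySettingsCheck, parseXmxMb and headroomMb are hypothetical names made up for this illustration; they are not part of Hadoop, and the values in main are example numbers only. The sketch pulls the -Xmx value out of a mapred.child.java.opts-style string and subtracts it from the mapred.job.map.memory.mb limit, which gives the virtual memory left over for everything the JVM needs outside the heap.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Hadoop class): compares the JVM heap cap (-Xmx)
// with the per-task virtual memory cap (mapred.job.map.memory.mb).
final class MemorySettingsCheck {

    // Matches -Xmx<number><optional unit>, e.g. -Xmx512m, -Xmx2g, -Xmx1048576k.
    private static final Pattern XMX = Pattern.compile("-Xmx(\\d+)([kKmMgG]?)");

    /** Heap limit in MB taken from the last -Xmx flag, or empty if there is none. */
    static OptionalLong parseXmxMb(String javaOpts) {
        Matcher m = XMX.matcher(javaOpts);
        OptionalLong result = OptionalLong.empty();
        while (m.find()) { // keep the last occurrence, which is the one the JVM honors
            long value = Long.parseLong(m.group(1));
            switch (m.group(2).toLowerCase()) {
                case "g": result = OptionalLong.of(value * 1024); break;
                case "m": result = OptionalLong.of(value); break;
                case "k": result = OptionalLong.of(value / 1024); break;
                default:  result = OptionalLong.of(value / (1024 * 1024)); // no suffix means bytes
            }
        }
        return result;
    }

    /** Virtual memory (MB) left for non-heap use; negative if the heap alone exceeds the task limit. */
    static long headroomMb(String javaOpts, long taskLimitMb) {
        long heapMb = parseXmxMb(javaOpts).orElseThrow(
                () -> new IllegalArgumentException("no -Xmx flag in: " + javaOpts));
        return taskLimitMb - heapMb;
    }

    public static void main(String[] args) {
        String childJavaOpts = "-Xmx512m"; // example value for mapred.child.java.opts
        long mapMemoryMb = 1024;           // example value for mapred.job.map.memory.mb
        System.out.println("Non-heap headroom: " + headroomMb(childJavaOpts, mapMemoryMb) + " MB");
    }
}
```

A zero or negative headroom means a task could be killed by Hadoop before the heap ever fills up. How much positive headroom is enough depends on the workload, so treat the result as a rough guide rather than a rule.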