I have a single node instance of Apache Hadoop 1.1.1 with default parameter values

Asked: June 15, 2026 (2026-06-15T17:30:33+00:00)

I have a single-node instance of Apache Hadoop 1.1.1 with default parameter values (see e.g. [1] and [2]) on a machine with a lot of RAM and very limited free disk space. I notice that this Hadoop instance uses a lot of disk space during map tasks. Which configuration parameters should I pay attention to in order to take advantage of the large amount of RAM and decrease disk space usage?

1 Answer

Editorial Team · Answer 1 · 2026-06-15T17:30:34+00:00

You can use several of the mapred.* parameters to compress map output, which greatly reduces the disk space needed to store it. See this question for some good pointers.

Note that different compression codecs come with different trade-offs (e.g. gzip needs more CPU than LZO, but you have to install LZO yourself). This page has a good discussion of compression issues in Hadoop, although it is a bit dated.
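As a rough sketch of what enabling map output compression could look like in Hadoop 1.x, the fragment below sets the two relevant mapred.* properties in conf/mapred-site.xml. The choice of GzipCodec here is only an assumption for illustration (it ships with Hadoop, so nothing extra needs installing); a lighter codec may suit you better if CPU becomes the bottleneck.

```xml
<!-- conf/mapred-site.xml (illustrative fragment, not a complete file) -->
<configuration>
  <!-- Compress intermediate map output before it is written to local disk -->
  <property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
  </property>
  <!-- Codec is an assumption for this sketch: GzipCodec is bundled with Hadoop,
       whereas an LZO codec has to be installed separately -->
  <property>
    <name>mapred.map.output.compression.codec</name>
    <value>org.apache.hadoop.io.compress.GzipCodec</value>
  </property>
</configuration>
```

The same properties can also be set per job instead of globally, which is useful if you want to compare disk usage with and without compression.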

The amount of RAM you need depends on what your map-reduce jobs are doing, but you can increase the heap size of the map task JVMs by setting the mapred.map.child.java.opts property in conf/mapred-site.xml.

See cluster setup for more details on this.
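For illustration, a heap-size setting in conf/mapred-site.xml might look like the fragment below. The -Xmx1024m value is a made-up example, not a recommendation; pick a value based on how much RAM the machine has and how many map slots run concurrently.

```xml
<!-- conf/mapred-site.xml (illustrative fragment, not a complete file) -->
<configuration>
  <!-- JVM options passed to each map task; the 1024m heap is an example value only -->
  <property>
    <name>mapred.map.child.java.opts</name>
    <value>-Xmx1024m</value>
  </property>
</configuration>
```

Keep in mind that this heap is allocated per task JVM, so the total memory used is roughly this value multiplied by the number of tasks running at the same time.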
