The Hadoop HDFS Balancer can balance the usage % across DataNodes. However, I couldn't find a way to make it balance the remaining % (i.e. the free space %) instead.
For example, say I have 3 nodes: 2 dedicated to HDFS and 1 whose disk is shared with other, non-HDFS files. The balancer will even out the HDFS usage % across all 3 nodes. As a result, the shared node will always run out of disk space first, because it also holds those other files.
What I want is to balance the remaining disk space %.
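To make the difference between the two metrics concrete, here is a minimal Python sketch of the 3-node example. It is only an illustration: the `Node` class, the helper `targets_for_equal_remaining`, and all capacities and usage figures are made up, and the model is simplified (capacity is treated as raw disk size, and remaining space is capacity minus DFS and non-DFS usage). It is not how the balancer itself is implemented.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical, simplified view of one DataNode's disk (sizes in GB)."""
    name: str
    capacity_gb: float
    dfs_used_gb: float
    non_dfs_used_gb: float

    @property
    def dfs_used_pct(self) -> float:
        # The kind of metric the balancer evens out: HDFS data vs. capacity.
        return 100.0 * self.dfs_used_gb / self.capacity_gb

    @property
    def remaining_pct(self) -> float:
        # Free space left after both HDFS and non-HDFS files are counted.
        free = self.capacity_gb - self.dfs_used_gb - self.non_dfs_used_gb
        return 100.0 * free / self.capacity_gb


def targets_for_equal_remaining(nodes):
    """Return the DFS usage (GB) per node that would give every node
    the same remaining %, keeping the total amount of HDFS data unchanged."""
    total_capacity = sum(n.capacity_gb for n in nodes)
    total_free = sum(
        n.capacity_gb - n.dfs_used_gb - n.non_dfs_used_gb for n in nodes
    )
    remaining_fraction = total_free / total_capacity
    return {
        n.name: n.capacity_gb - n.non_dfs_used_gb
        - n.capacity_gb * remaining_fraction
        for n in nodes
    }


# Made-up state after the balancer has equalised the usage %.
nodes = [
    Node("dedicated-1", 1000, 600, 0),
    Node("dedicated-2", 1000, 600, 0),
    Node("shared", 1000, 600, 300),  # 300 GB of other files on the same disk
]

for n in nodes:
    print(f"{n.name}: used {n.dfs_used_pct:.0f}%, remaining {n.remaining_pct:.0f}%")

print(targets_for_equal_remaining(nodes))
```

In this made-up state every node shows the same usage %, yet the shared node has far less free space than the dedicated ones. Balancing on remaining % would instead move HDFS blocks off the shared node until all three have the same free space %.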
This wiki page explains how to reserve space for non-DFS usage on each DataNode:
http://wiki.apache.org/hadoop/DiskSetup