I need to setup a hadoop/hdfs cluster with one namenode and two datanodes. I am aware of conf/slaves file which lists the machines datanodes are running. But how can I specify where hadoop/hdfs is locally installed on slave node? Also the user account to start hdfs there?
Edit: in log files, I find following error, when I tried to start-dfs.sh
ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.IllegalArgumentException: Does not contain a valid host:port authority: file:///
The user is expected to be the same as on the master node. The location of the actual data can be modified by changing the
dfs.data.dirnode inhadoop-site.xml.