I have created an AMI image and installed Hadoop from the Cloudera CDH2 build.

Question

0

Asked: May 18, 20262026-05-18T01:09:55+00:00 2026-05-18T01:09:55+00:00

I have created an AMI image and installed Hadoop from the Cloudera CDH2 build.

0

I have created an AMI image and installed Hadoop from the Cloudera CDH2 build. I configured my core-site.xml as so:

<property>
   <name>fs.default.name</name>
   <value>s3://<BUCKET NAME>/</value>
 </property>
 <property>
    <name>fs.s3.awsAccessKeyId</name>
    <value><ACCESS ID></value>
 </property>
 <property>
    <name>fs.s3.awsSecretAccessKey</name>
    <value><SECRET KEY></value>
 </property>
 <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/lib/hadoop-0.20/cache/${user.name}</value>
 </property>

But I get the following error message when I start up the hadoop daemons in the namenode log:

2010-11-03 23:45:21,680 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode:      java.lang.IllegalArgumentException: Invalid URI for NameNode address (check     fs.default.name): s3://<BUCKET NAME>/ is not of scheme 'hdfs'.
    at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:177)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:198)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:306)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1006)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1015)

2010-11-03 23:45:21,691 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:

However, I am able to execute hadoop commands from the command line like so:

hadoop fs -put sun-javadb-common-10.5.3-0.2.i386.rpm s3://<BUCKET NAME>/

 hadoop fs -ls s3://poc-jwt-ci/
Found 3 items
drwxrwxrwx   -          0 1970-01-01 00:00 /
-rwxrwxrwx   1      16307 1970-01-01 00:00 /sun-javadb-common-10.5.3-0.2.i386.rpm
drwxrwxrwx   -          0 1970-01-01 00:00 /var

You will notice there is a / and a /var folders in the bucket. I ran the hadoop namenode -format when I first saw this error, then restarted all services, but still receive the weird Invalid URI for NameNode address (check fs.default.name): s3://<BUCKET NAME>/ is not of scheme 'hdfs'.

I also notice that the file system created looks like this:

 hadoop fs -ls s3://<BUCKET NAME>/var/lib/hadoop-0.20/cache/hadoop/mapred/system
Found 1 items
-rwxrwxrwx   1          4 1970-01-01 00:00 /var/lib/hadoop0.20/cache/hadoop/mapred/system/jobtracker.info

Any ideas of what’s going on?

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-18T01:09:56+00:00

First I suggest you just use Amazon Elastic MapReduce. There is zero configuration required on your end. EMR also has a few internal optimizations and monitoring that works in your benefit.

Second, do not use s3: as your default FS. First, s3 is too slow to be used to store intermediate data between jobs (a typical unit of work in hadoop is a dozen to dozens of MR jobs). it also stores the data in a ‘proprietary’ format (blocks etc). So external apps can’t effectively touch the data in s3.

Note that s3: in EMR is not the same s3: in the standard hadoop distro. The amazon guys actually alias s3: as s3n: (s3n: is just raw/native s3 access).

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I have created an AMI image and installed Hadoop from the Cloudera CDH2 build.

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply