The Archive Base

Editorial Team
Asked: June 1, 2026 (2026-06-01T06:27:05+00:00)

I have used ElasticMapReduce for some time. It is quite convenient but I can’t


I have used ElasticMapReduce for some time. It is quite convenient, but I can't run HBase on it because the Hadoop cluster is only temporarily available (I have asked a somewhat related question under HBase and Hadoop).

So I want to try installing Hadoop on a set of EC2 machines. I know Hadoop has an EC2-related directory, src/contrib/ec2. It looks like a Hadoop cluster can be launched simply by typing a command, after which I can log into the master node to run jobs and so on. Before trying this, I would like to hear about any gotchas from people who have been using it. Thanks!

1 Answer

  1. Editorial Team
    Added an answer on June 1, 2026 at 6:27 am (2026-06-01T06:27:06+00:00)

    There are indeed two ways to run Hadoop on Amazon: provisioning your own cluster or using EMR. Orthogonal to this decision, you can use either HDFS or S3 as your file system.
    It is not a short story, but I will try to highlight some pros and cons of these choices.
    You can use EMR if you need to run one or a few jobs a day and do not need a Hadoop cluster all the time. In this case you put your data into S3 and can fully script the process, and you also save the time of installing the cluster. The main disadvantage is that it is not easy to customize, use third-party libraries, and so on.
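As a rough illustration of what "fully script the process" might look like, the sketch below only builds a request dictionary for a transient cluster that runs one streaming step and then shuts down. It assumes the field names of the boto3 EMR `run_job_flow` call; the helper name, bucket paths, release label, instance type, and mapper/reducer commands are all hypothetical placeholders, not something taken from the question.

```python
def build_transient_job_flow(name, input_uri, output_uri, mapper, reducer,
                             release_label, instance_type, instance_count):
    """Describe a cluster that runs one streaming step and then terminates.

    Hypothetical helper: it only assembles a dict and never contacts AWS.
    """
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        "Instances": {
            "MasterInstanceType": instance_type,
            "SlaveInstanceType": instance_type,
            "InstanceCount": instance_count,
            # False means the cluster shuts down once all steps are done,
            # so nothing keeps running (or billing) between jobs.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [{
            "Name": "streaming-step",
            "ActionOnFailure": "TERMINATE_CLUSTER",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                # Input and output both live in S3, not in cluster-local HDFS.
                "Args": ["hadoop-streaming",
                         "-input", input_uri,
                         "-output", output_uri,
                         "-mapper", mapper,
                         "-reducer", reducer],
            },
        }],
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


# All values below are placeholders chosen for illustration only.
request = build_transient_job_flow(
    name="nightly-job",
    input_uri="s3://example-bucket/input/",
    output_uri="s3://example-bucket/output/",
    mapper="cat",
    reducer="wc",
    release_label="emr-6.15.0",
    instance_type="m5.xlarge",
    instance_count=3,
)
```

Under those assumptions, the dictionary could be passed to a boto3 EMR client as `client.run_job_flow(**request)` from a cron job or similar scheduler.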
    If you want to tweak Hadoop, you should install your own cluster.
    When your data is already in S3, or you need to store it there after processing, S3 is a good choice. At the same time, you will probably get lower performance than with HDFS. It also has to be said that Amazon instances have very little local storage, so HDFS gets really expensive: you have to keep the cluster running (and pay for it) just to preserve that storage.
    I would say that if you really need HDFS with all its throughput, you need your own cluster on your own hardware. When you are working on Amazon, it is most practical to use S3 as your file system.
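To make the "keep the cluster running just to preserve this storage" point concrete, here is a back-of-the-envelope sketch comparing an always-on HDFS cluster with a transient one that only exists while jobs run. The node count, hourly rate, and job duration are made-up numbers, not real AWS prices, and S3 storage and transfer charges are deliberately ignored.

```python
def monthly_compute_cents(nodes, hourly_rate_cents, hours_per_day, days=30):
    """Compute cost in cents for `nodes` instances running `hours_per_day`."""
    return nodes * hourly_rate_cents * hours_per_day * days


# Hypothetical figures for illustration only.
NODES = 4
RATE_CENTS = 10          # assumed price per instance-hour, in cents
JOB_HOURS_PER_DAY = 2    # assumed time the daily jobs actually need

# HDFS on instance storage: the cluster must stay up around the clock,
# otherwise the data stored on the nodes is lost.
always_on = monthly_compute_cents(NODES, RATE_CENTS, hours_per_day=24)

# Data kept in S3: the cluster only exists while the jobs run.
transient = monthly_compute_cents(NODES, RATE_CENTS, hours_per_day=JOB_HOURS_PER_DAY)

print(f"always-on: {always_on / 100:.2f}, transient: {transient / 100:.2f}")
```

With these assumed numbers the always-on cluster costs twelve times as much in compute, which is simply the ratio of 24 hours to 2 hours; the real trade-off also depends on how much HDFS throughput the jobs need.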
