Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 7190533
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 28, 20262026-05-28T19:32:11+00:00 2026-05-28T19:32:11+00:00

I installed hadoop 1.0.0 and tried out word counting example (single node cluster). It

  • 0

I installed hadoop 1.0.0 and tried out word counting example (single node cluster). It took 2m 48secs to complete. Then I tried standard linux word count program, which run in 10 milliseconds on the same set (180 kB data). Am I doing something wrong, or is Hadoop very very slow?

time hadoop jar /usr/share/hadoop/hadoop*examples*.jar wordcount someinput someoutput
12/01/29 23:04:41 INFO input.FileInputFormat: Total input paths to process : 30
12/01/29 23:04:41 INFO mapred.JobClient: Running job: job_201201292302_0001
12/01/29 23:04:42 INFO mapred.JobClient:  map 0% reduce 0%
12/01/29 23:05:05 INFO mapred.JobClient:  map 6% reduce 0%
12/01/29 23:05:15 INFO mapred.JobClient:  map 13% reduce 0%
12/01/29 23:05:25 INFO mapred.JobClient:  map 16% reduce 0%
12/01/29 23:05:27 INFO mapred.JobClient:  map 20% reduce 0%
12/01/29 23:05:28 INFO mapred.JobClient:  map 20% reduce 4%
12/01/29 23:05:34 INFO mapred.JobClient:  map 20% reduce 5%
12/01/29 23:05:35 INFO mapred.JobClient:  map 23% reduce 5%
12/01/29 23:05:36 INFO mapred.JobClient:  map 26% reduce 5%
12/01/29 23:05:41 INFO mapred.JobClient:  map 26% reduce 8%
12/01/29 23:05:44 INFO mapred.JobClient:  map 33% reduce 8%
12/01/29 23:05:53 INFO mapred.JobClient:  map 36% reduce 11%
12/01/29 23:05:54 INFO mapred.JobClient:  map 40% reduce 11%
12/01/29 23:05:56 INFO mapred.JobClient:  map 40% reduce 12%
12/01/29 23:06:01 INFO mapred.JobClient:  map 43% reduce 12%
12/01/29 23:06:02 INFO mapred.JobClient:  map 46% reduce 12%
12/01/29 23:06:06 INFO mapred.JobClient:  map 46% reduce 14%
12/01/29 23:06:09 INFO mapred.JobClient:  map 46% reduce 15%
12/01/29 23:06:11 INFO mapred.JobClient:  map 50% reduce 15%
12/01/29 23:06:12 INFO mapred.JobClient:  map 53% reduce 15%
12/01/29 23:06:20 INFO mapred.JobClient:  map 56% reduce 15%
12/01/29 23:06:21 INFO mapred.JobClient:  map 60% reduce 17%
12/01/29 23:06:28 INFO mapred.JobClient:  map 63% reduce 17%
12/01/29 23:06:29 INFO mapred.JobClient:  map 66% reduce 17%
12/01/29 23:06:30 INFO mapred.JobClient:  map 66% reduce 20%
12/01/29 23:06:36 INFO mapred.JobClient:  map 70% reduce 22%
12/01/29 23:06:37 INFO mapred.JobClient:  map 73% reduce 22%
12/01/29 23:06:45 INFO mapred.JobClient:  map 80% reduce 24%
12/01/29 23:06:51 INFO mapred.JobClient:  map 80% reduce 25%
12/01/29 23:06:54 INFO mapred.JobClient:  map 86% reduce 25%
12/01/29 23:06:55 INFO mapred.JobClient:  map 86% reduce 26%
12/01/29 23:07:02 INFO mapred.JobClient:  map 90% reduce 26%
12/01/29 23:07:03 INFO mapred.JobClient:  map 93% reduce 26%
12/01/29 23:07:07 INFO mapred.JobClient:  map 93% reduce 30%
12/01/29 23:07:09 INFO mapred.JobClient:  map 96% reduce 30%
12/01/29 23:07:10 INFO mapred.JobClient:  map 96% reduce 31%
12/01/29 23:07:12 INFO mapred.JobClient:  map 100% reduce 31%
12/01/29 23:07:22 INFO mapred.JobClient:  map 100% reduce 100%
12/01/29 23:07:28 INFO mapred.JobClient: Job complete: job_201201292302_0001
12/01/29 23:07:28 INFO mapred.JobClient: Counters: 29
12/01/29 23:07:28 INFO mapred.JobClient:   Job Counters 
12/01/29 23:07:28 INFO mapred.JobClient:     Launched reduce tasks=1
12/01/29 23:07:28 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=275346
12/01/29 23:07:28 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/01/29 23:07:28 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/01/29 23:07:28 INFO mapred.JobClient:     Launched map tasks=30
12/01/29 23:07:28 INFO mapred.JobClient:     Data-local map tasks=30
12/01/29 23:07:28 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=137186
12/01/29 23:07:28 INFO mapred.JobClient:   File Output Format Counters 
12/01/29 23:07:28 INFO mapred.JobClient:     Bytes Written=26287
12/01/29 23:07:28 INFO mapred.JobClient:   FileSystemCounters
12/01/29 23:07:28 INFO mapred.JobClient:     FILE_BYTES_READ=71510
12/01/29 23:07:28 INFO mapred.JobClient:     HDFS_BYTES_READ=89916
12/01/29 23:07:28 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=956282
12/01/29 23:07:28 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=26287
12/01/29 23:07:28 INFO mapred.JobClient:   File Input Format Counters 
12/01/29 23:07:28 INFO mapred.JobClient:     Bytes Read=85860
12/01/29 23:07:28 INFO mapred.JobClient:   Map-Reduce Framework
12/01/29 23:07:28 INFO mapred.JobClient:     Map output materialized bytes=71684
12/01/29 23:07:28 INFO mapred.JobClient:     Map input records=2574
12/01/29 23:07:28 INFO mapred.JobClient:     Reduce shuffle bytes=71684
12/01/29 23:07:28 INFO mapred.JobClient:     Spilled Records=6696
12/01/29 23:07:28 INFO mapred.JobClient:     Map output bytes=118288
12/01/29 23:07:28 INFO mapred.JobClient:     CPU time spent (ms)=39330
12/01/29 23:07:28 INFO mapred.JobClient:     Total committed heap usage (bytes)=5029167104
12/01/29 23:07:28 INFO mapred.JobClient:     Combine input records=8233
12/01/29 23:07:28 INFO mapred.JobClient:     SPLIT_RAW_BYTES=4056
12/01/29 23:07:28 INFO mapred.JobClient:     Reduce input records=3348
12/01/29 23:07:28 INFO mapred.JobClient:     Reduce input groups=1265
12/01/29 23:07:28 INFO mapred.JobClient:     Combine output records=3348
12/01/29 23:07:28 INFO mapred.JobClient:     Physical memory (bytes) snapshot=4936278016
12/01/29 23:07:28 INFO mapred.JobClient:     Reduce output records=1265
12/01/29 23:07:28 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=26102546432
12/01/29 23:07:28 INFO mapred.JobClient:     Map output records=8233

real    2m48.886s
user    0m3.300s
sys 0m0.304s


time wc someinput/*
  178  1001  8674 someinput/capacity-scheduler.xml
  178  1001  8674 someinput/capacity-scheduler.xml.bak
    7     7   196 someinput/commons-logging.properties
    7     7   196 someinput/commons-logging.properties.bak
   24    35   535 someinput/configuration.xsl
   80   122  1968 someinput/core-site.xml
   80   122  1972 someinput/core-site.xml.bak
    1     0     1 someinput/dfs.exclude
    1     0     1 someinput/dfs.include
   12    36   327 someinput/fair-scheduler.xml
   45   192  2141 someinput/hadoop-env.sh
   45   192  2139 someinput/hadoop-env.sh.bak
   20   137   910 someinput/hadoop-metrics2.properties
   20   137   910 someinput/hadoop-metrics2.properties.bak
  118   582  4653 someinput/hadoop-policy.xml
  118   582  4653 someinput/hadoop-policy.xml.bak
  241   623  6616 someinput/hdfs-site.xml
  241   623  6630 someinput/hdfs-site.xml.bak
  171   417  6177 someinput/log4j.properties
  171   417  6177 someinput/log4j.properties.bak
    1     0     1 someinput/mapred.exclude
    1     0     1 someinput/mapred.include
   12    15   298 someinput/mapred-queue-acls.xml
   12    15   298 someinput/mapred-queue-acls.xml.bak
  338   897  9616 someinput/mapred-site.xml
  338   897  9630 someinput/mapred-site.xml.bak
    1     1    10 someinput/masters
    1     1    18 someinput/slaves
   57    89  1243 someinput/ssl-client.xml.example
   55    85  1195 someinput/ssl-server.xml.example
 2574  8233 85860 total

real    0m0.009s
user    0m0.004s
sys 0m0.000s
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-28T19:32:12+00:00Added an answer on May 28, 2026 at 7:32 pm

    This depends on a large number of factors, including your configuration, your machine, memory config, JVM settings, etc. You also need to subtract JVM startup time.

    It runs much more quickly for me. That said, of course it will be slower on small data sets than a dedicated C program–consider what it’s doing “behind the scenes”.

    Try it on a terabyte of data spread across a few thousand files and see what happens.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have sucessfully installed Hypertable on top of Hadoop on a small cluster of
I have setup a VM with ubuntu. It runs hadoop as a single node.
I installed PyScript to try it out but it just wont start. It only
I have Installed hadoop and hbase cdh3u2. In hadoop i have a file at
Has anyone tried to install HUE on Apache Hadoop? We are using hadoop 0.20.2
Installed: SharePoint Server 2010 for Internet Enterprise Beta (x64) On: Windows Server 2008 Standard
I installed the Maven plugin for Eclipse , and then I got an error
Installed Ubuntu 11.10, removed Ruby1.8.7 completely, and then installed Ruby1.9.2-p290 from source, followed by
I have created an AMI image and installed Hadoop from the Cloudera CDH2 build.
I installed WSS Infrastructure Update and MOSS Infrastructure Update ( http://technet.microsoft.com/en-us/office/sharepointserver/bb735839.aspx ) and now

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.