The Archive Base

Editorial Team
Asked: June 9, 2026 (2026-06-09T03:43:24+00:00)

I’m running Apache Hadoop and using the grep example that ships with the installation. Why do the map and reduce percentages appear twice in the output? I thought the job only had to run once, which makes me doubt my understanding of MapReduce. I looked it up (http://grokbase.com/t/gg/mongodb-user/125ay1eazq/map-reduce-percentage-seems-running-twice), but that thread is about MongoDB and doesn’t really give an explanation.

hduser@ubse1:/usr/local/hadoop$ bin/hadoop jar hadoop*examples*.jar grep /user/hduser/grep /user/hduser/grep-output4 ".*woe is me.*"

I’m running this on a Project Gutenberg .txt file. The output file is correct.

Here is the output for running the command if needed:

12/08/06 06:56:57 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/08/06 06:56:57 WARN snappy.LoadSnappy: Snappy native library not loaded
12/08/06 06:56:57 INFO mapred.FileInputFormat: Total input paths to process : 1
12/08/06 06:56:58 INFO mapred.JobClient: Running job: job_201208030925_0011
12/08/06 06:56:59 INFO mapred.JobClient:  map 0% reduce 0%
12/08/06 06:57:18 INFO mapred.JobClient:  map 100% reduce 0%
12/08/06 06:57:30 INFO mapred.JobClient:  map 100% reduce 100%
12/08/06 06:57:35 INFO mapred.JobClient: Job complete: job_201208030925_0011
12/08/06 06:57:35 INFO mapred.JobClient: Counters: 30
12/08/06 06:57:35 INFO mapred.JobClient:   Job Counters 
12/08/06 06:57:35 INFO mapred.JobClient:     Launched reduce tasks=1
12/08/06 06:57:35 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=31034
12/08/06 06:57:35 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/08/06 06:57:35 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/08/06 06:57:35 INFO mapred.JobClient:     Rack-local map tasks=2
12/08/06 06:57:35 INFO mapred.JobClient:     Launched map tasks=2
12/08/06 06:57:35 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=11233
12/08/06 06:57:35 INFO mapred.JobClient:   File Input Format Counters 
12/08/06 06:57:35 INFO mapred.JobClient:     Bytes Read=5592666
12/08/06 06:57:35 INFO mapred.JobClient:   File Output Format Counters 
12/08/06 06:57:35 INFO mapred.JobClient:     Bytes Written=391
12/08/06 06:57:35 INFO mapred.JobClient:   FileSystemCounters
12/08/06 06:57:35 INFO mapred.JobClient:     FILE_BYTES_READ=281
12/08/06 06:57:35 INFO mapred.JobClient:     HDFS_BYTES_READ=5592862
12/08/06 06:57:35 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=65331
12/08/06 06:57:35 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=391
12/08/06 06:57:35 INFO mapred.JobClient:   Map-Reduce Framework
12/08/06 06:57:35 INFO mapred.JobClient:     Map output materialized bytes=287
12/08/06 06:57:35 INFO mapred.JobClient:     Map input records=124796
12/08/06 06:57:35 INFO mapred.JobClient:     Reduce shuffle bytes=287
12/08/06 06:57:35 INFO mapred.JobClient:     Spilled Records=10
12/08/06 06:57:35 INFO mapred.JobClient:     Map output bytes=265
12/08/06 06:57:35 INFO mapred.JobClient:     Total committed heap usage (bytes)=336404480
12/08/06 06:57:35 INFO mapred.JobClient:     CPU time spent (ms)=7040
12/08/06 06:57:35 INFO mapred.JobClient:     Map input bytes=5590193
12/08/06 06:57:35 INFO mapred.JobClient:     SPLIT_RAW_BYTES=196
12/08/06 06:57:35 INFO mapred.JobClient:     Combine input records=5
12/08/06 06:57:35 INFO mapred.JobClient:     Reduce input records=5
12/08/06 06:57:35 INFO mapred.JobClient:     Reduce input groups=5
12/08/06 06:57:35 INFO mapred.JobClient:     Combine output records=5
12/08/06 06:57:35 INFO mapred.JobClient:     Physical memory (bytes) snapshot=464568320
12/08/06 06:57:35 INFO mapred.JobClient:     Reduce output records=5
12/08/06 06:57:35 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=1539559424
12/08/06 06:57:35 INFO mapred.JobClient:     Map output records=5
12/08/06 06:57:35 INFO mapred.FileInputFormat: Total input paths to process : 1
12/08/06 06:57:35 INFO mapred.JobClient: Running job: job_201208030925_0012
12/08/06 06:57:36 INFO mapred.JobClient:  map 0% reduce 0%
12/08/06 06:57:50 INFO mapred.JobClient:  map 100% reduce 0%
12/08/06 06:58:05 INFO mapred.JobClient:  map 100% reduce 100%
12/08/06 06:58:10 INFO mapred.JobClient: Job complete: job_201208030925_0012
12/08/06 06:58:10 INFO mapred.JobClient: Counters: 30
12/08/06 06:58:10 INFO mapred.JobClient:   Job Counters 
12/08/06 06:58:10 INFO mapred.JobClient:     Launched reduce tasks=1
12/08/06 06:58:10 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=15432
12/08/06 06:58:10 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/08/06 06:58:10 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/08/06 06:58:10 INFO mapred.JobClient:     Rack-local map tasks=1
12/08/06 06:58:10 INFO mapred.JobClient:     Launched map tasks=1
12/08/06 06:58:10 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=14264
12/08/06 06:58:10 INFO mapred.JobClient:   File Input Format Counters 
12/08/06 06:58:10 INFO mapred.JobClient:     Bytes Read=391
12/08/06 06:58:10 INFO mapred.JobClient:   File Output Format Counters 
12/08/06 06:58:10 INFO mapred.JobClient:     Bytes Written=235
12/08/06 06:58:10 INFO mapred.JobClient:   FileSystemCounters
12/08/06 06:58:10 INFO mapred.JobClient:     FILE_BYTES_READ=281
12/08/06 06:58:10 INFO mapred.JobClient:     HDFS_BYTES_READ=505
12/08/06 06:58:10 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=42985
12/08/06 06:58:10 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=235
12/08/06 06:58:10 INFO mapred.JobClient:   Map-Reduce Framework
12/08/06 06:58:10 INFO mapred.JobClient:     Map output materialized bytes=281
12/08/06 06:58:10 INFO mapred.JobClient:     Map input records=5
12/08/06 06:58:10 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/08/06 06:58:10 INFO mapred.JobClient:     Spilled Records=10

EDIT: Driver class for Grep (Grep.java):

/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements.  See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership.  The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License.  You may obtain a copy of the License at
*
*     http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.examples;

import java.util.Random;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.mapred.lib.*;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

/* Extracts matching regexs from input files and counts them. */
public class Grep extends Configured implements Tool {
  private Grep() {} // singleton

  public int run(String[] args) throws Exception {
    if (args.length < 3) {
      System.out.println("Grep <inDir> <outDir> <regex> [<group>]");
      ToolRunner.printGenericCommandUsage(System.out);
      return -1;
    }

    Path tempDir =
      new Path("grep-temp-"+
          Integer.toString(new Random().nextInt(Integer.MAX_VALUE)));

    JobConf grepJob = new JobConf(getConf(), Grep.class);

    try {

      grepJob.setJobName("grep-search");
      FileInputFormat.setInputPaths(grepJob, args[0]);

      grepJob.setMapperClass(RegexMapper.class);
      grepJob.set("mapred.mapper.regex", args[2]);
      if (args.length == 4)
        grepJob.set("mapred.mapper.regex.group", args[3]);

      grepJob.setCombinerClass(LongSumReducer.class);
      grepJob.setReducerClass(LongSumReducer.class);

      FileOutputFormat.setOutputPath(grepJob, tempDir);
      grepJob.setOutputFormat(SequenceFileOutputFormat.class);
      grepJob.setOutputKeyClass(Text.class);
      grepJob.setOutputValueClass(LongWritable.class);

      JobClient.runJob(grepJob);

      JobConf sortJob = new JobConf(getConf(), Grep.class);
      sortJob.setJobName("grep-sort");

      FileInputFormat.setInputPaths(sortJob, tempDir);
      sortJob.setInputFormat(SequenceFileInputFormat.class);

      sortJob.setMapperClass(InverseMapper.class);

      sortJob.setNumReduceTasks(1); // write a single file
      FileOutputFormat.setOutputPath(sortJob, new Path(args[1]));
      sortJob.setOutputKeyComparatorClass // sort by decreasing freq
        (LongWritable.DecreasingComparator.class);

      JobClient.runJob(sortJob);
    }
    finally {
      FileSystem.get(grepJob).delete(tempDir, true);
    }
    return 0;
  }

  public static void main(String[] args) throws Exception {
    int res = ToolRunner.run(new Configuration(), new Grep(), args);
    System.exit(res);
  }

}
1 Answer

  1. Editorial Team
    Added an answer on June 9, 2026 at 3:43 am

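    The output contains the statistics of two separate jobs: job_201208030925_0011 and job_201208030925_0012. Each job reports its own map and reduce progress, which is why the percentages appear twice. The driver class shows where the two jobs come from: run() calls JobClient.runJob twice, first for "grep-search", which counts the regex matches and writes them to a temporary directory, and then for "grep-sort", which reads that temporary output and sorts it by decreasing frequency. The log agrees with this: the first job writes 391 bytes and the second job reads 391 bytes.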
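
As a rough illustration of why the example needs two passes, here is a minimal plain-Java sketch of the same two stages. It does not use Hadoop at all, and the names TwoStageGrepSketch, countMatches and sortByCountDescending are made up for this illustration. The first method stands in for the "grep-search" job (count every regex match), and the second stands in for the "grep-sort" job (order the counts by decreasing frequency). The sort can only start once all counts are known, which is roughly why the Hadoop example chains two jobs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Plain-Java illustration only; none of these names come from Hadoop.
class TwoStageGrepSketch {

    // Stage 1 (stands in for "grep-search"): count each distinct regex match.
    static Map<String, Long> countMatches(List<String> lines, String regex) {
        Pattern pattern = Pattern.compile(regex);
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher matcher = pattern.matcher(line);
            while (matcher.find()) {
                // Add 1 to the running total for this matched text.
                counts.merge(matcher.group(), 1L, Long::sum);
            }
        }
        return counts;
    }

    // Stage 2 (stands in for "grep-sort"): order the counts by decreasing frequency.
    static List<Map.Entry<String, Long>> sortByCountDescending(Map<String, Long> counts) {
        List<Map.Entry<String, Long>> sorted = new ArrayList<>(counts.entrySet());
        sorted.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("woe is me", "no match here", "woe is me", "ah, woe");
        // The second stage consumes the complete output of the first stage.
        Map<String, Long> counts = countMatches(lines, "woe( is me)?");
        for (Map.Entry<String, Long> entry : sortByCountDescending(counts)) {
            System.out.println(entry.getValue() + "\t" + entry.getKey());
        }
    }
}
```

In the real example, each of these stages is a full MapReduce job with its own map and reduce phases, so each one prints its own progress lines.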

© 2021 The Archive Base. All Rights Reserved