The Archive Base

Editorial Team
Asked: June 3, 2026

We are trying to grab the total number of input paths our MapReduce program

We are trying to grab the total number of input paths our MapReduce program is iterating through in our mapper. We are going to use this along with a counter to format our value depending on the index. Is there an easy way to pull the total input path count from the mapper? Thanks in advance.

1 Answer

  1. Editorial Team
    Added an answer on June 3, 2026 at 11:54 pm

    You could look through the source for FileInputFormat.getSplits(). It reads the mapred.input.dir configuration property and resolves that comma-separated value into an array of Paths.
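
    To make the "comma-separated value to paths" step concrete, here is a minimal sketch in plain Java. It is not Hadoop's implementation (which also deals with escaped commas); the class and method names are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a simplified stand-in for how an input-dir
// property value could be split into individual path strings.
final class InputDirCsv {
    /** Splits a comma-separated input-dir value into individual path strings. */
    static List<String> toPaths(String csv) {
        List<String> paths = new ArrayList<>();
        if (csv == null) {
            return paths;
        }
        for (String part : csv.split(",")) {
            String trimmed = part.trim();
            // Skip empty segments such as those produced by "a,,b" or "".
            if (!trimmed.isEmpty()) {
                paths.add(trimmed);
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        // Entries may still be directories or patterns at this stage.
        System.out.println(toPaths("/logs/2012/*,/logs/2013/01"));
    }
}
```

    Note that the number of entries here is only the number of configured paths, not the number of files they expand to.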

    These paths can still be directories or glob patterns, so getSplits() relies on the protected method org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(JobContext) to resolve them. That method goes through each directory or pattern and lists the matching files, also applying a PathFilter if one is configured.
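
    The expand-then-filter idea can be sketched with the JDK's own glob support. This is only an analogy on the local filesystem, assuming java.nio.file rather than Hadoop's FileSystem API; GlobListing and demoCount are hypothetical names, and the extra filter merely stands in for a PathFilter.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical local-filesystem analogy for "expand a pattern, then filter".
final class GlobListing {
    /** Lists regular files in dir whose names match glob and pass extraFilter. */
    static List<Path> list(Path dir, String glob,
            DirectoryStream.Filter<Path> extraFilter) throws IOException {
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p) && extraFilter.accept(p)) {
                    matches.add(p);
                }
            }
        }
        return matches;
    }

    /** Builds a throwaway directory and counts the files that survive the filter. */
    static int demoCount() {
        try {
            Path dir = Files.createTempDirectory("glob-demo");
            Files.createFile(dir.resolve("part-00000"));
            Files.createFile(dir.resolve("part-00001"));
            Files.createFile(dir.resolve("_SUCCESS"));
            // The filter drops names starting with "_", leaving the two part files.
            return list(dir, "*",
                    p -> !p.getFileName().toString().startsWith("_")).size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demoCount());
    }
}
```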

    Because this method is protected, you could create a simple ‘dummy’ extension of FileInputFormat that exposes a public listStatus method, accepting the Mapper.Context (which is a JobContext) as its argument and in turn wrapping a call to FileInputFormat.listStatus:

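    import java.io.IOException;
    import java.util.List;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.mapreduce.InputSplit;
    import org.apache.hadoop.mapreduce.JobContext;
    import org.apache.hadoop.mapreduce.RecordReader;
    import org.apache.hadoop.mapreduce.TaskAttemptContext;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

    public class DummyFileInputFormat extends FileInputFormat<Object, Object> {
        // Widen the protected listStatus to public; a Mapper.Context is a JobContext.
        @Override
        public List<FileStatus> listStatus(JobContext mapContext) throws IOException {
            return super.listStatus(mapContext);
        }

        @Override
        public RecordReader<Object, Object> createRecordReader(InputSplit split,
                TaskAttemptContext context) throws IOException,
                InterruptedException {
            // Dummy input format, so this will never be called.
            return null;
        }
    }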
    

    EDIT: In fact, it looks like FileInputFormat already does this for you: it sets the job property mapreduce.input.num.files at the end of the getSplits() method (at least in 1.0.2; probably introduced in 0.20.203).
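
    Reading that property defensively might look like the sketch below. A plain Map stands in for the job configuration so the example runs without Hadoop; in a real mapper the value would presumably come from context.getConfiguration() instead. The class name, method name, and the -1 fallback are assumptions, and the property key is the one named above, which may differ in other Hadoop versions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup helper; a Map stands in for the Hadoop job configuration.
final class InputFileCountLookup {
    // Key as named in the answer; other Hadoop versions may use a different key.
    static final String NUM_INPUT_FILES = "mapreduce.input.num.files";

    /** Returns the recorded input file count, or -1 if it is missing or malformed. */
    static long inputFileCount(Map<String, String> jobProperties) {
        String raw = jobProperties.get(NUM_INPUT_FILES);
        if (raw == null) {
            return -1L;
        }
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            return -1L;
        }
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put(NUM_INPUT_FILES, "7");
        System.out.println(inputFileCount(props));
    }
}
```

    Falling back to -1 lets the mapper detect a missing property and, for example, fall back to the listStatus approach described earlier.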

    Here’s the JIRA ticket
