I have a simple map/reduce job that scans one hbase table, and modifies another

Asked: May 31, 2026 (2026-05-31T17:40:50+00:00)

I have a simple map/reduce job that scans one HBase table and modifies another HBase table. The Hadoop job seems to complete successfully, but when I check the HBase table, the entry does not appear.

Here is the hadoop program:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class HBaseInsertTest extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        String table = "duplicates";

        Scan scan = new Scan();
        scan.setCaching(500);
        scan.setCacheBlocks(false);

        Job job = new Job(getConf(), "HBaseInsertTest");
        job.setJarByClass(HBaseInsertTest.class);

        TableMapReduceUtil.initTableMapperJob(table, scan, Mapper.class, /* mapper output key = */null,
            /* mapper output value= */null, job);
        TableMapReduceUtil.initTableReducerJob(/* output table = */"tablecopy", /* reducer class = */null, job);

        job.setNumReduceTasks(0);

        // Note that these are the default.
        job.setOutputFormatClass(NullOutputFormat.class);

        return job.waitForCompletion(true) ? 0 : 1;
    }

    private static class Mapper extends TableMapper<ImmutableBytesWritable, Put> {
        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            super.setup(context);
        }

        @Override
        public void map(ImmutableBytesWritable row, Result columns, Context context) throws IOException {
            long id = 1260018L;

            try {
                Put put = new Put(Bytes.toBytes(id));
                put.add(Bytes.toBytes("mapping"), Bytes.toBytes("foo"), Bytes.toBytes("bar"));
                context.write(row, put);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration config = HBaseConfiguration.create();
        int res = ToolRunner.run(config, new HBaseInsertTest(), args);
        System.exit(res);
    }
}

From HBase shell:

hbase(main):008:0> get 'tablecopy', '1260018', 'mapping'
COLUMN                          CELL                                                                                    
0 row(s) in 0.0100 seconds

I've simplified the program a lot to try to isolate and demonstrate the problem. I'm also relatively new to both Hadoop and HBase. I did verify that mapping is a column family that exists in the tablecopy table.

1 Answer

Editorial Team · Answer 1 · 2026-05-31T17:40:51+00:00

I think the problem is the row key you queried for:
hbase(main):008:0> get 'tablecopy', '1260018', 'mapping'

Instead, try querying like this:
hbase(main):008:0> get 'tablecopy', 1260018, 'mapping'

Because of the quotes, HBase treats the key in your query as a string, whereas the mapper writes the row key as a long via Bytes.toBytes(id). Also, if you ran a simple client program to retrieve this key from HBase, it would return the values correctly, provided the row is already present.
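
To make the key mismatch concrete, here is a minimal sketch that compares the bytes the mapper writes with the bytes a quoted shell key stands for. It uses only the JDK: ByteBuffer.putLong is assumed here as a stand-in for Bytes.toBytes(long), since both produce a big-endian 8-byte value. The class name RowKeyEncoding and the helper toShellLiteral are hypothetical names introduced for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RowKeyEncoding {

    // Big-endian 8-byte encoding of a long. Assumption: this mirrors what
    // Bytes.toBytes(long) produces for the row key in the mapper.
    static byte[] longKey(long id) {
        return ByteBuffer.allocate(Long.BYTES).putLong(id).array();
    }

    // Bytes of the decimal text form, which is what a quoted shell key denotes.
    static byte[] stringKey(String id) {
        return id.getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical helper: renders every byte as \xNN so the key can be
    // pasted inside a double-quoted string in the HBase shell.
    static String toShellLiteral(byte[] key) {
        StringBuilder sb = new StringBuilder();
        for (byte b : key) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] written = longKey(1260018L);       // what the job stores
        byte[] queried = stringKey("1260018");    // what the quoted get asks for

        System.out.println("written: " + toShellLiteral(written));
        System.out.println("queried: " + toShellLiteral(queried));
        System.out.println("same key? " + Arrays.equals(written, queried));
    }
}
```

The written key comes out as 8 bytes and the queried key as 7 ASCII digits, so the two can never match. If the numeric form of the query does not find the row either, one option is to pass the escaped bytes in a double-quoted string, for example get 'tablecopy', "\x00\x00\x00\x00\x00\x13\x39\xF2", 'mapping'. Separately, it may be worth checking whether the setOutputFormatClass(NullOutputFormat.class) call made after initTableReducerJob replaces the table output format, in which case the Puts would never reach the table.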
