I’m using HBase through Java API to manage a list of URLs and parameters


Asked: June 13, 2026


I’m using HBase through the Java API to manage a list of URLs and parameters waiting to be analyzed by a multi-threaded scraper. The program also continuously adds new rows to the table.

I need to repeatedly read exactly one row from the table and delete it, atomically (a row mustn’t be read by two threads at the same time). It doesn’t matter which row is selected.

Currently, I create a Scanner and retrieve the first result:

Scan s = new Scan();
ResultScanner ss = t.getScanner(s);
for (Result r : ss) {
  String ris = Bytes.toString(r.getRow()) + Bytes.toString(r.getValue(Bytes.toBytes("TTL"), Bytes.toBytes("value")));
  // delete the retrieved row
  t.delete(new Delete(r.getRow()));
  ss.close();
  // return at the first iteration, after closing the Scanner
  return ris;
}

But the program is extremely slow (10 or more seconds to get a value), and I have no idea how to do this properly. How can I “consume” a single row in HBase?

EDIT: I forgot to mention that the table has a single column family with a single column.


1 Answer

Editorial Team · Answer 1 · 2026-06-13T14:30:09+00:00


You can try a Scan with a filter. I think PageFilter is suitable for your purposes, since it limits the number of rows the scan returns.
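A minimal sketch of what that could look like, assuming an HBase 1.0+ client (the `Table` interface) and reusing the "TTL" family and "value" qualifier from the question. The class name `SingleRowConsumer` and both method names are hypothetical, chosen only for illustration. Setting the scanner caching to 1 is an addition on top of the PageFilter suggestion, on the assumption that prefetching a batch of rows contributes to the delay.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.PageFilter;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical helper class; not part of the HBase API.
public class SingleRowConsumer {
    private static final byte[] FAMILY = Bytes.toBytes("TTL");
    private static final byte[] QUALIFIER = Bytes.toBytes("value");

    /** Builds a scan that asks each region server for at most one row. */
    public static Scan buildSingleRowScan() {
        Scan scan = new Scan();
        scan.addColumn(FAMILY, QUALIFIER);
        scan.setFilter(new PageFilter(1)); // page size of one row
        scan.setCaching(1);                // fetch one row per RPC, no batch prefetch
        return scan;
    }

    /** Reads the first row, deletes it, and returns key + value; null if the table is empty. */
    public static String consumeOne(Table t) throws IOException {
        // try-with-resources closes the scanner even if the delete throws
        try (ResultScanner scanner = t.getScanner(buildSingleRowScan())) {
            Result r = scanner.next();
            if (r == null) {
                return null;
            }
            byte[] row = r.getRow();
            byte[] value = r.getValue(FAMILY, QUALIFIER);
            t.delete(new Delete(row));
            return Bytes.toString(row) + Bytes.toString(value);
        }
    }
}
```

PageFilter is applied separately on each region server, so the limit holds per region rather than across the whole table; reading only the first result and closing the scanner sidesteps that. Note that this sketch only makes the read cheaper. It does not by itself stop two threads from picking the same row, so a conditional delete (for example `checkAndDelete` in the 1.x client API) or some other claim step would still be needed for the atomicity the question asks for.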
