The Archive Base


Editorial Team
Asked: May 16, 2026

Please tell me how HBase partitions a table across region servers.

For example, let’s say my row keys are integers from 0 to 10M and I have 10 region servers.
Does this mean that the first region server will store all rows with keys 0 – 1M, the second 1M – 2M, the third 2M – 3M, …, and the tenth 9M – 10M?

I would like my row key to be a timestamp, but in that case most queries would apply to the latest dates, so all queries would be processed by only one region server. Is that true?

Or maybe this data would be spread differently?
Or can I somehow create more regions than I have region servers, so that (in the example above) server 1 would hold keys 0 – 0.5M and 3M – 3.5M? That way my data would be spread more evenly. Is this possible?


update

I just found that there’s an option, hbase.hregion.max.filesize. Do you think this will solve my problem?

1 Answer

  1. Editorial Team
    Added an answer on May 16, 2026 at 6:33 am

    With regard to partitioning, you can read Lars’ blog post on HBase’s architecture or Google’s Bigtable paper, which HBase “clones”.

    If your row key is only a timestamp, then yes, the region holding the biggest keys will always be hit by new requests (since a region is served by only a single region server at a time).
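    To make the range partitioning concrete, here is a minimal sketch in Python rather than HBase client code. It assumes ten regions with hypothetical split points at every 1M and treats row keys as plain integers. In HBase itself, row keys are byte arrays compared lexicographically, so integer keys would need a fixed-width encoding to sort this way.

```python
from bisect import bisect_right

# Hypothetical split points: 10 regions over integer keys 0..10M.
# Region i holds keys in [REGION_START_KEYS[i], REGION_START_KEYS[i + 1]);
# the last region has no upper bound.
REGION_START_KEYS = [i * 1_000_000 for i in range(10)]


def region_for(row_key: int) -> int:
    """Return the index of the region whose key range contains row_key."""
    return bisect_right(REGION_START_KEYS, row_key) - 1


# The most recent keys (think: latest timestamps) sit at the top of the range.
latest_keys = range(9_999_000, 10_000_000)
regions_hit = {region_for(k) for k in latest_keys}
print(regions_hit)  # {9}
```

    Every one of the latest keys resolves to the same region, so whichever server hosts that region takes all of the traffic for recent data.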

    Do you want to use timestamps in order to do short scans? If so, consider salting your keys (search Google for how Mozilla did it with Socorro).
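    As a rough illustration of salting (not Mozilla's exact scheme), the sketch below prefixes each fixed-width timestamp with a one-byte salt derived from a hash of the timestamp. The bucket count, the use of MD5, and the function names are all assumptions made for this example. The trade-off is that a time-range scan has to be issued once per bucket.

```python
import hashlib

NUM_BUCKETS = 4  # assumed number of salt buckets; tune to the cluster


def salted_key(timestamp_ms: int) -> bytes:
    """Prefix a fixed-width timestamp with a one-byte salt derived from it."""
    ts = timestamp_ms.to_bytes(8, "big")  # fixed width: byte order == numeric order
    salt = hashlib.md5(ts).digest()[0] % NUM_BUCKETS
    return bytes([salt]) + ts


def scan_ranges(start_ms: int, stop_ms: int) -> list:
    """One (start_key, stop_key) pair per bucket for the range [start_ms, stop_ms)."""
    lo = start_ms.to_bytes(8, "big")
    hi = stop_ms.to_bytes(8, "big")
    return [(bytes([b]) + lo, bytes([b]) + hi) for b in range(NUM_BUCKETS)]


# Consecutive timestamps no longer share a common key prefix.
start, stop = 1_700_000_000_000, 1_700_000_001_000
buckets_used = {salted_key(ts)[0] for ts in range(start, stop)}
print(sorted(buckets_used))
```

    Because the salt is computed from the key itself, a reader who knows the timestamp can rebuild the full row key for a point lookup. A short scan is answered by running the ranges from `scan_ranges` against each bucket and merging the results.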

    Can you prefix the timestamp with any ID? For example, if you only request data for specific users, then prefix the timestamp with that user ID and it will give you a much better load distribution.
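    A composite key of this kind could look like the sketch below. The 8-byte widths and the function names are illustrative assumptions; the point is only that a fixed-width user ID in front keeps each user's rows contiguous and sorted by time, while different users land in different parts of the key space.

```python
def user_ts_key(user_id: int, timestamp_ms: int) -> bytes:
    """Composite row key: fixed-width user ID followed by fixed-width timestamp."""
    return user_id.to_bytes(8, "big") + timestamp_ms.to_bytes(8, "big")


def user_scan_range(user_id: int, start_ms: int, stop_ms: int) -> tuple:
    """Start key (inclusive) and stop key (exclusive) for one user's time range."""
    return user_ts_key(user_id, start_ms), user_ts_key(user_id, stop_ms)


# Sorting the keys the way a table would store them groups rows by user first.
keys = sorted(user_ts_key(u, t) for u in (42, 7) for t in (3000, 1000, 2000))

start_key, stop_key = user_scan_range(42, 1000, 3000)
matching = [k for k in keys if start_key <= k < stop_key]
print(len(matching))  # 2
```

    A short scan for one user then touches a single contiguous key range, and writes from many users are spread over many ranges instead of all landing at the end of the table.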

    If not, then use UUIDs or anything else that will randomly distribute your keys.

    About hbase.hregion.max.filesize

    Setting the max file size on that table (which you can do with the shell) doesn’t mean that each region will be exactly X MB (where X is the value you set). Say your row keys are all timestamps, so each new row key is bigger than the previous one. Every new row is then inserted into the region with the empty end key (the last one). At some point, one of its files will grow bigger than the max file size (through compactions), and that region will be split around the middle: the lower keys go into one region, the higher keys into another. But since each new row key is always bigger than the previous one, you will only ever write to that new upper region (and so on after every later split).
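    The splitting behaviour described above can be imitated with a toy model. This is a simplified simulation, not HBase code: it counts rows instead of store file bytes, splits immediately instead of after compactions, and the `Table` class and `MAX_ROWS_PER_REGION` threshold are made up for the illustration.

```python
import random
from bisect import bisect_right, insort

MAX_ROWS_PER_REGION = 100  # stand-in for the max file size threshold (assumption)


class Table:
    def __init__(self):
        self.start_keys = [0]  # one region at first, covering every key >= 0
        self.rows = [[]]       # rows[i] holds the sorted keys of region i

    def put(self, key: int) -> bool:
        """Insert key; return True if it went to the last region at write time."""
        i = bisect_right(self.start_keys, key) - 1
        was_last = i == len(self.start_keys) - 1
        insort(self.rows[i], key)
        if len(self.rows[i]) > MAX_ROWS_PER_REGION:
            self._split(i)
        return was_last

    def _split(self, i: int) -> None:
        """Split region i around its middle key into a lower and an upper region."""
        region = self.rows[i]
        mid = len(region) // 2
        self.rows[i:i + 1] = [region[:mid], region[mid:]]
        self.start_keys.insert(i + 1, region[mid])


# Monotonically increasing keys, like timestamps.
sequential = Table()
sequential_hits = sum(sequential.put(ts) for ts in range(1, 1001))

# The same keys written in random order.
shuffled_keys = list(range(1, 1001))
random.Random(0).shuffle(shuffled_keys)
shuffled = Table()
shuffled_hits = sum(shuffled.put(k) for k in shuffled_keys)

print(len(sequential.start_keys), sequential_hits, shuffled_hits)
```

    In this model the sequential run creates many regions, yet every single write still targets the last one, whereas the shuffled run spreads its writes over the regions that exist at the time.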

    tl;dr: even if you have more than 1,000 regions, with this schema the region with the biggest row keys will always get the writes, which means that the region server hosting it will become a bottleneck.

