This may be a silly question, but I couldn’t find out how to “index” the row key in HBase. I am assuming that when HBase stores a row, it has built-in support to automatically index the table by the row key. In other words, is the row key automatically treated as the primary key?
thanks!
The table is not just indexed by the key; it is actually lexicographically ordered by the key. That is, HBase knows which region server hosts each key and, within that region server, the region and the specific HFile(s). The data written to an HFile is ordered by the key.
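To make the lookup idea concrete, here is a rough sketch in plain Python (not the HBase client). It assumes each region covers a contiguous key range identified by its start key; the start keys and the server names `rs1`..`rs3` are made up for illustration, and the real lookup goes through HBase's own metadata rather than a list like this.

```python
from bisect import bisect_right

# Hypothetical layout: each region covers [start_key, next_start_key).
# The first region starts at the empty key.
REGION_START_KEYS = [b"", b"g", b"p"]
REGION_SERVERS = ["rs1", "rs2", "rs3"]  # made-up server names


def locate_region(row_key: bytes) -> str:
    """Return the (hypothetical) server whose region range contains row_key."""
    # bisect_right finds the first start key greater than row_key;
    # the region just before that position is the one that owns the key.
    idx = bisect_right(REGION_START_KEYS, row_key) - 1
    return REGION_SERVERS[idx]
```

Because the start keys are sorted, finding the owner of a key is a binary search rather than a scan over all regions, which is the point of keeping everything ordered by key.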
The lexicographic ordering means you can also retrieve data by partial key: a scan for “a” will return every row whose key starts with “a”. This is often used to put multiple dimensions in the key. For example, you can set the key to the country followed by the city, which lets you get aggregates per country and then a breakdown by city efficiently.
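Here is a small simulation of the country-plus-city idea, again in plain Python rather than the HBase API. The `|` separator, the sample keys, and the counts are all assumptions for the example; the sorted list stands in for the key-ordered storage.

```python
from bisect import bisect_left

# Hypothetical table: row key is b"<country>|<city>", value is some count.
TABLE = {
    b"US|Boston": 4,
    b"FR|Paris": 7,
    b"UK|London": 5,
    b"US|Austin": 2,
    b"FR|Lyon": 3,
}
# Keys kept in lexicographic (byte) order, as HBase stores them.
SORTED_KEYS = sorted(TABLE)


def prefix_scan(prefix: bytes):
    """Return (key, value) pairs for all keys starting with prefix, in key order."""
    # Jump straight to the first key >= prefix, then read until the prefix stops matching.
    start = bisect_left(SORTED_KEYS, prefix)
    results = []
    for key in SORTED_KEYS[start:]:
        if not key.startswith(prefix):
            break
        results.append((key, TABLE[key]))
    return results


def country_total(country: bytes) -> int:
    """Aggregate the values of every city under one country prefix."""
    return sum(value for _, value in prefix_scan(country + b"|"))
```

Calling `prefix_scan(b"US|")` only touches the rows for that country, and `country_total(b"FR")` sums over its cities. Including the separator in the prefix keeps a short country code from also matching a longer one that starts with the same letters.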