How can HBase maintain an ordered index if HDFS is read-only (or append-only in the best case)?
Does it store its indexes in HDFS or some permanent store?
[EDIT] For example, let's say I added the rows F, B, A, E in that order. Since HDFS can only append, I suspect the order of the rows on disk will match the insert order. But how does it maintain its index, or keep its keys ordered, given that the area where the keys are stored is write-once?
HBase doesn’t have indexes. It has ordered keys (roughly equivalent to a clustered index in SQL Server, or index-organized tables in Oracle, but without the b-tree), which are maintained using ordered partitioning and timestamped writes.
Writes go first to the memstore, which keeps them sorted by key in memory, so rows inserted as F, B, A, E are already in key order by the time they reach disk. HFiles are flushed to disk only when the memstore reaches a certain (configurable) size, and you are right: they are written only once and never modified afterwards. When HBase runs a compaction, multiple files are read off disk, merged, and rewritten as one larger combined file. The smaller files are then deleted.
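To make the flush-and-compact idea concrete, here is a minimal toy sketch in Python. It is not HBase code: `ToyStore`, `FLUSH_THRESHOLD`, and the tuple layout are hypothetical names made up for illustration, and immutable tuples stand in for write-once files on HDFS. It only shows how sorted, never-modified files plus timestamps can give ordered reads.

```python
import heapq
from itertools import groupby

FLUSH_THRESHOLD = 2  # hypothetical: flush once this many rows are buffered


def merge_newest(runs):
    """Merge key-sorted runs of (key, ts, value); keep the newest cell per key."""
    merged = heapq.merge(*runs, key=lambda cell: (cell[0], -cell[1]))
    # Within each key the highest timestamp comes first, so take the first cell.
    return tuple(next(group) for _, group in groupby(merged, key=lambda cell: cell[0]))


class ToyStore:
    def __init__(self):
        self.memstore = {}  # key -> (ts, value); mutable, lives in memory
        self.hfiles = []    # immutable sorted tuples, standing in for files
        self.clock = 0      # logical timestamp for each write

    def put(self, key, value):
        self.clock += 1
        self.memstore[key] = (self.clock, value)
        if len(self.memstore) >= FLUSH_THRESHOLD:
            self.flush()

    def _sorted_memstore(self):
        return tuple((k, ts, v) for k, (ts, v) in sorted(self.memstore.items()))

    def flush(self):
        # Sort in memory, then write the whole file once; it is never edited again.
        if self.memstore:
            self.hfiles.append(self._sorted_memstore())
            self.memstore = {}

    def compact(self):
        # Read the small files, merge them, write one bigger file, drop the old ones.
        if self.hfiles:
            self.hfiles = [merge_newest(self.hfiles)]

    def scan(self):
        # Reads merge every file plus the memstore; newest timestamp wins.
        cells = merge_newest(self.hfiles + [self._sorted_memstore()])
        return [(key, value) for key, _, value in cells]


store = ToyStore()
for key in "FBAE":              # insert order from the question
    store.put(key, "v-" + key)
store.put("B", "v-B2")          # newer version of B, still in the memstore
print(store.scan())             # rows come back in key order
store.flush()
store.compact()
print(len(store.hfiles))
```

Note that no file is ever updated in place: the newer value for B simply carries a higher timestamp, and the old one disappears only when a compaction rewrites the files.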
In the meantime, the Write Ahead Log (WAL) is written to HDFS periodically (10s by default), and contains the ordered set of edits for a given regionserver. I believe that the WAL requires HDFS Append to work properly.
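As a rough illustration of why an append-only log is enough here, the sketch below (again a toy, not HBase's WAL format; `log_edit`, `replay`, and the JSON-lines layout are assumptions made for the example) appends edits in arrival order and rebuilds the in-memory state by replaying them. The log itself is never sorted by key; ordering is only imposed when the edits are loaded back into memory.

```python
import json
import os
import tempfile


def log_edit(wal_path, key, value):
    # Append-only: bytes already in the log are never rewritten.
    with open(wal_path, "a", encoding="utf-8") as wal:
        wal.write(json.dumps({"key": key, "value": value}) + "\n")


def replay(wal_path):
    # Re-apply edits in log (arrival) order; a later edit to a key wins.
    memstore = {}
    with open(wal_path, encoding="utf-8") as wal:
        for line in wal:
            edit = json.loads(line)
            memstore[edit["key"]] = edit["value"]
    return memstore


wal_path = os.path.join(tempfile.mkdtemp(), "toy.wal")
for key in "FBAE":                 # insert order from the question
    log_edit(wal_path, key, "v-" + key)
log_edit(wal_path, "B", "v-B2")    # a later edit to an existing row

# Pretend the server crashed and the memstore was lost: rebuild it from the log.
recovered = replay(wal_path)
print(sorted(recovered.items()))   # key order is restored in memory
```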
All of this, and a lot more, is covered in my presentation on HBase: http://www.slideshare.net/trihug/intro-to-apache-hbase-by-chris-shain-of-tresata. See also http://outerthought.org/blog/465-ot.html and http://outerthought.org/blog/417-ot.html