From what I understand, CouchDB’s Btree implementation actually uses Shadowing technique, and every update will produce new root, the following excerpts from this PDF (it looks like implementing a better algorithm than traditional shadowing).
Shadowing means that to update an
on-disk page, the entire page is read
into memory, modified, and later
written to disk at an alternate
location. When a page is shadowed its
location on disk changes, this creates
a need to update (and shadow) the
immediate ancestor of the page with
the new address. Shadowing propagates
up to the file system root.
How does couchdb implement fetching all leaf revisions as possible as it can( since some revisions are removed by compacting process)? Does couch internally store a pointer which points to previous revisions?
Thanks
Chang
By good luck, CouchDB committer and community leader Adam Kocoloski explained this recently on the mailing list.
Here is what he said:
“Each leaf in the ID btree [stores] a revision tree containing pointers
to all available revisions of a document. Retrieving an old revision (before compaction)
or a conflicting version of a document requires exactly the same number of IOs as retrieving
the current one.”
If I understand correctly, shadowing is not used to conceal old document revisions at all, but rather entire revision trees that are no longer meaningful.