I was reading the MongoDB 2.2 documentation on shard keys and found it a bit confusing. It says:
All sharded collections must have an index that starts with the shard
key. If you shard a collection that does not yet contain documents and
without such an index, the shardCollection will create an index on the
shard key. If the collection already contains documents, you must
create an appropriate index before using shardCollection.

Changed in version 2.2: The index on the shard key no longer needs to
be identical to the shard key. This index can be an index of the shard
key itself as before, or a compound index where the shard key is the
prefix of the index. This index cannot be a multikey index.

If you have a collection named people, sharded using the field
{ zipcode: 1 }, and you want to replace this with an index on the field
{ zipcode: 1, username: 1 }, then:

Create an index on { zipcode: 1, username: 1 }:
db.people.ensureIndex( { zipcode: 1, username: 1 } );

When MongoDB finishes building the index, you can safely drop existing
index on { zipcode: 1 }:
db.people.dropIndex( { zipcode: 1 } );

Warning: The index on the shard key cannot be a multikey index. As
above, an index on { zipcode: 1, username: 1 } can only replace an
index on zipcode if there are no array values for the username field.

If you drop the last appropriate index for the shard key, recover by
recreating a index on just the shard key.
I have a couple of questions about shard keys and indexes.
i) From the documentation, it looks like multikey indexes were supported before 2.2. If that is the case, how is a compound index different from a multikey index?
ii) What is the difference between having
[a] an index that starts with the shard key and
[b] an index which has the shard key as a prefix?
iii) What does the warning that the index on the shard key cannot be a multikey index mean?
Isn’t db.people.ensureIndex( { zipcode: 1, username: 1 } ) a multikey index?
How is a compound index different from a multikey index?:
A compound index is an index on more than one field, like the one you described in the example:
{ zipcode: 1, username: 1 }
A multikey index is one that indexes the items in an array, like an index on tags that is used to return all documents that contain the tag ‘mongoDB’.
What is the difference between having [a] an index that starts with a shard key and [b] an index which has a shard key as a prefix?:
Nothing.
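To make "starts with" and "prefix" concrete, here is a minimal sketch in plain JavaScript. The function isShardKeyPrefix is a hypothetical helper written for this answer, not a MongoDB API, and comparing the key specs literally (field name, order and value) is an assumption of the sketch.

```javascript
// Hypothetical helper, not part of MongoDB: returns true when every field
// of shardKey appears at the start of indexKey, in the same order and with
// the same value in the key spec.
function isShardKeyPrefix(shardKey, indexKey) {
  const shardFields = Object.entries(shardKey);
  const indexFields = Object.entries(indexKey);
  if (shardFields.length > indexFields.length) return false;
  return shardFields.every(([field, dir], i) =>
    indexFields[i][0] === field && indexFields[i][1] === dir);
}

// The shard key itself and a compound index that begins with it both qualify.
console.log(isShardKeyPrefix({ zipcode: 1 }, { zipcode: 1 }));              // true
console.log(isShardKeyPrefix({ zipcode: 1 }, { zipcode: 1, username: 1 })); // true
// An index where the shard key is not the leading field does not.
console.log(isShardKeyPrefix({ zipcode: 1 }, { username: 1, zipcode: 1 })); // false
```

Both phrasings in the documentation describe the first two cases.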
What is the warning note about an index on the shard key should not be a multikey index?:
This makes a fair bit of sense when you consider that a multikey index is an index on an array. Consider our index on a tags array. A document could easily live in many (or all) shards, if it had the right collection of values in the array.
In other words, documents still have to be sharded based on a single value, as opposed to an object or an array.
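The following sketch illustrates the point in plain JavaScript. It is a toy model, not MongoDB's real implementation: indexEntries, candidateShards, the two chunk ranges and the sample documents are all made up for illustration. It shows that a compound index over scalar fields yields one entry per document, while an index over an array field yields one entry per element, so a range-based lookup on an array "shard key" could point at more than one shard.

```javascript
// Toy model: list the index keys that one document contributes to an index.
function indexEntries(doc, indexSpec) {
  let entries = [[]];
  for (const field of Object.keys(indexSpec)) {
    const value = doc[field];
    // An array value fans out into one entry per element (multikey behaviour).
    const values = Array.isArray(value) ? value : [value];
    const next = [];
    for (const entry of entries) {
      for (const v of values) next.push([...entry, v]);
    }
    entries = next;
  }
  return entries;
}

// Hypothetical chunk ranges: each shard owns keys in [min, max).
const chunks = [
  { min: "", max: "j", shard: "shard0" },
  { min: "j", max: "\uffff", shard: "shard1" },
];

// Toy routing: every shard whose range contains one of the document's keys.
function candidateShards(doc, shardKeyField, chunks) {
  const keys = indexEntries(doc, { [shardKeyField]: 1 }).map(e => e[0]);
  const shards = keys.map(k => chunks.find(c => k >= c.min && k < c.max).shard);
  return [...new Set(shards)].sort();
}

const person = { zipcode: "10001", username: "alice" };
const post = { title: "Sharding", tags: ["mongoDB", "sharding", "indexes"] };

// Compound index, scalar fields: exactly one entry.
console.log(indexEntries(person, { zipcode: 1, username: 1 })); // [ [ '10001', 'alice' ] ]
// Index on an array field: one entry per tag.
console.log(indexEntries(post, { tags: 1 }).length);            // 3
// A scalar shard key maps to one shard; an array value maps to several.
console.log(candidateShards(person, "zipcode", chunks));        // [ 'shard0' ]
console.log(candidateShards(post, "tags", chunks));             // [ 'shard0', 'shard1' ]
```

This is also why { zipcode: 1, username: 1 } is a compound index rather than a multikey one, as long as neither field holds an array.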