The events collection stores a userId and an array of events; each element in the array is an embedded document. Example:
{
    "_id" : ObjectId("4f8f48cf5f0d23945a4068ca"),
    "events" : [
        {
            "eventType" : "profile-updated",
            "eventId" : "247266",
            "eventDate" : ISODate("1938-04-27T23:05:51.451Z")
        },
        {
            "eventType" : "login",
            "eventId" : "64531",
            "eventDate" : ISODate("1948-05-15T23:11:37.413Z")
        }
    ],
    "userId" : "junit-19568842"
}
I am using a query like the one below to find events generated in the last 30 days:
db.events.find( { events : { $elemMatch: { "eventId" : 201,
"eventDate" : {$gt : new Date(1231657163876) } } } } ).explain()
The query plan shows that the index on "events.eventDate" is used when the test data contains fewer events (around 20):
{
    "cursor" : "BtreeCursor events.eventDate_1",
    "nscanned" : 0,
    "nscannedObjects" : 0,
    "n" : 0,
    "millis" : 0,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : true,
    "indexOnly" : false,
    "indexBounds" : {
        "events.eventDate" : [
            [
                ISODate("2009-01-11T06:59:23.876Z"),
                ISODate("292278995-01--2147483647T07:12:56.808Z")
            ]
        ]
    }
}
However, when there is a large number of events (around 500), the index is not used:
{
    "cursor" : "BasicCursor",
    "nscanned" : 4,
    "nscannedObjects" : 4,
    "n" : 0,
    "millis" : 0,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : false,
    "indexOnly" : false,
    "indexBounds" : {
    }
}
Why is the index not being used when there are a lot of events? Maybe, when there is a large number of events, MongoDB finds it more efficient to just scan all the items than to use the index?
MongoDB's query optimizer works in an unusual way. Rather than estimating the cost of each candidate query plan, it simply launches all available plans. Whichever plan returns first is considered the optimal one and is reused for future queries.
As the application grows and the data grows and changes, the chosen plan may stop being optimal at some point. So MongoDB repeats the plan selection process every once in a while.
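The selection process described above can be pictured with a toy model in plain JavaScript. This is only a sketch, not MongoDB code: pickPlanByRacing and makePlanCache are hypothetical names, the "cost" numbers are made up, and re-testing after a fixed number of uses is an assumption standing in for "every once in a while".

```javascript
// Toy model only: none of this is MongoDB's actual implementation.
// A plan is { name, cost }, where cost is the (made-up) number of
// work units the plan needs before it can return its results.
function pickPlanByRacing(plans) {
  if (plans.length === 0) {
    throw new Error("no candidate plans");
  }
  // Advance every plan one unit per round; the first to finish wins.
  var progress = plans.map(function () { return 0; });
  for (;;) {
    for (var i = 0; i < plans.length; i++) {
      progress[i] += 1;
      if (progress[i] >= plans[i].cost) {
        return plans[i].name;
      }
    }
  }
}

// Remembers the winner and races again after `retestEvery` uses.
// The fixed-count trigger is an assumption made for illustration.
function makePlanCache(retestEvery) {
  var cached = null;
  var uses = 0;
  return function choose(plans) {
    if (cached === null || uses >= retestEvery) {
      cached = pickPlanByRacing(plans);
      uses = 0;
    }
    uses += 1;
    return cached;
  };
}

var indexName = "BtreeCursor events.eventDate_1";
var fewDocs = [{ name: indexName, cost: 120 }, { name: "BasicCursor", cost: 4 }];
var manyDocs = [{ name: indexName, cost: 120 }, { name: "BasicCursor", cost: 50000 }];

var choose = makePlanCache(2);
console.log(choose(fewDocs));  // BasicCursor (winner of the race)
console.log(choose(manyDocs)); // BasicCursor (cached plan reused)
console.log(choose(manyDocs)); // BtreeCursor events.eventDate_1 (raced again)
```

The second call shows why a cached plan can be stale for a while: the data changed, but the previous winner is still used until the next race.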
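It appears that in this particular case the basic scan was the most efficient plan. The second explain() output shows "nscanned" : 4, so the full scan only had to look at 4 documents.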
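One way to check this is to force each plan with hint() and compare the explain() output side by side. The helper below is a minimal sketch: comparePlans is a hypothetical name, and it assumes the explain() fields shown in the question ("cursor", "nscanned", "millis"), which may differ in other MongoDB versions.

```javascript
// Hypothetical helper, not part of MongoDB: runs the same query twice,
// once forced onto the given index and once forced onto a collection
// scan, and returns the explain() fields that matter for comparison.
function comparePlans(collection, query, indexSpec) {
  function summarize(explainDoc) {
    return {
      cursor: explainDoc.cursor,
      nscanned: explainDoc.nscanned,
      millis: explainDoc.millis
    };
  }
  // hint(indexSpec) forces the index; hint({ $natural: 1 }) forces a scan.
  var withIndex = collection.find(query).hint(indexSpec).explain();
  var withScan = collection.find(query).hint({ $natural: 1 }).explain();
  return { index: summarize(withIndex), scan: summarize(withScan) };
}

// Usage in the mongo shell (needs a running server, so left as a comment):
// comparePlans(
//   db.events,
//   { events : { $elemMatch: { "eventId" : 201,
//       "eventDate" : { $gt : new Date(1231657163876) } } } },
//   { "events.eventDate" : 1 }
// )
```

If the forced index plan reports a much larger nscanned than the scan, that would be consistent with the optimizer preferring BasicCursor on this data set.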
Link: http://www.mongodb.org/display/DOCS/Query+Optimizer