I am having trouble understanding this issue – I have a sharded cluster in


Asked: June 11, 2026 (2026-06-11T19:57:50+00:00)


I am having trouble understanding this issue: I have a sharded cluster in which one of the shards (Shard 2) seems to use the wrong index. I'm querying by the shard key, which is site id and first request time: { "site.id": 1, frt: 1 }. I also have an index on site id and last request time (site.id_1_lrt_-1 in the explain output below).

In this query, I am also trying to limit the returned documents by a couple of boolean fields I have set in the document.

Having read the docs on how MongoDB's query optimizer works, the returned explain output seems especially odd to me. Docs here: Query Optimizer

I also included an explain from Shard 1, where the query behaves as expected. Lastly, if I use a site id which does not have chunks stored on Shard 2, it uses the correct index, though it has nothing to scan or return. I added the explain for this at the end for completeness.

Any ideas why this would happen and/or if this is a bug?

Basic query (bad index):

shard2:PRIMARY> db.visit.find({ "site.id": 128, "frt": { $gte: new Date(2012, 8, 24 ) }, "ue": false, "bot": false }).explain()
{
    "cursor" : "BtreeCursor site.id_1_lrt_-1",
    "isMultiKey" : false,
    "n" : 198,
    "nscannedObjects" : 61204,
    "nscanned" : 61204,
    "nscannedObjectsAllPlans" : 61537,
    "nscannedAllPlans" : 61537,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 122,
    "nChunkSkips" : 0,
    "millis" : 727,
    "indexBounds" : {
        "site.id" : [
            [
                128,
                128
            ]
        ],
        "lrt" : [
            [
                {
                    "$maxElement" : 1
                },
                {
                    "$minElement" : 1
                }
            ]
        ]
    },
    "server" : "ip-10-4-211-107:2200"
}

Supplying a Hint:

shard2:PRIMARY> db.visit.find({ "site.id": 128, "frt": { $gte: new Date(2012, 8, 24 ) }, "ue": false, "bot": false }).hint("site.id_1_frt_1").explain()
{
    "cursor" : "BtreeCursor site.id_1_frt_1",
    "isMultiKey" : false,
    "n" : 198,
    "nscannedObjects" : 486,
    "nscanned" : 486,
    "nscannedObjectsAllPlans" : 486,
    "nscannedAllPlans" : 486,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 5,
    "indexBounds" : {
        "site.id" : [
            [
                128,
                128
            ]
        ],
        "frt" : [
            [
                ISODate("2012-09-24T07:00:00Z"),
                ISODate("292278995-01--2147483647T07:12:56.808Z")
            ]
        ]
    },
    "server" : "ip-10-4-211-107:2200"
}
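
To put the two explain outputs above side by side, here is a small sketch in plain JavaScript. The figures are copied from the outputs; the helper name scannedPerResult is made up for illustration and is not part of the mongo shell.

```javascript
// Figures copied from the two explain() outputs above (Shard 2, site.id 128).
const plans = [
  { index: "site.id_1_lrt_-1", n: 198, nscanned: 61204, millis: 727 },
  { index: "site.id_1_frt_1", n: 198, nscanned: 486, millis: 5 },
];

// Hypothetical helper: index entries examined per document returned.
// Lower is better; 1 would mean every scanned entry was a match.
function scannedPerResult(plan) {
  return plan.nscanned / plan.n;
}

const report = plans.map((p) => ({
  index: p.index,
  scannedPerResult: Number(scannedPerResult(p).toFixed(1)),
}));

console.log(report);
// site.id_1_lrt_-1 -> 309.1 entries scanned per result
// site.id_1_frt_1  -> 2.5 entries scanned per result
```

Both plans return the same 198 documents, so the gap comes entirely from how many index entries each plan has to walk.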

Same query WITHOUT additional boolean constraints (uses correct Index):

shard2:PRIMARY> db.visit.find({ "site.id": 128, "frt": { $gte: new Date(2012, 8, 24 ) } }).explain()
{
    "cursor" : "BtreeCursor site.id_1_frt_1",
    "isMultiKey" : false,
    "n" : 486,
    "nscannedObjects" : 486,
    "nscanned" : 486,
    "nscannedObjectsAllPlans" : 486,
    "nscannedAllPlans" : 486,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 1,
    "indexBounds" : {
        "site.id" : [
            [
                128,
                128
            ]
        ],
        "frt" : [
            [
                ISODate("2012-09-24T07:00:00Z"),
                ISODate("292278995-01--2147483647T07:12:56.808Z")
            ]
        ]
    },
    "server" : "ip-10-4-211-107:2200"
}

On Shard 1, Original Query uses expected index:

shard1:PRIMARY> db.visit.find({ "site.id": 253, "frt": { $gte: new Date(2012, 8, 24 ) }, "ue": false, "bot": false }).explain()
{
    "cursor" : "BtreeCursor site.id_1_frt_1",
    "isMultiKey" : false,
    "n" : 15615,
    "nscannedObjects" : 15950,
    "nscanned" : 15950,
    "nscannedObjectsAllPlans" : 16152,
    "nscannedAllPlans" : 16152,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 125,
    "nChunkSkips" : 0,
    "millis" : 237,
    "indexBounds" : {
        "site.id" : [
            [
                253,
                253
            ]
        ],
        "frt" : [
            [
                ISODate("2012-09-24T07:00:00Z"),
                ISODate("292278995-01--2147483647T07:12:56.808Z")
            ]
        ]
    },
    "server" : "ip-10-6-50-253:2100"
}

Query on Shard 2 for Site with no chunks here ( Uses correct index ):

shard2:PRIMARY> db.visit.find({ "site.id": 253, "frt": { $gte: new Date(2012, 8, 24 ) }, "ue": false, "bot": false }).explain()
{
    "cursor" : "BtreeCursor site.id_1_frt_1",
    "isMultiKey" : false,
    "n" : 0,
    "nscannedObjects" : 0,
    "nscanned" : 0,
    "nscannedObjectsAllPlans" : 0,
    "nscannedAllPlans" : 0,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 0,
    "indexBounds" : {
        "site.id" : [
            [
                253,
                253
            ]
        ],
        "frt" : [
            [
                ISODate("2012-09-24T07:00:00Z"),
                ISODate("292278995-01--2147483647T07:12:56.808Z")
            ]
        ]
    },
    "server" : "ip-10-4-211-107:2200"
}
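
A side note on the date bound used in all of the queries above: new Date(2012, 8, 24) uses a zero-based month and the shell's local time zone, which is presumably why the index bounds show 2012-09-24T07:00:00Z (local midnight at UTC-7). A minimal sketch in plain JavaScript, assuming you would rather pin the bound to UTC:

```javascript
// JavaScript months are zero-based, so 8 means September.
const localMidnight = new Date(2012, 8, 24); // 24 Sep 2012, 00:00 local time
console.log(localMidnight.getMonth()); // 8 (September)

// Pinning the bound to UTC gives the same instant on every machine.
const utcMidnight = new Date(Date.UTC(2012, 8, 24));
console.log(utcMidnight.toISOString()); // 2012-09-24T00:00:00.000Z
```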


1 Answer

Editorial Team · Answer 1 · 2026-06-11T19:57:52+00:00

A couple of points from the docs you linked might explain this behavior. First:

Testing of queries repeats after 1,000 operations and also after
certain manipulations of a collection occur (such as adding an index).

So, if the query volume is too low for the plans to be re-tested, the optimizer will stick with its first choice.
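
The "stick with the first choice until re-tested" behavior can be sketched as a toy model. This is not MongoDB's code: PlanCache, pickPlan and RETEST_AFTER are made-up names, and counting calls as "operations" is a simplifying assumption.

```javascript
// Toy model only; names and structure are hypothetical.
const RETEST_AFTER = 1000; // threshold quoted from the docs above

class PlanCache {
  constructor(pickPlan) {
    this.pickPlan = pickPlan; // function that tests the plans and returns a winner
    this.cached = null;
    this.opsSinceTest = 0;
    this.tests = 0;
  }

  planFor() {
    // Re-test only when nothing is cached yet or the threshold was reached.
    if (this.cached === null || this.opsSinceTest >= RETEST_AFTER) {
      this.cached = this.pickPlan();
      this.opsSinceTest = 0;
      this.tests += 1;
    }
    this.opsSinceTest += 1;
    return this.cached;
  }
}

// Pretend the first test picks the "bad" index and the second picks the good one.
const winners = ["site.id_1_lrt_-1", "site.id_1_frt_1"];
let round = 0;
const cache = new PlanCache(() => winners[Math.min(round++, winners.length - 1)]);

const used = [];
for (let i = 0; i < 1001; i++) used.push(cache.planFor());

console.log(used[0], used[999], used[1000], cache.tests);
// site.id_1_lrt_-1 site.id_1_lrt_-1 site.id_1_frt_1 2
```

In this model an unlucky first test is reused for the next 999 operations before the plans are compared again.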

Second:

To solve this, when testing new plans, MongoDB executes multiple query
plans in parallel. As soon as one finishes, it terminates the other
executions, and the system has learned which plan is good.

If the other index is already in memory (say, because another query is using it), or if something else slows down execution on the preferred index, the “bad” index can finish first and be chosen again. The same can happen when the two plans are very close in speed and occasionally swap places.
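
As a rough illustration of the race described above, here is a toy model in plain JavaScript. It is not MongoDB's implementation: racePlans and costPerStep are made up, the step counts are the nscanned values from the question, and the per-step costs are hypothetical.

```javascript
// Toy model only. Each candidate plan needs `steps` units of work,
// and each unit takes `costPerStep` (arbitrary, made-up time units).
function racePlans(plans) {
  // The plan with the smallest total time finishes first; the others are stopped.
  return plans.reduce((best, p) =>
    p.steps * p.costPerStep < best.steps * best.costPerStep ? p : best
  ).index;
}

// Both indexes equally fast per step: the shard-key index wins easily.
const warm = racePlans([
  { index: "site.id_1_frt_1", steps: 486, costPerStep: 1 },
  { index: "site.id_1_lrt_-1", steps: 61204, costPerStep: 1 },
]);

// Suppose the frt index is cold and each step is 200x slower (hypothetical).
const cold = racePlans([
  { index: "site.id_1_frt_1", steps: 486, costPerStep: 200 },
  { index: "site.id_1_lrt_-1", steps: 61204, costPerStep: 1 },
]);

console.log(warm, cold);
// site.id_1_frt_1 site.id_1_lrt_-1
```

Under this simplified model, and with these step counts, the preferred index would need to be more than about 125 times slower per step to lose the race.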

The optimizer was tweaked and improved in 2.2, so upgrading may be worth a look if you continue to have problems (and are on 2.0 or below). Or, as you have already done in your testing, if you know the best index to use, remove all doubt and specify it with hint().
