I’m stuck with this simple prefix query. Although Mongo docs state that you can get pretty good performance by using the prefix regex format (/^a/), the query is pretty slow when I try to sort the results:
940 millis
db.posts.find({hashtags: /^noticias/ }).limit(15).sort({rank : -1}).hint('hashtags_1_rank_-1').explain()
{
"cursor" : "BtreeCursor hashtags_1_rank_-1 multi",
"isMultiKey" : true,
"n" : 15,
"nscannedObjects" : 142691,
"nscanned" : 142692,
"nscannedObjectsAllPlans" : 142691,
"nscannedAllPlans" : 142692,
"scanAndOrder" : true,
"indexOnly" : false,
"nYields" : 1,
"nChunkSkips" : 0,
"millis" : 934,
"indexBounds" : {
"hashtags" : [
[
"noticias",
"noticiat"
],
[
/^noticias/,
/^noticias/
]
],
"rank" : [
[
{
"$maxElement" : 1
},
{
"$minElement" : 1
}
]
]
},
"server" : "XRTZ048.local:27017"
}
However, the unsorted version of the same query is super fast:
0 millis
db.posts.find({hashtags: /^noticias/ }).limit(15).hint('hashtags_1_rank_-1').explain()
{
"cursor" : "BtreeCursor hashtags_1_rank_-1 multi",
"isMultiKey" : true,
"n" : 15,
"nscannedObjects" : 15,
"nscanned" : 15,
"nscannedObjectsAllPlans" : 15,
"nscannedAllPlans" : 15,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
"hashtags" : [
[
"noticias",
"noticiat"
],
[
/^noticias/,
/^noticias/
]
],
"rank" : [
[
{
"$maxElement" : 1
},
{
"$minElement" : 1
}
]
]
},
"server" : "XRTZ048.local:27017"
}
The query is also fast if I remove the regex and sort:
0 millis
db.posts.find({hashtags: 'noticias' }).limit(15).sort({rank : -1}).hint('hashtags_1_rank_-1').explain()
{
"cursor" : "BtreeCursor hashtags_1_rank_-1",
"isMultiKey" : true,
"n" : 15,
"nscannedObjects" : 15,
"nscanned" : 15,
"nscannedObjectsAllPlans" : 15,
"nscannedAllPlans" : 15,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
"hashtags" : [
[
"noticias",
"noticias"
]
],
"rank" : [
[
{
"$maxElement" : 1
},
{
"$minElement" : 1
}
]
]
},
"server" : "XRTZ048.local:27017"
}
It seems like using both regex and sort makes Mongo scan lots of records. However, sort is scanning just 15 if I don’t use the regex. What’s wrong here?
The scanAndOrder: true in the explain output indicates that the query has to retrieve the matching documents and then sort them in memory before the results are returned. This is an expensive operation, and it is what is hurting the performance of your query.
The presence of scanAndOrder: true, together with the difference between nscanned (142692) and n (15) in the explain output, indicates that the query cannot use the index to provide the sort order. The cursor is still a BtreeCursor, so this is not a full collection scan, but every index entry in the prefix range is scanned and sorted before the limit is applied. You might be able to alleviate this by including the index keys in your sort criteria. From my testing, a query that sorts on the index keys does not require a scan and order, and returns n and nscanned equal to the number of records you are looking for.
This would also mean sorting on the hashtags key, which may or may not be useful to you, but should increase the performance of the query.
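To see why the prefix regex and the equality match behave differently under the same sort, here is a small toy model in plain JavaScript. It is only a sketch: the array stands in for the hashtags_1_rank_-1 index, the hashtag values and ranks are made up, and isSortedByRankDesc is a hypothetical helper, not a MongoDB API.

```javascript
// Toy model of the compound index { hashtags: 1, rank: -1 }:
// entries are ordered by hashtag ascending, then by rank descending.
// All values below are invented for illustration.
const indexEntries = [
  { hashtags: "noticias", rank: 9 },
  { hashtags: "noticias", rank: 4 },
  { hashtags: "noticiasdeportes", rank: 7 },
  { hashtags: "noticiasdeportes", rank: 2 },
  { hashtags: "noticiashoy", rank: 8 },
];

// Returns true when the entries are already in descending rank order,
// i.e. when a sort on { rank: -1 } could be read straight off the index.
function isSortedByRankDesc(entries) {
  return entries.every((e, i) => i === 0 || entries[i - 1].rank >= e.rank);
}

// Equality match: a single hashtag value, so the ranks are already ordered.
const exact = indexEntries.filter((e) => e.hashtags === "noticias");

// Prefix match: several hashtag values, so ranks are ordered only within
// each value and the whole range has to be collected and re-sorted.
const prefix = indexEntries.filter((e) => e.hashtags.startsWith("noticias"));

console.log(isSortedByRankDesc(exact));  // true
console.log(isSortedByRankDesc(prefix)); // false
```

With the equality match the first 15 index entries are already the top 15 by rank, which is consistent with nscanned being 15. With the prefix match the index order is (hashtags, rank), so a sort on rank alone cannot stop early. One way to write a query that sorts on the index keys, as an assumption rather than the exact query tested above, is db.posts.find({hashtags: /^noticias/ }).sort({hashtags : 1, rank : -1}).limit(15).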