I’ve got millions of items ordered by a precomputed score. Each item has many boolean attributes.
Let’s say there are about ten thousand possible attributes in total, with each item having a dozen of them.
I’d like to be able to query in real time (within a few milliseconds) for the top n items given almost any combination of attributes.
What solution would you recommend? I am looking for something extremely scalable.
—
– We are currently looking at MongoDB with an index on an array field. Do you see any limitations?
– Solr is a possible solution, but we do not need text search capabilities.
MongoDB can handle what you want if you store your objects like this:
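As a minimal sketch of one possible document shape: the collection name `items` and the field names `score` and `attrs` are assumptions made for illustration, not names taken from the question. The idea is to store only the attributes that are true for an item, as an array.

```javascript
// Hypothetical item document. The names `items`, `score` and `attrs`
// are illustrative assumptions.
const item = {
  _id: 1,
  score: 97.5,                      // the precomputed ranking score
  attrs: ["att1", "att2", "att3"]   // only the attributes that are true
};

// In the mongo shell the document would be stored with something like:
//   db.items.insertOne(item)       // `db.items.insert(item)` in older shells
console.log(JSON.stringify(item));
```

With about a dozen attributes per item, the array stays small even though there are thousands of possible attribute names.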
Then the following query will match all items that have both att1 and att2:
but this won’t match it:
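A sketch of what such queries could look like, using MongoDB's `$all` operator, which requires every listed value to be present in the array field. The field name `attrs` and the sample document are assumptions, and `matchesAll` is a hypothetical in-memory stand-in that only illustrates the semantics; it is not how MongoDB evaluates the query.

```javascript
// Query documents as they would be passed to db.items.find(...) in the shell.
const matching = { attrs: { $all: ["att1", "att2"] } };
const notMatching = { attrs: { $all: ["att1", "att4"] } };

// Hypothetical stand-in for $all: every requested value must be in the array.
function matchesAll(doc, query) {
  return query.attrs.$all.every(a => doc.attrs.includes(a));
}

// Assumed sample document.
const item = { _id: 1, score: 97.5, attrs: ["att1", "att2", "att3"] };

console.log(matchesAll(item, matching));     // true
console.log(matchesAll(item, notMatching));  // false: "att4" is missing
```

Extra attributes on the item (here `att3`) do not prevent a match; only a missing one does.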
The query returns a cursor. If you want the cursor to be sorted, just add the sort parameters to the query, like so:
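A sketch of the sorted top-n request, assuming the same hypothetical `items` collection with `score` and `attrs` fields. The shell form is shown in the comment; `topN` is an in-memory stand-in that mimics filter, sort descending by score, then limit, so the behaviour can be checked without a server.

```javascript
// Shell form (assumed names):
//   db.items.find({ attrs: { $all: ["att1", "att2"] } })
//           .sort({ score: -1 })
//           .limit(n)

// Assumed sample data.
const items = [
  { _id: 1, score: 10, attrs: ["att1", "att2"] },
  { _id: 2, score: 30, attrs: ["att1", "att2", "att3"] },
  { _id: 3, score: 20, attrs: ["att1"] }
];

// Hypothetical stand-in: keep items holding every wanted attribute,
// order by score (highest first), and return the first n.
function topN(docs, wanted, n) {
  return docs
    .filter(d => wanted.every(a => d.attrs.includes(a)))
    .sort((a, b) => b.score - a.score)
    .slice(0, n);
}

const top = topN(items, ["att1", "att2"], 1);
console.log(top.map(d => d._id));  // [ 2 ]
```

Adding `limit(n)` to the cursor keeps the server from returning more than the top n items.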
Have a look at Advanced Queries to see what’s possible.
Appropriate indexes can be set up as follows:
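A sketch of index definitions that might suit this layout, again assuming the hypothetical `items` collection with `attrs` and `score` fields. An index on an array field is a multikey index, with one entry per array element. Which indexes actually help depends on the data and the queries, so treat these as starting points to measure rather than a recommendation.

```javascript
// Index key specifications (assumed field names).
const attrsIndex = { attrs: 1 };   // multikey index over the attribute array
const scoreIndex = { score: -1 };  // descending index on the precomputed score

// In the mongo shell these would be created with:
//   db.items.createIndex(attrsIndex)   // `ensureIndex` in older shells
//   db.items.createIndex(scoreIndex)
console.log(JSON.stringify([attrsIndex, scoreIndex]));
```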
And you can get performance information using explain:
MongoDB’s explain output reports how many objects were scanned, how long the operation took,
and various other statistics.