Editorial Team
Asked: May 29, 2026 (2026-05-29T14:50:41+00:00)

I have an interesting problem. I have a working M/R version of this but

I have an interesting problem. I have a working M/R version of this, but it’s not really a viable solution in a small-scale environment: it’s too slow, and the query needs to be executed in real time.

I would like to iterate over each element in a collection and score it, sort by score in descending order, limit to the top 10, and return the results to the application.

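Here is the function I’d like applied to each document, sketched in JavaScript.

// someMap maps each tag to its weight and changes from query to query.
function score(document) {
    var total = 0;
    document.Tags.forEach(function (tag) {
        total += someMap[tag] || 0; // assumes tags missing from someMap score 0
    });
    return total;
}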

1 Answer

  1. Editorial Team
    Added an answer on May 29, 2026 at 2:50 pm (2026-05-29T14:50:44+00:00)

    Since your someMap is changing each time, I don’t see any alternative other than to score all the documents and return the highest-scoring ones. Whatever method you adopt for this type of operation, you’ll have to consider all the documents in the collection, which is going to be slow, and will become more and more costly as the collection you’re scanning grows.

    One issue with map reduce is that each mongod instance can only run one map reduce at a time. This is a limitation of the JavaScript engine, which is single-threaded. Multiple map reduces will be interleaved, but they cannot run concurrently with one another. This means that if you’re relying on map reduce for “real-time” uses, that is, if your web page has to run a map reduce to render, you’ll eventually hit a limit where page load times become unacceptably slow.

    You can work around this by querying all the documents into your application, and doing the scoring, sorting, and limiting in your application code. Queries in MongoDB can run concurrently, unlike map reduce, though of course this means that your application servers will have to do a lot of work.
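As a rough illustration of the application-side approach, here is a minimal JavaScript sketch. The sample documents, the lowercase `tags` field name, and the helper names `scoreDocument` and `topScored` are assumptions made for the example; in a real application the documents would come from a MongoDB query rather than a literal array.

```javascript
// Hypothetical weights and sample documents, used only for illustration.
const someMap = { a: 5, b: 2 };

const docs = [
  { _id: 1, tags: ["a", "b"] },
  { _id: 2, tags: ["b", "c"] },
  { _id: 3, tags: ["a", "a"] },
  { _id: 4, tags: [] },
];

// Sum the weights of a document's tags; tags missing from the map count as 0.
function scoreDocument(doc, weights) {
  return (doc.tags || []).reduce((total, tag) => {
    const known = Object.prototype.hasOwnProperty.call(weights, tag);
    return total + (known ? weights[tag] : 0);
  }, 0);
}

// Score every document, sort by score descending, and keep the first `limit`.
function topScored(documents, weights, limit) {
  return documents
    .map((doc) => ({ _id: doc._id, score: scoreDocument(doc, weights) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}

const top = topScored(docs, someMap, 10);
console.log(top);
```

Fetching only `_id` and the tags field keeps the transfer smaller, but every document still has to cross the network, which is the cost this approach trades for concurrency.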

    Finally, if you are willing to wait for MongoDB 2.2 to be released (which should be within a few months), you can use the new aggregation framework in place of map reduce. You’ll have to massage the someMap to generate the correct pipeline steps. Here’s an example of what this might look like if someMap were {"a": 5, "b": 2}:

    db.runCommand({aggregate: "foo",
        pipeline: [
            // one document per element of the tags array
            {$unwind: "$tags"},
            // one score field per entry in someMap, here {"a": 5, "b": 2}
            {$project: {
                tag1score: {$cond: [{$eq: ["$tags", "a"]}, 5, 0]},
                tag2score: {$cond: [{$eq: ["$tags", "b"]}, 2, 0]}
            }},
            {$project: {score: {$add: ["$tag1score", "$tag2score"]}}},
            {$group: {_id: "$_id", score: {$sum: "$score"}}},
            {$sort: {score: -1}},
            {$limit: 10}
        ]})
    

    This is a little complicated, and bears explaining:

    1. First, we “unwind” the tags array, so that the following steps in the pipeline process documents where “tags” is a scalar — the value of the tag from the array — and all the other document fields (notably _id) are duplicated for each unwound element.
    2. We use a projection operator to convert from tags to named score fields. The $cond/$eq expression for each roughly means (for the tag1score example) “if the value of the ‘tags’ field in the document is equal to ‘a’, then return 5 and assign that value to a new field tag1score, else return 0 and assign that”. This expression would be repeated for each tag/score combination in your someMap. At this point in the pipeline, each document will have N tagNscore fields, but at most one of them will have a non-zero value.
    3. Next we use another projection operator to create a score field whose value is the sum of the tagNscore fields in the document.
    4. Next we group the documents by their _id, and sum up the value of the score field from the previous step across all documents in each group.
    5. We sort by score, descending (i.e. greatest scores first).
    6. We limit to only the top 10 scores.
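To make the six steps above concrete, here is a plain JavaScript sketch that mimics each stage on two made-up documents. It does not talk to MongoDB; the sample documents and the variable names are assumptions for illustration only.

```javascript
// Hypothetical weights and input documents.
const tagWeights = { a: 5, b: 2 };
const sampleDocs = [
  { _id: 1, tags: ["a", "b"] },
  { _id: 2, tags: ["b", "c"] },
];

// Step 1 ($unwind): one output document per array element.
const unwound = sampleDocs.flatMap((doc) =>
  doc.tags.map((tag) => ({ _id: doc._id, tags: tag }))
);

// Step 2 (first $project): one tagNscore field per entry in the weight map.
const projected = unwound.map((doc) => ({
  _id: doc._id,
  tag1score: doc.tags === "a" ? tagWeights.a : 0,
  tag2score: doc.tags === "b" ? tagWeights.b : 0,
}));

// Step 3 (second $project): add the tagNscore fields together.
const scored = projected.map((doc) => ({
  _id: doc._id,
  score: doc.tag1score + doc.tag2score,
}));

// Step 4 ($group): sum the per-tag scores for each _id.
const totals = new Map();
for (const doc of scored) {
  totals.set(doc._id, (totals.get(doc._id) || 0) + doc.score);
}

// Steps 5 and 6 ($sort, $limit): highest scores first, at most 10 results.
const result = [...totals]
  .map(([_id, score]) => ({ _id, score }))
  .sort((x, y) => y.score - x.score)
  .slice(0, 10);

console.log(result);
```

One consequence worth checking against your data: a document whose tags array is empty produces no unwound documents, so it drops out of the result instead of appearing with a score of 0.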

    I’ll leave it as an exercise to the reader how to convert someMap into the correct set of projections in step 2, and the correct set of fields to add in step 3.
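For anyone who wants a starting point for that exercise, here is one possible sketch in JavaScript. The function name `buildScorePipeline` is made up for this example, and the sketch assumes someMap has at least one entry and that no tag begins with "$" (such a string would be read as a field path rather than a literal value).

```javascript
// Build the pipeline stages from a tag-to-weight map.
// The tagNscore field names follow the naming used in the example pipeline.
function buildScorePipeline(someMap, limit) {
  const tagScores = {};   // fields for the first $project
  const scoreFields = []; // operands for $add in the second $project

  Object.keys(someMap).forEach((tag, index) => {
    const field = "tag" + (index + 1) + "score";
    tagScores[field] = { $cond: [{ $eq: ["$tags", tag] }, someMap[tag], 0] };
    scoreFields.push("$" + field);
  });

  return [
    { $unwind: "$tags" },
    { $project: tagScores },
    { $project: { score: { $add: scoreFields } } },
    { $group: { _id: "$_id", score: { $sum: "$score" } } },
    { $sort: { score: -1 } },
    { $limit: limit },
  ];
}

const pipeline = buildScorePipeline({ a: 5, b: 2 }, 10);
console.log(JSON.stringify(pipeline, null, 2));
```

The resulting array can be passed as the pipeline value in the same db.runCommand({aggregate: "foo", pipeline: ...}) call shown earlier.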

    This is essentially the same set of steps that your application code or map reduce would go through, but it has two distinct advantages. Unlike map reduce, the aggregation framework is fully implemented in C++, so it is faster and more concurrent. And unlike querying all the documents into your application, the aggregation framework works with the data on the server side, saving network load. But like the other two approaches, this will still have to consider each document, and can only limit the result set once the score has been calculated for all of them.
