The Archive Base

Editorial Team
Asked: June 18, 2026

Is the $group function in the MongoDB 2.2 aggregation framework multi-threaded?


I am wondering whether the $group function in the aggregation framework of MongoDB 2.2 is multi-threaded.

To find out, I ran some small tests. The data set holds about 4 million emails, each in the following format:

shard1:PRIMARY> db.spams.findOne()
{
"IP" : "113.162.134.245",
"_id" : ObjectId("4ebe8c84466e8b1a56000028"),
"attach" : [ ],
"bot" : "Lethic",
"charset" : "iso-8859-1",
"city" : "",
"classA" : "113",
"classB" : "113.162",
"classC" : "113.162.134",
"content_type" : [ ],
"country" : "Vietnam",
"cte" : "7bit",
"date" : ISODate("2011-11-11T00:07:12Z"),
"day" : "2011-11-11",
"from_domain_a" : "domain157939.com",
"geo" : "VN",
"host" : "",
"lang" : "unknown",
"lat" : 16,
"long" : 106,
"sequenceID" : "user648",
"size" : 1060,
"smtp-mail-from_a" : "barriefrancisco@domain157939.com",
"smtp-rcpt-to_a" : "jaunn@domain555065.com",
"subject_ta" : "nxsy8",
"uri" : [ ],
"uri_domain" : [ ],
"x_p0f_detail" : "2000 SP4, XP SP1+",
"x_p0f_genre" : "Windows",
"x_p0f_signature" : "65535:105:1:48:M1402,N,N,S:."
}

I designed queries to find all emails within one day, one week, one month, half a year and one year, and then group the results by the "bot" field.

I use the aggregation framework and the Java driver to do this. The Java code is below:

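import java.net.UnknownHostException;
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

import com.mongodb.AggregationOutput;
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBAddress;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.Mongo;
import com.mongodb.MongoException;

public class RangeQuery {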
final private String mongoUrl = "172.16.10.61:30000";
final private String databaseName = "test";
final private String collecName = "spams";
private DBCollection collection = null;
private DB db = null;

    public void init(){
    Mongo mongo;
    try {
        mongo = new Mongo(new DBAddress(mongoUrl));
    } catch (MongoException | UnknownHostException e) {
        // Without a connection nothing below can work, so fail fast
        // instead of continuing with a null reference.
        throw new IllegalStateException("Cannot connect to " + mongoUrl, e);
    }
    db = mongo.getDB(databaseName);
    db.requestStart();
    collection = db.getCollection(collecName);
}

    public void queryRange_GroupBot(boolean printResult){
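    // HH is the 24-hour clock (hh is 1-12). The 'Z' is a quoted literal,
    // so the time zone must be set to UTC explicitly.
    DateFormat formatter = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
    formatter.setTimeZone(java.util.TimeZone.getTimeZone("UTC"));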
    String toDateStr [] = new String[5] ;
    toDateStr[0] = "2011-01-02T00:00:00Z";
    toDateStr[1] = "2011-01-07T00:00:00Z";
    toDateStr[2] = "2011-02-01T00:00:00Z";
    toDateStr[3] = "2011-06-01T00:00:00Z";
    toDateStr[4] = "2012-01-01T00:00:00Z";

    String toPrint [] = new String[5];
    toPrint[0] = "Within One day";
    toPrint[1] = "Within One week";
    toPrint[2] = "Within One month";
    toPrint[3] = "Within half year";
    toPrint[4] = "Within One year";

    try {
        System.out.println("\n------Query Time Range Group by Bot------");
        for(int i = 0;i < 5;i++){
            System.out.println("    ---" + toPrint[i] + "---");
            Date fromDate = formatter.parse("2011-01-01T00:00:00Z");
            Date toDate = formatter.parse(toDateStr[i]);

            DBObject groupFields = new BasicDBObject( "_id", "$bot");
            groupFields.put("sum", new BasicDBObject( "$sum", 1));
            DBObject group = new BasicDBObject("$group", groupFields);

            DBObject cond1 = new BasicDBObject();
            cond1.put("date", new BasicDBObject("$gte", fromDate));
            DBObject cond2 = new BasicDBObject();
            cond2.put("date", new BasicDBObject("$lte", toDate));
            DBObject match1 = new BasicDBObject("$match", cond1 );
            DBObject match2 = new BasicDBObject("$match", cond2 );

            for(int j = 0;j < 1;j++){
                Long runBefore = Calendar.getInstance().getTime().getTime();
                AggregationOutput aggOutput = collection.aggregate(match1, match2, group);
                Long runAfter = Calendar.getInstance().getTime().getTime();
                if(printResult){
                    System.out.println(aggOutput.getCommandResult());
                }
                System.out.println("[Query Range + Group by Bot]: " + (runAfter - runBefore) + " ms.");
            }
        }
    } catch (ParseException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}

    public static void main(String[] args){
    RangeQuery rangQuery = new RangeQuery();
    rangQuery.init();
    rangQuery.queryRange_GroupBot(true);
  }
  }

The results look like this:

  Within One day(2011-01-01 -> 2011-01-02)      54173 ms
  Within One week(2011-01-01 -> 2011-01-07)     54277 ms
  Within One month(2011-01-01 -> 2011-02-01)    54387 ms
  Within half year(2011-01-01 -> 2011-06-01)    53035 ms
  Within One year(2011-01-01 -> 2012-01-01)     54116 ms

What surprises me is that grouping over one year should normally be slower than grouping over one day, since it covers more records. (The records in the data set are uniformly distributed over time.)

If I just run db.spams.find({"date": {$gt: ISODate(xxx), $lt: ISODate(xxx)}}).count(), I can see that querying a year takes longer than querying a day.

But why does $group take nearly the same time when I enlarge the time range?

I know the aggregation framework is written in C++, and I am using MongoDB 2.2. Does the aggregation framework use multiple threads or some other method to improve performance?

1 Answer

  1. Editorial Team
    Added an answer on June 18, 2026 at 1:27 am

    According to this discussion: https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/xCSww5spXPc

    Each pipeline is currently single threaded, but you can run different pipelines in parallel. So if you have 100 connections each running an aggregation command, those will potentially run in parallel – but each command will run on 1 thread.
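As a rough client-side sketch of what "running different pipelines in parallel" could look like, the date range could be split into sub-ranges, one aggregation issued per sub-range on its own thread, and the per-bot counts summed afterwards. This is only an illustration under assumptions: `fakePipeline` is a hypothetical stand-in for a real `collection.aggregate(...)` call so that the example runs without a server, "OtherBot" is an invented name, and whether this actually shortens the wall-clock time depends on the server and the data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelPipelines {

    /** Runs each task on a pool thread and sums the per-bot counts they return. */
    public static Map<String, Long> runAll(List<Callable<Map<String, Long>>> tasks, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Long> merged = new TreeMap<>();
            // invokeAll blocks until every task has completed.
            for (Future<Map<String, Long>> f : pool.invokeAll(tasks)) {
                for (Map.Entry<String, Long> e : f.get().entrySet()) {
                    merged.merge(e.getKey(), e.getValue(), Long::sum);
                }
            }
            return merged;
        } finally {
            pool.shutdown();
        }
    }

    /** Hypothetical stand-in for one aggregate() call over one date sub-range. */
    public static Callable<Map<String, Long>> fakePipeline(String bot, long count) {
        return () -> {
            Map<String, Long> partial = new TreeMap<>();
            partial.put(bot, count);
            return partial;
        };
    }

    public static void main(String[] args) throws Exception {
        List<Callable<Map<String, Long>>> tasks = new ArrayList<>();
        tasks.add(fakePipeline("Lethic", 3));    // partial count from one sub-range
        tasks.add(fakePipeline("Lethic", 4));    // partial count from another sub-range
        tasks.add(fakePipeline("OtherBot", 5));  // invented bot name
        System.out.println(runAll(tasks, 3));
    }
}
```

In a real program each task would build its own `$match` and `$group` stages for its sub-range and convert the command result into the partial map. Summing is valid here only because the grouping uses a `$sum` counter; other accumulators would need a different merge step.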
