This question has two parts. The collection structure is:
_id: MongoID,
agent_id: string,
result: string,
created_on: ISO DATE,
…other fields…
Part one:
Desired output: one row for each agent_id and result combination, with a count. Below is a tuple representation, followed by the equivalent SQL in PostgreSQL.
( "1234", "Success", 4 ),
( "1234", "Failure", 4 ),
( "4567", "Success", 3 ),
( "7896", "Failure", 2 ),
.....
SELECT agent_id, result, count(*)
FROM table
WHERE created_on >= now()::date
GROUP BY agent_id, result;
I have come up with the mongo query below, but I think I have a conceptual or syntax error. The docs say to use $match early in the pipeline, but although the $match limits the query when I run it by itself, as soon as I add the $group I get way too many results. Also, I can't seem to understand how to group by more than one field. How can I edit the query below to get results like the SQL query above?
db.collection.aggregate(
  { $match :
    { created_on:
      { $gte: new Date('08-13-2012') //some arbitrary date
      }
    }, $group:
    { _id:"$agent_id" },
    $project:
    {_id:0, agent_id:1, result:1}
  })
Part two:
The first result set would be adequate, but not optimal. With PostgreSQL I can achieve a result set like:
( "1234", { "Success", "Failure" }, { 4, 4 } ),
( "4567", { "Success", "Failure" }, { 3, 0 } ),
( "7896", { "Success", "Failure" }, { 0, 2 } )
I can do this in PostgreSQL with the array data type and a set_to_array function (a custom function). The PostgreSQL-specific SQL is:
SELECT agent_id, set_to_array(result), set_to_array( count(*) )
FROM table
WHERE created_on >= now()::date
GROUP BY agent_id, result;
I believe the equivalent data structure in MongoDB would look like:
[
{ "1234", [ { "success": 4 }, { "failure": 4 } ] },
{ "4567", [ { "success": 3 }, { "failure": 0 } ] },
{ "7896", [ { "success": 0 }, { "failure": 2 } ] }
]
Is it possible to achieve these compressed results with MongoDB's aggregation framework?
Here you go:
Created some test data:
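A minimal sketch of such test data and a part-one pipeline, assuming the collection structure from the question. The sample documents, the `since` cutoff, and the helper names `seed` and `countByAgentAndResult` are hypothetical and only for illustration. The two points it tries to show are that each stage is its own object in the pipeline array (rather than one object with `$match`, `$group`, and `$project` keys), and that grouping by more than one field is done with a compound `_id`:

```javascript
// Hypothetical sample documents matching the collection structure in the question.
const sampleDocs = [
  { agent_id: "1234", result: "Success", created_on: new Date("2012-08-14T09:00:00Z") },
  { agent_id: "1234", result: "Failure", created_on: new Date("2012-08-14T10:00:00Z") },
  { agent_id: "4567", result: "Success", created_on: new Date("2012-08-15T11:00:00Z") },
  // Dated before the cutoff below, so $match filters it out.
  { agent_id: "7896", result: "Failure", created_on: new Date("2012-08-01T12:00:00Z") },
];

// Arbitrary cutoff (an assumption), written in ISO 8601 so parsing is unambiguous.
const since = new Date("2012-08-13T00:00:00Z");

const partOnePipeline = [
  // Stage 1: filter early so later stages only see recent documents.
  { $match: { created_on: { $gte: since } } },
  // Stage 2: a compound _id groups by the (agent_id, result) pair;
  // $sum: 1 adds one per document, which mirrors count(*).
  {
    $group: {
      _id: { agent_id: "$agent_id", result: "$result" },
      count: { $sum: 1 },
    },
  },
];

// `collection` is assumed to be a shell or driver collection handle.
function seed(collection) {
  return collection.insertMany(sampleDocs);
}

function countByAgentAndResult(collection) {
  return collection.aggregate(partOnePipeline);
}
```

In the shell this could be called as `seed(db.collection)` followed by `countByAgentAndResult(db.collection)`. Each result document has the shape `{ _id: { agent_id: ..., result: ... }, count: ... }`.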
Gives:
Adding part 2: it is pretty similar to part 1, but the counting is a bit more complicated; basically you count a document only if it matches the value you want to count:
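One way to express that conditional counting is sketched below, assuming the `result` values are exactly "Success" and "Failure" as in the question's tuples. The `since` cutoff and the helper name `countOutcomesPerAgent` are hypothetical. The idea is to group by `agent_id` alone and have each counter add 1 only when `$cond` sees the matching `result`, and 0 otherwise:

```javascript
// Arbitrary cutoff (an assumption), written in ISO 8601 so parsing is unambiguous.
const since = new Date("2012-08-13T00:00:00Z");

const partTwoPipeline = [
  // Stage 1: same early filter as in part 1.
  { $match: { created_on: { $gte: since } } },
  // Stage 2: one output document per agent, with one conditional counter per outcome.
  {
    $group: {
      _id: "$agent_id",
      // $cond: [test, valueIfTrue, valueIfFalse]
      success: { $sum: { $cond: [{ $eq: ["$result", "Success"] }, 1, 0] } },
      failure: { $sum: { $cond: [{ $eq: ["$result", "Failure"] }, 1, 0] } },
    },
  },
];

// `collection` is assumed to be a shell or driver collection handle.
function countOutcomesPerAgent(collection) {
  return collection.aggregate(partTwoPipeline);
}
```

Each result document has the shape `{ _id: <agent_id>, success: ..., failure: ... }`. An agent that has only one kind of outcome still gets a 0 for the other counter, while an agent with no documents after the `$match` does not appear at all.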
Gives: