A while ago I found a handy MySQL query to get the top X rows per group.
This is what I mean.
Given this table:
rid id1 id2 id3 value
1 1 1 1 10
2 1 1 2 11
3 1 1 3 9
4 1 2 1 20
5 1 2 2 18
6 1 2 3 23
7 1 3 1 30
8 1 3 2 34
9 1 3 3 31
10 1 3 4 27
11 1 3 5 32
12 1 4 1 41
13 1 4 2 40
14 1 4 3 43
15 1 5 1 53
16 1 5 2 51
17 1 5 3 50
18 2 1 1 11
19 2 1 2 9
20 2 1 3 12
I want to get this result:
rid id1 id2 id3 value
2 1 1 2 11
6 1 2 3 23
8 1 3 2 34
14 1 4 3 43
15 1 5 1 53
I can get this by running the following MySQL query:
SELECT * FROM
(SELECT * FROM idsandvalue
WHERE id1=1 AND
(SELECT COUNT(*) FROM idsandvalue AS helper
WHERE helper.id1 = idsandvalue.id1
AND helper.id2= idsandvalue.id2
AND helper.value > idsandvalue.value
) < 1
)a;
If I change < 1 to, let’s say, < 2, < 3 or < x, I get the top x rows per id2 where id1=1. With < 2, for example, I get two rows for each id2, with different id3 values, like this:
rid id1 id2 id3 value
1 1 1 1 10
2 1 1 2 11
4 1 2 1 20
6 1 2 3 23
8 1 3 2 34
11 1 3 5 32
12 1 4 1 41
14 1 4 3 43
15 1 5 1 53
16 1 5 2 51
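To make the effect of the threshold concrete, here is a minimal sketch that loads the sample rows from the table above into an in-memory SQLite database through Python's sqlite3 module and runs the same correlated subquery with the constant replaced by a parameter. SQLite is only a stand-in for MySQL here, the outer table gets an alias (t) to keep the correlation unambiguous, and the helper name top_per_group is made up for illustration.

```python
import sqlite3

# Sample rows from the table above: (rid, id1, id2, id3, value)
ROWS = [
    (1, 1, 1, 1, 10), (2, 1, 1, 2, 11), (3, 1, 1, 3, 9),
    (4, 1, 2, 1, 20), (5, 1, 2, 2, 18), (6, 1, 2, 3, 23),
    (7, 1, 3, 1, 30), (8, 1, 3, 2, 34), (9, 1, 3, 3, 31),
    (10, 1, 3, 4, 27), (11, 1, 3, 5, 32),
    (12, 1, 4, 1, 41), (13, 1, 4, 2, 40), (14, 1, 4, 3, 43),
    (15, 1, 5, 1, 53), (16, 1, 5, 2, 51), (17, 1, 5, 3, 50),
    (18, 2, 1, 1, 11), (19, 2, 1, 2, 9), (20, 2, 1, 3, 12),
]

# Same idea as the MySQL query: keep a row when fewer than x rows
# in its (id1, id2) group have a strictly larger value.
QUERY = """
SELECT t.rid, t.id1, t.id2, t.id3, t.value
FROM idsandvalue AS t
WHERE t.id1 = ? AND
  (SELECT COUNT(*) FROM idsandvalue AS helper
   WHERE helper.id1 = t.id1
     AND helper.id2 = t.id2
     AND helper.value > t.value
  ) < ?
ORDER BY t.rid
"""


def top_per_group(id1, x):
    """Return the top x rows per id2 for the given id1 (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE idsandvalue ("
        "rid INTEGER PRIMARY KEY, id1 INTEGER, id2 INTEGER, "
        "id3 INTEGER, value INTEGER)"
    )
    conn.executemany("INSERT INTO idsandvalue VALUES (?, ?, ?, ?, ?)", ROWS)
    rows = conn.execute(QUERY, (id1, x)).fetchall()
    conn.close()
    return rows


print([row[0] for row in top_per_group(1, 1)])  # rids for "< 1"
print([row[0] for row in top_per_group(1, 2)])  # rids for "< 2"
```

With x = 1 this should return the rids 2, 6, 8, 14, 15 and with x = 2 the ten rids listed above. If two rows in a group share the same value, the COUNT(*) comparison can return more than x rows for that group.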
Two questions:
A) The query is not really fast in MySQL. It takes a while (it runs against a table with 3207394 rows). Can I get the same result with a different query? I was not able to find one.
B) How can I translate this to LINQ? Because of the unusual WHERE clause, I have no clue how to translate it.
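For question A, one direction that is often suggested is to rank the rows inside each (id1, id2) group with a window function and keep the ranks up to x, so that each group is ordered once instead of being re-counted for every row. The sketch below assumes a database with window function support (MySQL 8.0 or later; SQLite 3.25 or later is used here through Python as a stand-in), and the helper name top_per_group_ranked is hypothetical. Whether this is actually faster on the 3207394-row table is something to measure, not a given.

```python
import sqlite3

# Sample rows from the question: (rid, id1, id2, id3, value)
ROWS = [
    (1, 1, 1, 1, 10), (2, 1, 1, 2, 11), (3, 1, 1, 3, 9),
    (4, 1, 2, 1, 20), (5, 1, 2, 2, 18), (6, 1, 2, 3, 23),
    (7, 1, 3, 1, 30), (8, 1, 3, 2, 34), (9, 1, 3, 3, 31),
    (10, 1, 3, 4, 27), (11, 1, 3, 5, 32),
    (12, 1, 4, 1, 41), (13, 1, 4, 2, 40), (14, 1, 4, 3, 43),
    (15, 1, 5, 1, 53), (16, 1, 5, 2, 51), (17, 1, 5, 3, 50),
    (18, 2, 1, 1, 11), (19, 2, 1, 2, 9), (20, 2, 1, 3, 12),
]

# Number the rows of each (id1, id2) group from the largest value down,
# then keep only the first x of each group. Ties on value are broken by rid.
RANKED = """
SELECT rid, id1, id2, id3, value FROM (
  SELECT t.*,
         ROW_NUMBER() OVER (PARTITION BY id1, id2
                            ORDER BY value DESC, rid) AS rn
  FROM idsandvalue AS t
  WHERE id1 = ?
) AS ranked
WHERE rn <= ?
ORDER BY rid
"""


def top_per_group_ranked(id1, x):
    """Top x rows per id2 via ROW_NUMBER (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE idsandvalue ("
        "rid INTEGER PRIMARY KEY, id1 INTEGER, id2 INTEGER, "
        "id3 INTEGER, value INTEGER)"
    )
    conn.executemany("INSERT INTO idsandvalue VALUES (?, ?, ?, ?, ?)", ROWS)
    rows = conn.execute(RANKED, (id1, x)).fetchall()
    conn.close()
    return rows


print([row[0] for row in top_per_group_ranked(1, 2)])
```

On the sample data this gives the same rows as the COUNT(*) version. The two differ when values tie: ROW_NUMBER returns exactly x rows per group, while the COUNT(*) comparison can return more. A composite index on (id1, id2, value) may help either form, which is also an assumption to check with EXPLAIN.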
(Later I added this extra question as well.)
In MySQL I use this query:
SELECT *,COUNT(*) AS Counter FROM idsandvalue GROUP BY id1,id2;
to get this result:
rid id1 id2 id3 value Counter
1 1 1 1 10 3
4 1 2 1 20 3
7 1 3 1 30 5
12 1 4 1 41 3
15 1 5 1 53 3
18 2 1 1 11 3
I’m also having difficulties translating this to LINQ.
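The counting part of that query can be isolated in a small sketch, again using SQLite through Python as a stand-in for MySQL. It selects only the grouping columns, the count, and MIN(rid) as an explicit "first row" marker, because the non-grouped columns returned by SELECT * with GROUP BY come from an unspecified row of each group (and MySQL rejects that form when ONLY_FULL_GROUP_BY is enabled). The names group_counts and first_rid are made up for illustration.

```python
import sqlite3

# Sample rows from the question: (rid, id1, id2, id3, value)
ROWS = [
    (1, 1, 1, 1, 10), (2, 1, 1, 2, 11), (3, 1, 1, 3, 9),
    (4, 1, 2, 1, 20), (5, 1, 2, 2, 18), (6, 1, 2, 3, 23),
    (7, 1, 3, 1, 30), (8, 1, 3, 2, 34), (9, 1, 3, 3, 31),
    (10, 1, 3, 4, 27), (11, 1, 3, 5, 32),
    (12, 1, 4, 1, 41), (13, 1, 4, 2, 40), (14, 1, 4, 3, 43),
    (15, 1, 5, 1, 53), (16, 1, 5, 2, 51), (17, 1, 5, 3, 50),
    (18, 2, 1, 1, 11), (19, 2, 1, 2, 9), (20, 2, 1, 3, 12),
]

# One output row per (id1, id2) pair: smallest rid in the group and its size.
GROUP_COUNTS = """
SELECT MIN(rid) AS first_rid, id1, id2, COUNT(*) AS Counter
FROM idsandvalue
GROUP BY id1, id2
ORDER BY id1, id2
"""


def group_counts():
    """Return (first_rid, id1, id2, Counter) per group (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE idsandvalue ("
        "rid INTEGER PRIMARY KEY, id1 INTEGER, id2 INTEGER, "
        "id3 INTEGER, value INTEGER)"
    )
    conn.executemany("INSERT INTO idsandvalue VALUES (?, ?, ?, ?, ?)", ROWS)
    rows = conn.execute(GROUP_COUNTS).fetchall()
    conn.close()
    return rows


for row in group_counts():
    print(row)
```

The rid and Counter columns of the result above correspond to first_rid and Counter here. Reducing the query to a group key plus aggregates also tends to make it easier to express in other query languages.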
(extra info was too big for comment)
Hi John (thanks for the quick response).
With this MySQL query
SELECT * FROM
(SELECT * FROM idsandvalue
WHERE id1=1 AND
(SELECT COUNT(*) FROM idsandvalue AS helper
WHERE helper.id1 = idsandvalue.id1
AND helper.id2= idsandvalue.id2
AND helper.value > idsandvalue.value
) < 1
)a
I try to get, for each (id1, id2) group, the row with its biggest value. That’s why in this case I get, for instance, the row with rid 2: 11 is the biggest of 10, 11 and 9 where id1=1 and id2=1. It’s also why I get the row with rid 8: where id1=1 and id2=3, the biggest value in the value column is 34. If I change the query to < 2, I get the top two; for id1=1 and id2=3 this gives the rows with rid 8 and 11. Is this better explained?
I recreated your table in SQL Server, ran your query against it, then converted the query via Linqer:
select new {
a.idsandvalue.rid,
a.idsandvalue.id1,
a.idsandvalue.id2,
a.idsandvalue.id3,
a.idsandvalue.value
}