I have a somewhat complicated question related to MySQL. This is the table I have:
keyword args title namespace_id ratio
en 1 A 23 0.5
en 1 B 89 0.6
en 0 C 89 0.4
foo 1 Foo 23 0.7
bar 1 Bar 89 0.3
I want a list of all rows without duplicates of (keywords,args). If there are duplicates, the picked row should be selected by an order of namespace_ids I provide. The rows left should be ordered by ratio.
Example result with namespace_id order 23,89,x,y:
keyword args title namespace_id ratio
foo 1 Foo 23 0.7
en 1 A 23 0.5
en 0 C 89 0.4
bar 1 Bar 89 0.3
Example result with namespace order 89,23,x,y:
keyword args title namespace_id ratio
foo 1 Foo 23 0.7
en 1 B 89 0.6
en 0 C 89 0.4
bar 1 Bar 89 0.3
Is there any way I can to this directly in MySQL? I had a look at GROUP BY, ORDER BY and noticed even the GROUP_CONCAT() function but I didn’t manage to put it all together properly. The statement that does somehow I want is:
SELECT keyword, args, title, namespace_id, ratio
FROM tbl
GROUP BY keyword, args
ORDER BY ratio DESC;
But now I don’t know how to bring in the namespace_id order.
I’ve found similar questions like this:
MySQL: "order by" inside of "group by"
The answer there close to what I want, however, my namespace_id order is varying and cannot be calculated by the MAX() function.
EDIT: The challenge here is to tell GROUP BY which row to pick. The normal ORDER BY apparently doesn’t, it only uses the output of GROUP BY.
Try using FIELD() and a subquery:
or a self-outer-join:
Edit: It might also be worth having a look at some (commercial) DBMS supporting OLAP operations (if you’re able to choose and if you are processing large sets of data). For Georg’s case I think the OVER and PARTITION BY keywords would have helped: http://publib.boulder.ibm.com/infocenter/rbhelp/v6r3/topic/com.ibm.redbrick.doc6.3/sqlrg/sqlrg36.htm#sii06377181