Consider this classical setup:
entry table:
id (int, PK)
title (varchar 255)
entry_category table:
entry_id (int)
category_id (int)
category table:
id (int, PK)
title (varchar 255)
This means an entry can be in one or more categories (the entry_category table serves as the MM/join table).
Now I need to query 6 unique categories, each with 1 unique entry from that category, chosen at random!
EDIT: To clarify: the purpose of this is to display 6 random categories with 1 random entry per category.
A correct result set would look like this:
category_id entry_id
10 200
20 300
30 400
40 500
50 600
60 700
This would be incorrect as there are duplicates in the entry_id column:
category_id entry_id
10 300
20 300
...
And this is incorrect as there are duplicates in the category_id column:
category_id entry_id
20 300
20 400
...
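To pin down the requirement, here is a minimal Python sketch of the check a result set has to pass: no value may repeat in either column. The function name is_valid_result is hypothetical and only serves as an illustration; the sample rows are the ones from the examples above.

```python
def is_valid_result(rows):
    """Return True if no category_id and no entry_id appears more than once."""
    category_ids = [category_id for category_id, _ in rows]
    entry_ids = [entry_id for _, entry_id in rows]
    return (len(set(category_ids)) == len(category_ids)
            and len(set(entry_ids)) == len(entry_ids))


# Rows are (category_id, entry_id) tuples taken from the examples above.
good = [(10, 200), (20, 300), (30, 400), (40, 500), (50, 600), (60, 700)]
dup_entry = [(10, 300), (20, 300)]      # entry 300 appears twice
dup_category = [(20, 300), (20, 400)]   # category 20 appears twice

print(is_valid_result(good), is_valid_result(dup_entry), is_valid_result(dup_category))
# prints: True False False
```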
How can I query this?
If I use this simple query with order by rand, the result contains duplicated rows:
select c.id, e.id
from category c
inner join entry_category ec on ec.category_id = c.id
inner join entry e on e.id = ec.entry_id
group by c.id
order by rand()
Performance is at the moment not the most important factor, but I would need a reliably working query for this, and the above is pretty much useless and does not do what I want at all.
EDIT: as an aside, the above query is no better when using select distinct ... and leaving out the group by. The result still contains duplicates, since distinct only makes sure that each combination of c.id and e.id is unique, not that each column is unique on its own.
EDIT: one solution I found, but probably slow as hell on larger datasets:
select t1.e_id, t2.c_id
from (select e.id as e_id from entry e order by rand()) t1
inner join (select ec.entry_id as e_id, ec.category_id as c_id from entry_category ec group by e_id order by rand()) t2 on t2.e_id = t1.e_id
group by t2.c_id
order by rand()
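For comparison, the same selection can be sketched in application code instead of SQL. This is a rough sketch, not a tested replacement for the query: it assumes the rows of entry_category have already been fetched as (category_id, entry_id) pairs, and the name pick_random_pairs and its parameters are hypothetical. It works greedily (shuffle the categories, then give each one a random entry that has not been used yet), so it can return fewer than 6 rows when categories share most of their entries.

```python
import random
from collections import defaultdict


def pick_random_pairs(pairs, count=6, rng=random):
    """Greedily pick up to `count` (category_id, entry_id) pairs such that
    no category and no entry is used twice."""
    # Group the entries by category.
    entries_by_category = defaultdict(list)
    for category_id, entry_id in pairs:
        entries_by_category[category_id].append(entry_id)

    # Visit the categories in random order.
    categories = list(entries_by_category)
    rng.shuffle(categories)

    used_entries = set()
    result = []
    for category_id in categories:
        # Only entries not already assigned to another category are candidates.
        candidates = [e for e in entries_by_category[category_id]
                      if e not in used_entries]
        if not candidates:
            continue  # every entry of this category is taken; skip it
        entry_id = rng.choice(candidates)
        used_entries.add(entry_id)
        result.append((category_id, entry_id))
        if len(result) == count:
            break
    return result


# Made-up sample data: entries 200 and 300 each belong to two categories.
sample = [(10, 200), (10, 300), (20, 300), (30, 400),
          (40, 500), (50, 600), (60, 700), (60, 200)]
for category_id, entry_id in pick_random_pairs(sample):
    print(category_id, entry_id)
```

With this sample, a run returns only 5 rows whenever category 10 is visited before category 20 and takes entry 300, which is the only entry category 20 has. Guaranteeing 6 rows whenever that is possible would need a proper matching step rather than a greedy pass.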
This solution is not a piece of code I'd be proud of, since it relies on the black magic of session variables in MySQL to keep the recursion stack. However, it works.
Also, it's not perfectly random and can in fact yield fewer than 6 values (if entity_id's duplicate across the categories too often). In this case, you can increase the value of 15 in the innermost query.
Create a unique index or a PRIMARY KEY on category_entity (category_id, entity_id) for this to work fast.