I always struggle with dealing with normalised data, and how I display it. Maybe its because I don’t fully understand the normalisation rules, like how to get it fully into Boyce-Codd. Performance is not really an issue at this stage, though maintainability of the schema is.
user
ID Name
1 Alice
2 Bob
3 Charlie
skills
ID Name
1 Karate
2 Marksmen
3 Cook
event
ID Name
1 Island
2 Volcano
user-m2m-skill
MemberID SkillID
1 1
1 2
2 1
2 3
3 1
user-m2m-event
MemberID EventID
1 1
1 2
2 1
3 2
How do I get this information out of the database? I’d like to display a table like this, where I’ve got the total count of each skill:
Skills at event
Event Karate Marksmen Cook
Island 2 1 1
Volcano 2 1 0
It is unlikely that the skills table will change very much. This means I could do a set of subqueries like this (obviously shortened and incorrect syntax)
SELECT event.name,
(SELECT COUNT(*) FROM ... WHERE skill = 'Karate'),
(SELECT COUNT(*) FROM ... WHERE skill = 'Marksmen') FROM event
And that’s what I’ve been doing, putting it into a view. But its a bit horrible, no? I have to edit the view every time I add a new skill.
The other way to to process it client side. So I just get back something like this:
Event Skill Count
Island Karate 2
Island Marksmen 1
Island Cook 1
Volcano Karate 2
Volcano Marksmen 1
And I loop through the results, reformatting it. But I hate that even more. Isn’t the database supposed to do data?
So: what am I doing wrong? Am I expecting too much? Which is the lesser evil?
(As b3ta would say, apologies for length of post and for bad markup. 🙁 )
This is a typical pivot query, because you are looking to convert data in rows into columns.
Followup question:
SELECTs as statements within the SELECT clause. IE:
…means that a separate query is run for every skill. If the queries were correllated – if they were tied together by an ID to make sure records sync’d with an event, then the count would be running for every event.
Conclusion to Followup
The approach is terribly inefficient. It is better to fetch the necessary values once, as I provided in my answer.
Addendum
Regarding updating the query – it is possible to minimize the maintenance by implementing the query with dynamic SQL.