Pretend I have this table, hypothetically representing users (the “voters”) rating various objects (e.g. places, websites, answers; the “votees”) from one to five stars (the “score”). The more generic version of this question asks what to do for scores from one to N instead.
mysql> SHOW CREATE TABLE scores;
CREATE TABLE `scores` (
    `voter_id` BIGINT NOT NULL DEFAULT '0',
    `votee_id` BIGINT NOT NULL DEFAULT '0',
    `score` TINYINT NOT NULL,
    PRIMARY KEY (`voter_id`,`votee_id`),
    KEY `votee_index` (`votee_id`)
);
I want to build a .csv file with one row per votee object, with columns representing the number of one-, two-, three-, four-, and five-star votes the object has been given, e.g.
output.csv:
votee_id, count_ones, count_twos, count_threes, count_fours, count_fives
1, 3, 7, 5, 3, 2
...
I know I can get the raw data to back this table using this query:
SELECT votee_id, score, COUNT(score)
FROM scores
GROUP BY votee_id, score;
This doesn’t give me the data in the CSV format I want, though:
- It won’t list 0’s for scores not seen for a given object.
- It doesn’t combine all five scores for an object into a single line/row (i.e., denormalize/histogram the data).
I want to create the output using nothing but MySQL.
After hacking around for a while I have the following query, which works, but it’s pretty inefficient, and I can’t find or make anything more elegant:
SELECT votee_id, GROUP_CONCAT(COALESCE(score_count, '0') ORDER BY AllUserCrossScore.score ASC)
FROM
    (SELECT votee_id, score FROM
        (SELECT 1 AS score UNION ALL
         SELECT 2 AS score UNION ALL
         SELECT 3 AS score UNION ALL
         SELECT 4 AS score UNION ALL
         SELECT 5 AS score) ScoresEnum
        JOIN
        (SELECT DISTINCT votee_id FROM scores) DistinctIds
    ) AllUserCrossScore
    LEFT JOIN
    (SELECT votee_id, score, COUNT(score) AS score_count
     FROM scores
     GROUP BY votee_id, score
    ) ScoreCounts
    USING (votee_id, score)
GROUP BY votee_id;
In particular, this feels especially hacky because I use GROUP_CONCAT with ',' to join the scores, and then fiddle elsewhere to have MySQL join all the other fields with ',' as well, which produces the correct formatting for the .csv (fiddling not shown).
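For reference, the comma-joining can be left to MySQL itself: `SELECT ... INTO OUTFILE` accepts `FIELDS TERMINATED BY` and `LINES TERMINATED BY` clauses. Below is a minimal sketch using the raw-count query from earlier. The path `/tmp/scores_raw.csv` is a made-up placeholder, and the sketch assumes the account has the `FILE` privilege and that the server's `secure_file_priv` setting allows writing to that location.

```sql
-- Sketch only: the output path is a hypothetical placeholder.
-- The file is written on the server host and must not already exist.
SELECT votee_id, score, COUNT(score)
FROM scores
GROUP BY votee_id, score
INTO OUTFILE '/tmp/scores_raw.csv'
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';
```

With this, every selected column is separated by ',' automatically, so GROUP_CONCAT is not needed purely for formatting.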
How can I do better in a case like this?
I think you are looking for something like this:
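A minimal sketch of conditional aggregation, assuming the `scores` table from the question and reusing the column names from the sample `output.csv` header. It relies on MySQL evaluating a comparison such as `score = 1` to 1 or 0, which is MySQL-specific; a `CASE WHEN score = 1 THEN 1 ELSE 0 END` inside `SUM()` would be the portable spelling.

```sql
-- Sketch: one row per votee, one column per star value.
-- (score = k) evaluates to 1 or 0 in MySQL, so SUM() counts the
-- matching rows and yields 0 when no vote has that score.
SELECT votee_id,
       SUM(score = 1) AS count_ones,
       SUM(score = 2) AS count_twos,
       SUM(score = 3) AS count_threes,
       SUM(score = 4) AS count_fours,
       SUM(score = 5) AS count_fives
FROM scores
GROUP BY votee_id;
```

This reads `scores` once and needs no cross join or GROUP_CONCAT. For the one-to-N case it needs one `SUM()` column per score value. As with the query in the question, votees that have no rows in `scores` at all do not appear in the result.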