I’ve got a little time-tracking web app (implemented in Rails 3.2.8 & MySQL). The app has several users who add their time to specific tasks, on a given date. The system is set up so a user can only have 1 time entry (i.e. row) per task per date. I.e. if you add time twice on the same task and date, it’ll add time to the existing row, rather than create a new one. The user_id/task_id/date uniqueness is enforced by a UNIQUE index.
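For reference, the setup described above could look roughly like the following when expressed directly in SQL. This is only a sketch: the table name `timetable` matches the queries later in this question, but the column types, the `id` column, and the index name `idx_user_task_date` are assumptions, and the app itself may well do the "add to existing row" step in Ruby rather than in SQL.

```sql
-- Hypothetical schema: column types, the id column, and the index name are assumptions.
CREATE TABLE `timetable` (
  `id`      INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `time`    INT NOT NULL,
  `user_id` INT NOT NULL,
  `task_id` INT NOT NULL,
  `date`    DATE NOT NULL,
  PRIMARY KEY (`id`),
  -- At most one row per user, task, and date.
  UNIQUE KEY `idx_user_task_date` (`user_id`, `task_id`, `date`)
) ENGINE=InnoDB;

-- Adding time for a user/task/date that already has a row: the insert hits
-- the UNIQUE index and adds to the existing row instead of creating a new one.
INSERT INTO `timetable` (`time`, `user_id`, `task_id`, `date`)
VALUES (10, 1, 1, '2012-10-29')
ON DUPLICATE KEY UPDATE `time` = `time` + VALUES(`time`);
```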
Now I’m looking to merge 2 tasks. In the simplest terms, merging task ID 2 into task ID 1 would take this
time | user_id | task_id | date
-----+---------+---------+-----------
  10 |       1 |       1 | 2012-10-29
  15 |       2 |       1 | 2012-10-29
  10 |       1 |       2 | 2012-10-29
   5 |       3 |       2 | 2012-10-29
and change it into this
time | user_id | task_id | date
-----+---------+---------+-----------
  20 |       1 |       1 | 2012-10-29   <-- time values merged (summed)
  15 |       2 |       1 | 2012-10-29   <-- no change
   5 |       3 |       1 | 2012-10-29   <-- task_id changed (no merging necessary)
I.e. merge by summing the time values, where the given user_id/date/task combo would conflict.
I figure I can use the unique constraint to do an ON DUPLICATE KEY UPDATE ... if I do an insert for every task_id=2 entry. But that seems pretty inelegant.
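The per-entry insert idea does not necessarily need one statement per row. A minimal, untested sketch of a set-based form, assuming the `timetable` table and the user_id/task_id/date UNIQUE index described above: the alias `src` is made up for illustration, and the target column is qualified with the table name to avoid an ambiguous `time` reference when selecting from and inserting into the same table.

```sql
-- Re-insert every task 2 row as a task 1 row. Where a task 1 row already
-- exists for that user and date, add the task 2 time to it instead.
INSERT INTO `timetable` (`time`, `user_id`, `task_id`, `date`)
SELECT `src`.`time`, `src`.`user_id`, 1, `src`.`date`
FROM `timetable` AS `src`
WHERE `src`.`task_id` = 2
ON DUPLICATE KEY UPDATE
  `timetable`.`time` = `timetable`.`time` + VALUES(`time`);

-- The task 2 rows have been copied or merged, so they can go.
DELETE FROM `timetable` WHERE `task_id` = 2;
```

Since the UNIQUE index guarantees at most one task 2 row per user and date, no GROUP BY should be needed here, and only rows from the source task are touched.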
I’ve also tried to figure a way to first update all the rows in task 1 with the summed-up times, but I can’t quite figure that one out.
Any ideas?
Update: Going off of Olaf’s answer below, I came up with this, which seems to be working:
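INSERT INTO `timetable`
  (`time`, `user_id`, `task_id`, `date`)
(
  -- Sum each user's time per date across both tasks, labelled as task 1
  SELECT
    SUM(`time`) AS `time`,
    `user_id`,
    1 AS `task_id`,
    `date`
  FROM `timetable` AS `t1`
  WHERE `task_id` IN (1,2)
  GROUP BY `user_id`, `date`
)
-- Existing task 1 rows get the summed time instead of a duplicate row
ON DUPLICATE KEY UPDATE `time`=VALUES(`time`);

-- Task 2 rows are now folded into task 1
DELETE FROM `timetable` WHERE `task_id`=2;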
I’ll leave the question open in case someone has a better solution (or if there are any gotchas in my solution that I should know about).
Update 2: Don’t know why I didn’t realize this earlier, but my solution may do a lot of redundant INSERTs, because it also finds all the entries that only exist in the target task and don’t need merging. In the above example data, the 2nd row will be found, re-inserted, trigger an on-duplicate-key, and set the time to the same as it already is. So if the target task has, say, 10 rows, and the source task 0 rows, it’ll still do 10 (completely pointless) INSERTs.
This can be avoided by wrapping the inner SELECT in yet another SELECT and using COUNT(*) to find only those rows that need merging. Of course, this will necessitate additional queries to update the task_id of those rows that don’t need merging (which requires a join to figure out, as far as I can tell).
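A rough sketch of that idea follows. It swaps the extra wrapping SELECT for a HAVING COUNT(*) > 1 clause, which filters the grouped rows the same way. The aliases `src` and `dst` are made up for illustration, and the transaction assumes a transactional storage engine such as InnoDB.

```sql
START TRANSACTION;

-- 1. Write the summed time only for user/date pairs present in both tasks.
INSERT INTO `timetable` (`time`, `user_id`, `task_id`, `date`)
SELECT SUM(`time`), `user_id`, 1, `date`
FROM `timetable` AS `t1`
WHERE `task_id` IN (1, 2)
GROUP BY `user_id`, `date`
HAVING COUNT(*) > 1
ON DUPLICATE KEY UPDATE `time` = VALUES(`time`);

-- 2. Remove the task 2 rows that were just merged, i.e. those that have
--    a task 1 counterpart for the same user and date.
DELETE `src`
FROM `timetable` AS `src`
JOIN `timetable` AS `dst`
  ON  `dst`.`user_id` = `src`.`user_id`
  AND `dst`.`date`    = `src`.`date`
  AND `dst`.`task_id` = 1
WHERE `src`.`task_id` = 2;

-- 3. Move the remaining task 2 rows over; none of them can conflict now.
UPDATE `timetable` SET `task_id` = 1 WHERE `task_id` = 2;

COMMIT;
```

With the example data, step 1 touches only user 1's row (20), step 2 deletes user 1's task 2 row, and step 3 reassigns user 3's row, leaving user 2's row alone.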
Summing on user_id and date would be:

Updating the time table (kudos to @Flambino):

And finally remove all rows with task_id not 1:

When updates are restricted to tasks 1 and 2, apply a WHERE clause like in @Flambino’s update.
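The three steps described in this answer could be sketched roughly as follows. This is a reconstruction from the description, reusing the `timetable` table and column names from the question, and is not necessarily the exact set of queries the answer had in mind.

```sql
-- Step 1: sum the time per user and date (across all tasks).
SELECT `user_id`, `date`, SUM(`time`) AS `time`
FROM `timetable`
GROUP BY `user_id`, `date`;

-- Step 2: write those sums into task 1 rows, creating or updating as needed.
INSERT INTO `timetable` (`time`, `user_id`, `task_id`, `date`)
SELECT SUM(`time`), `user_id`, 1, `date`
FROM `timetable` AS `t1`
GROUP BY `user_id`, `date`
ON DUPLICATE KEY UPDATE `time` = VALUES(`time`);

-- Step 3: remove every row that does not belong to task 1.
DELETE FROM `timetable` WHERE `task_id` <> 1;

-- To restrict the merge to tasks 1 and 2, add WHERE `task_id` IN (1, 2)
-- to both SELECTs and delete with WHERE `task_id` = 2 instead.
```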