I have a model, Link, that has many hits. What I'm trying to do is aggregate data on hits per hour over time from a database table, hits, with columns year, month, day, and hour.
Everything is working fine, but when I run the code below, I end up with almost 1,000 queries on the database, which seems quite excessive.
# before iteration
@hits = @link.hits.group('year, month, day, hour')
  .order('year ASC, month ASC, day ASC, hour ASC')
  .select('year, month, day, hour, count(*) as hits')
# inside iteration
# Line breaks are included here for easy reading
hits_per_hour =
  @hits.where(
    :year => t.year,
    :month => t.month,
    :day => t.day,
    :hour => t.hour
  ).map(&:hits).first || 0
Here t is a Time object that steps forward one hour at a time, starting from the first hit the link received.
I would have thought that Rails would store the query result instead of re-querying every time, but I haven't been able to find anything about caching query results. Am I missing something completely, or is this really the easiest way?
Here is a sample of what the queries look like (this is what I see in my logs in blocks of a thousand):
SELECT year, month, day, hour, count(*) as hits
FROM `hits`
WHERE
`hits`.`link_id` = 1 AND
`hits`.`year` = 2012 AND
`hits`.`month` = 11 AND
`hits`.`day` = 2 AND
`hits`.`hour` = 14
GROUP BY year, month, day, hour
ORDER BY year ASC, month ASC, day ASC, hour ASC
OK. So @hits is going to be an ActiveRecord::Relation object, essentially a SQL query. I believe that because each iteration calls .where on it with different parameters, and thus alters the query, Rails has decided it has to rerun the query for each hour.
The simplest fix is probably to 'collapse' the Relation into an array before iterating, and then use pure Ruby to select the results you want each time:
and then:
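As a rough sketch of both steps, assuming the grouped rows respond to year, month, day, hour, and hits as in the question: in the Rails app the array would come from something like @hits.to_a, which runs the query once. The HitRow struct, the sample counts, and the names hits_by_hour and series are hypothetical stand-ins so the example runs on its own.

```ruby
# Stand-in for the rows returned by the grouped query (assumption: in the
# Rails app this would be something like `rows = @hits.to_a`).
HitRow = Struct.new(:year, :month, :day, :hour, :hits)

rows = [
  HitRow.new(2012, 11, 2, 14, 3),
  HitRow.new(2012, 11, 2, 16, 1)
]

# Index the rows by their time bucket so each lookup is a plain hash access
# instead of a new database query.
hits_by_hour = rows.each_with_object({}) do |row, index|
  index[[row.year, row.month, row.day, row.hour]] = row.hits.to_i
end

# Walk from the first hit to the last in one-hour steps (3600 seconds).
first = Time.utc(2012, 11, 2, 14)
last  = Time.utc(2012, 11, 2, 16)

series = []
t = first
while t <= last
  # Hours with no recorded hits fall back to 0.
  series << hits_by_hour.fetch([t.year, t.month, t.day, t.hour], 0)
  t += 3600
end
```

Building the hash once keeps each per-hour lookup cheap. Scanning the array with find on every iteration would also avoid the extra queries, but it would rescan the whole array for each hour.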
I don’t think this is actually likely to be the best solution, though. Depending on exactly what data you need and what you’re doing with it, you should be able to get everything done in the database.