I’m using DataMapper (the Ruby gem) as an ORM for a MySQL database (dm-core 1.1.0, do-mysql-adapter 1.1.0, do_mysql 0.10.6).
I’m writing an application that has two tables: a log of disk usage over time, and a “current usage” table containing foreign keys with the “latest” disk usage for easy reference. The DataMapper classes are Quota and LatestQuota, with a simple schema:
class Quota
  include DataMapper::Resource
  property :unique_id, Serial, :key => true
  property :percentage, Integer
  ... (more properties)
end

class LatestQuota
  include DataMapper::Resource
  belongs_to :quota, :key => true
end
In my code I want to find all the entries in the LatestQuota table that correspond to a quota with a percentage of 95 or higher. I’m using the following DataMapper query:
quotas = LatestQuota.all(:quota => {:percentage.gte => threshold})
...later...
quotas.select{|q| some_boolean_function?(q)}
Here some_boolean_function? filters the results in a way that DataMapper can’t know about, which is why I need to call Ruby’s select().
But it ends up running the following SQL queries (as reported in DM’s debug output):
SELECT `unique_id` FROM `quota` WHERE `percentage` >= 95
then later:
SELECT `quota_unique_id` FROM `latest_quota` WHERE `quota_unique_id` IN (52, 78, 82, 232, 313, 320…. all the unique id's from the above query...)
This is a ridiculously suboptimal query, so I think I’m doing something wrong. The quota table has millions of records in it (historical data) versus the 15k or so records in latest_quota, and selecting all quota records first and then selecting latest_quota records out of the results is exactly the wrong way to do it.
What I would like it to do is something to the effect of:
SELECT q.* from quota q INNER JOIN latest_quota lq ON lq.quota_unique_id=q.unique_id WHERE q.percentage >= 95;
Which takes .01 seconds with my current data, instead of the 5 minutes or so it takes DataMapper to do its query. Any way to coerce it to do what I want? Do I have my relations wrong? Am I querying it wrong?
For some reason, nested-Hash-style queries always perform sub-selects. To force an INNER JOIN, use a String query path: LatestQuota.all('quota.percentage.gte' => threshold)
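To make the cost difference concrete, here is a rough plain-Ruby sketch of the two query plans. It does not use DataMapper or a database at all; QuotaRow, LatestRow, the sample data, and both helper methods are made up for illustration, and the row counts are far smaller than the real tables.

```ruby
# Plain-Ruby model of the two query plans; no DataMapper or database involved.
# All names and data below are hypothetical and only illustrate row counts.
QuotaRow  = Struct.new(:unique_id, :percentage)
LatestRow = Struct.new(:quota_unique_id)

# 10,000 "historical" rows; percentage cycles through 0..99.
QUOTAS = (1..10_000).map { |id| QuotaRow.new(id, id % 100) }
# Only the last 100 quota rows are referenced by the "latest" table.
LATEST = (9_901..10_000).map { |id| LatestRow.new(id) }

# Plan 1 (sub-select): gather every matching quota id from the big table,
# then filter the small table against that id list.
def subselect_plan(threshold)
  ids = QUOTAS.select { |q| q.percentage >= threshold }.map(&:unique_id)
  matches = LATEST.select { |lq| ids.include?(lq.quota_unique_id) }
  [matches.map(&:quota_unique_id), ids.size]
end

# Plan 2 (join): start from the small table and look each quota up by key.
def join_plan(threshold)
  # The hash stands in for the primary-key index on quota.unique_id.
  by_id = QUOTAS.each_with_object({}) { |q, h| h[q.unique_id] = q }
  matches = LATEST.select { |lq| by_id[lq.quota_unique_id].percentage >= threshold }
  [matches.map(&:quota_unique_id), LATEST.size]
end

sub_ids, sub_intermediate = subselect_plan(95)
join_ids, join_intermediate = join_plan(95)

puts "same result: #{sub_ids == join_ids}"
puts "intermediate rows: sub-select=#{sub_intermediate}, join=#{join_intermediate}"
```

Both plans return the same ids, but the sub-select plan first materializes every quota row over the threshold (500 here), while the join plan only ever considers the 100 rows in the small table. With millions of historical rows that gap is presumably what shows up as minutes versus hundredths of a second.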