I’ve got a large web app which writes many millions of rows into partitioned tables in PostgreSQL each day (meaning there’s a new table for each day’s data).
We’re using PostgreSQL’s table inheritance and partitioning to speed things along:
Because there are years' worth of data in our DB, we can't effectively use insert triggers to route rows to the correct table (the trigger functions are getting very, very long).
Long story short, we need ActiveRecord to know which table to insert and update data in, BUT without changing the table that is used for selects and other DB tasks.
Obviously it’s simple to define the table name for a model, but is it possible to override the table name for just particular actions?
Here’s a little more detail:
Database:
- Table: dashboard.impressions (id, host, data, created_on, etc)
- Table: data.impressions_20120801 (inherited from dashboard.impressions, with a constraint that created_on equals the table's date)
Impression.create :host=>"localhost", :data=>"{...}", :created_on=>DateTime.now should write to the data.impressions_20120801 table, whereas Impression.where(:host=>"localhost") should search the dashboard.impressions table, since that contains all the data.
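To make the naming scheme concrete, here is a minimal Ruby sketch of how the partition name could be derived from a row's created_on value. The method name impression_partition_for is hypothetical (it is not part of Rails or our app), and it only builds the name; it does not by itself make ActiveRecord write to that table.

```ruby
require 'date'

# Hypothetical helper: builds the daily partition name for a timestamp,
# following the data.impressions_YYYYMMDD scheme described above.
# Accepts anything that responds to strftime (Date, DateTime, Time).
def impression_partition_for(time)
  "data.impressions_#{time.strftime('%Y%m%d')}"
end

puts impression_partition_for(Date.new(2012, 8, 1))
# prints "data.impressions_20120801"
```

Whatever ends up doing the insert (ActiveRecord or something else) would need to target the name this returns, while selects keep going to dashboard.impressions.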
Edit: I’m running PostgreSQL 9.1 and Rails 3.2.6
I don’t do Rails so I can’t help with the ActiveRecord side, but I can offer a pure Pg fallback in case you can’t get ActiveRecord to do what you want. It’ll cost you a little bit of insert performance, so it’d be much better to teach ActiveRecord to do the inserts to the right place.
Personally I’d just do the INSERTs directly via the pg gem and bypass ActiveRecord completely. If you can’t do that, or ActiveRecord does caching that means you shouldn’t, try this alternate partitioning trigger implementation.
Instead of explicitly listing every partition in your trigger function, consider EXECUTE ... USING for insertion, and generate the partition name using your naming scheme. Something like the untested: