I’m implementing a statistical algorithm that needs a large sample dataset for proper testing. "Large" here means 50,000 rows in a single MySQL table.
I would like to use traditional RSpec methods to test, but creating the sample set and loading it into the DB leads to two problems.
- Creating the rows with Active Record create is extremely slow and resource-intensive. I haven’t explored the options for skipping validation on create, since the models are pretty basic and I assume it won’t make a huge speed difference.
- Loading the data with a hacky mysqlimport call doesn’t clean up properly: data is left in the database after the test, despite an explicit call to DatabaseCleaner in an :after block.
Creating the object graph in-memory is a possibility, but not being a mockist I’m a little afraid to override AR functionality.
Any ideas, best practices?
Thanks!
Justin
It’s only a partial answer, but:
How you load the data actually does make a big speed difference. PostgreSQL has a good guide on this:
http://www.postgresql.org/docs/9.0/interactive/populate.html
Most of it applies to MySQL directly.
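One common way to cut per-row overhead in MySQL is to batch many rows into a single multi-row INSERT instead of issuing one statement per record. Below is a minimal sketch of that idea in plain Ruby. The helper names, the `samples` table, and its columns are all made up for illustration, and the quoting is deliberately simplistic; in a real spec you would more likely use the connection's own quoting.

```ruby
# Hypothetical helpers (not from the thread): build multi-row INSERT
# statements so 50,000 rows need a handful of round trips, not 50,000.

# Simplistic quoting for illustration only: handles nil, numbers, and strings.
def quote_value(value)
  case value
  when nil then "NULL"
  when Integer, Float then value.to_s
  else "'" + value.to_s.gsub("\\") { "\\\\" }.gsub("'") { "''" } + "'"
  end
end

# Returns an array of SQL strings, one per batch of rows.
def bulk_insert_statements(table, columns, rows, batch_size: 1_000)
  column_list = columns.map { |c| "`#{c}`" }.join(", ")
  rows.each_slice(batch_size).map do |batch|
    tuples = batch.map do |row|
      "(" + row.map { |v| quote_value(v) }.join(", ") + ")"
    end
    "INSERT INTO `#{table}` (#{column_list}) VALUES #{tuples.join(', ')}"
  end
end

# Assumed sample data: 50,000 rows of [id, value].
rows = Array.new(50_000) { |i| [i, i * 0.5] }
statements = bulk_insert_statements("samples", %w[id value], rows, batch_size: 5_000)
puts statements.size
```

In a Rails spec each string could then be passed to ActiveRecord::Base.connection.execute, which bypasses model validations and callbacks entirely. That is fine for pre-generated sample data but worth keeping in mind.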
If you want to flush your tables of all their data, try truncate:
http://dev.mysql.com/doc/refman/5.5/en/truncate-table.html
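As a rough sketch of how that could be wired into a test teardown: the helper below issues TRUNCATE TABLE for each named table through any object that responds to execute. The helper name, the `samples` table, and the recording stand-in are assumptions for illustration; in a Rails app the connection would typically be ActiveRecord::Base.connection.

```ruby
# Hypothetical helper (not from the thread): empty the given tables.
# `connection` is anything that responds to #execute(sql).
def truncate_tables(connection, table_names)
  table_names.each do |name|
    connection.execute("TRUNCATE TABLE `#{name}`")
  end
end

# Stand-in connection that only records SQL, so this sketch runs
# without a database.
class RecordingConnection
  attr_reader :statements

  def initialize
    @statements = []
  end

  def execute(sql)
    @statements << sql
  end
end

recorder = RecordingConnection.new
truncate_tables(recorder, %w[samples])
puts recorder.statements
```

Note that TRUNCATE TABLE causes an implicit commit in MySQL, so it cannot be rolled back as part of a test transaction. Rows loaded by mysqlimport arrive over a separate connection, so a rollback-based cleanup would likely not remove them either; an explicit truncate in the :after block, or DatabaseCleaner's :truncation strategy, may be the more reliable route.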