I have a script in a Controller that I launch from the Ruby on Rails console (IRB).
This script constantly Creates-Reads-Updates (no deletions) a MySQL database, taking data from the Interwebs.
The problem is that it takes very long until all the required data is put into the database. So I would like to know if it is a good idea to simply open several Rails consoles and launch that script several times in parallel.
-> Several Ruby instances would work on one database.
Is that a problem? Could this create any write conflicts (Create/Update) in the database? If so, is there anything I would have to do in order to avoid such conflicts?
If it’s not a problem: How many Ruby instances could I “unleash” onto the database, in parallel?
You can definitely run multiple consoles simultaneously against a single database. The limit is the number of open connections the database allows (the max_connections setting). In MySQL 5.1, the default was 100, and in 5.5 it’s 151. You’re unlikely to run out of connections before something else becomes the bottleneck.
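If you want to see how close you are to that limit, here is a minimal sketch. The helper name connection_headroom is made up for illustration, and conn is assumed to be an ActiveRecord MySQL connection, such as ActiveRecord::Base.connection in a Rails console.

```ruby
# Sketch only: `connection_headroom` is a hypothetical helper.
# `conn` is assumed to be an ActiveRecord MySQL connection,
# e.g. ActiveRecord::Base.connection inside a Rails console.
def connection_headroom(conn)
  # Server-wide limit on simultaneous client connections
  max = conn.select_value("SELECT @@max_connections").to_i
  # SHOW STATUS returns rows of [Variable_name, Value]
  used = conn.select_rows("SHOW STATUS LIKE 'Threads_connected'").first.last.to_i
  { max: max, used: used, free: max - used }
end
```

Calling connection_headroom(ActiveRecord::Base.connection) from one of the consoles gives a rough idea of how many more processes the server would accept.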
It might just work to have multiple processes running simultaneously, but it might not; the complete analysis is fairly complicated. There are a couple of things you can do to make sure it works properly with multiple simultaneous clients. First, wrap each change in a database transaction; that will take care of most of what you need.
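A minimal sketch of wrapping one change in a transaction. The method name store_page and the url and body columns are hypothetical; model is assumed to be any ActiveRecord model class, and find_or_initialize_by assumes Rails 4 or later.

```ruby
# Sketch only: `store_page`, `url` and `body` are hypothetical names.
# `model` is assumed to be an ActiveRecord model class (e.g. a Page model).
def store_page(model, url, body)
  # Everything inside the block is committed together, or not at all.
  model.transaction do
    record = model.find_or_initialize_by(url: url)
    record.body = body
    # save! raises on failure, which rolls back the whole transaction
    record.save!
    record
  end
end
```

The point of using save! rather than save is that an exception is what triggers the rollback; a false return value would let the transaction commit whatever had already been written.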
Make sure your tables are using the InnoDB storage engine instead of MyISAM, which doesn’t support transactions.
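One way to check for MyISAM tables and convert them, as a rough sketch. The helper names are made up; conn is assumed to be an ActiveRecord MySQL connection and schema the name of your database. Converting a large table can take a while and locks it, so try it on a copy first.

```ruby
# Sketch only: both helper names are hypothetical.
# `conn` is assumed to be ActiveRecord::Base.connection on MySQL,
# `schema` the database name as a string.
def myisam_tables(conn, schema)
  # information_schema.tables lists the storage engine of every table
  conn.select_values(
    "SELECT table_name FROM information_schema.tables " \
    "WHERE table_schema = #{conn.quote(schema)} AND engine = 'MyISAM'"
  )
end

def convert_to_innodb(conn, schema)
  myisam_tables(conn, schema).each do |table|
    # Rebuilds the table with the InnoDB engine
    conn.execute("ALTER TABLE #{conn.quote_table_name(table)} ENGINE = InnoDB")
  end
end
```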
Also, as mu too short points out, put all the validation constraints you can directly into the database. If you have uniqueness constraints or foreign key relations, add them to your schema by hand, since Rails doesn’t add them by default. Complex validations that compare different model objects (aside from FK relations as in belongs_to) could require database triggers; hopefully you don’t need that. If all your validations are enforced natively by the database, everything should work.
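Once a unique index is in place (for example add_index :pages, :url, unique: true in a migration, with pages and url as hypothetical names), two processes racing to insert the same row will make one of them fail with ActiveRecord::RecordNotUnique. A minimal sketch of handling that, assuming Rails 4 or later and code running inside the Rails app; create_or_update_page is a made-up name:

```ruby
# Sketch only: `create_or_update_page`, `url` and `body` are hypothetical.
# Assumes a unique index on `url` and that ActiveRecord is already loaded.
def create_or_update_page(model, url, body, retries = 1)
  record = model.find_or_initialize_by(url: url)
  record.body = body
  record.save!
  record
rescue ActiveRecord::RecordNotUnique
  # Another process inserted the same url between our lookup and our insert.
  raise if retries <= 0
  retries -= 1
  # Re-run the method body: the lookup now finds the existing row and updates it.
  retry
end
```

The database does the real work here; the rescue just turns a lost race into an update instead of a crash.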