I need some help thinking through how to do a batch update across multiple tables in a RoR application. These are my models:
class User < ActiveRecord::Base
  has_many :addresses
  has_many :phones
end

class Address < ActiveRecord::Base
  belongs_to :user
  # Must be :address_type so Rails can infer the AddressType class.
  has_one :address_type
end

class Phone < ActiveRecord::Base
  belongs_to :user
  # Must be :phone_type so Rails can infer the PhoneType class.
  has_one :phone_type
end

class PhoneType < ActiveRecord::Base
  belongs_to :phone
end

class AddressType < ActiveRecord::Base
  belongs_to :address
end
As you can imagine, the Address table has user_id, PhoneType has phone_id, and AddressType has address_id as keys to maintain the associations.
So, I want to process some files with user records and insert them into the appropriate tables. For example:
...
usr1@foo.com,1234 sw main st. ca 19820,offce,425-378-1188,mobile
usr1@foo.com,7869 sw fool st. ca 19820,residential,425-898-2345,landline
usr2@foo.com,4321 sw oak st. ca 19822,offce,435-378-1298,mobile
usr3@foo.com,8789 sw adler st. ca 19822,residential,436-898-6234,landline
...
There are millions of them, either all in one file or one record per file, transferred from a remote server.
Or is there another way to process these remote requests on demand? For example, remote servers send a record to my RoR application and it gets processed by the app.
In both cases I want to make sure the data to be inserted passes all validation rules, e.g. the email format is valid and the address is not empty.
These records could be in JSON format to reduce the size of the data transferred in a file.
While processing, a user (e.g. usr1@foo.com) may or may not already exist.
Thanks and I really appreciate any help.
Atarang.
You can do both. If you have one file, you can simply create a Ruby script and include config/environment.rb to load the environment of the Rails application in your script. Then you can load the file and process it one line at a time. If it's in CSV format you can also use Ruby's CSV library (http://ruby-doc.org/stdlib-1.9.2/libdoc/csv/rdoc/CSV.html).
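As a rough sketch of the per-line step, here is one way to turn a line from the sample data into a hash using the standard CSV library. The column names in FIELDS and the parse_record helper are my own assumptions based on the sample rows in the question, not something the question defines.

```ruby
require "csv"

# In the real script you would first load the Rails environment, e.g.
#   require File.expand_path("../config/environment", __FILE__)
# (the path is an assumption about where the script lives).

# Hypothetical column names, inferred from the sample rows in the question.
FIELDS = [:email, :address, :address_type, :phone, :phone_type].freeze

# Turn one comma-separated line into a hash keyed by FIELDS.
# Returns nil for blank lines or lines with the wrong number of columns.
def parse_record(line)
  values = CSV.parse_line(line)
  return nil if values.nil? || values.size != FIELDS.size

  FIELDS.zip(values.map { |value| value.to_s.strip }).to_h
end
```

With millions of rows, reading the file with File.foreach(path) { |line| ... } hands you one line at a time instead of loading the whole file into memory.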
On the other hand, you could also create an interface that accepts the data as JSON. You could use the JSON implementation for Ruby (documented here: http://flori.github.com/json/) to create an action that parses a single object that is transferred as JSON.
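A minimal sketch of parsing one such JSON object could look like this. The key names in REQUIRED_KEYS and the parse_json_record helper are assumptions for illustration, since the question does not define a JSON layout; in a Rails action you would call something like this on the request body.

```ruby
require "json"

# Hypothetical key names; adjust to whatever the remote servers actually send.
REQUIRED_KEYS = %w[email address address_type phone phone_type].freeze

# Parse one JSON object and return it as a hash.
# Returns nil if the payload is not valid JSON, is not an object,
# or is missing any of the required keys.
def parse_json_record(payload)
  data = JSON.parse(payload)
  return nil unless data.is_a?(Hash)
  return nil unless REQUIRED_KEYS.all? { |key| data.key?(key) }

  data
rescue JSON::ParserError
  nil
end
```

Returning nil for bad input keeps the caller simple: it can skip or log the record without rescuing exceptions itself.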
//About the validations:
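A minimal plain-Ruby sketch of the flow this part of the answer describes: build every object first, check each one with valid?, and save only when all of them pass. FakeRecord, FakeUser, FakeAddress and import_row are hypothetical stand-ins so the snippet runs without Rails. In the app, new, valid? and save would come from the ActiveRecord models, and the checks would be declared as validations in those models.

```ruby
require "uri"

# Stand-in for an ActiveRecord model (assumption: illustration only).
class FakeRecord
  attr_reader :attrs, :saved

  def initialize(attrs)
    @attrs = attrs
    @saved = false
  end

  # Pretend to write to the database.
  def save
    @saved = true
  end
end

class FakeUser < FakeRecord
  # Email must look like an email address.
  def valid?
    attrs[:email].to_s.match?(URI::MailTo::EMAIL_REGEXP)
  end
end

class FakeAddress < FakeRecord
  # Address must not be empty.
  def valid?
    !attrs[:street].to_s.strip.empty?
  end
end

# Build all objects, validate all of them, and save only if every one passes.
# Returns true when the row was saved, false when it was rejected.
def import_row(email, street)
  records = [FakeUser.new(email: email), FakeAddress.new(street: street)]
  return false unless records.all?(&:valid?)

  records.each(&:save)
  true
end
```

Because nothing is saved until every object has passed, a row with a bad email never leaves a half-imported address behind.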
Write validations in the models! Then check every model that has a belongs_to association, because you may already have the PhoneType that is used in the current record. So take care not to create duplicates when you don't need them.
After that, use the new method to create new objects (not create, because create writes to the database). Then parse the line, and once the line is parsed you can check all your objects using the valid? method, which checks whether the object passes all the validations defined in the model.
Only if all your objects pass validation do you save them to the database. Otherwise you simply don't call the save method, so records that don't pass the validations aren't stored in the database.
//About the performance:
If you only have to perform the import once, it's not a big problem if the script isn't perfect. What you should do when there are many entries in a table (100 aren't many, 100000 are) is add indexes to all the foreign key columns. You can create a migration addIndexesToForeignKeys and add the indexes with the add_index method. This will speed up select queries and slow down insert queries a bit, but I think show actions will be called more often than create actions 😉
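A migration along these lines might look like the sketch below. The table names are assumptions that follow the Rails defaults for the models in the question, and the change method assumes Rails 3.1 or later (older versions would use self.up and self.down instead).

```ruby
# Assumption: default table names for User, Address, Phone, PhoneType, AddressType.
# On Rails 5 and later the superclass needs a version, e.g. ActiveRecord::Migration[5.0].
class AddIndexesToForeignKeys < ActiveRecord::Migration
  def change
    # One index per foreign key column mentioned in the question.
    add_index :addresses, :user_id
    add_index :phones, :user_id
    add_index :phone_types, :phone_id
    add_index :address_types, :address_id
  end
end
```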