In one of my models I have an update() method that updates a few fields and creates one object of another model. The problem is that the data used for the update is fetched from another host (a different one for each object), and that can take a while (the host may be offline, and the timeout is set to 3 seconds). Now I need to update a couple of hundred objects 3-4 times per hour. Updating them one after another is not an option, because it could take all day.
My first thought was to split the work across 50-100 threads so that each one updates its own share of the objects. 99% of the update function's time is spent waiting for the server to respond (only a few bytes of data are transferred, so latency is the problem), so I don't think the CPU will be a bottleneck. I'm more worried about:
- The Django ORM. Can it handle this? Getting all the objects, splitting them up, and updating them from more than 50 threads?
- Is this a good way to solve the problem? If it is, how do I do it without hurting the database? Or maybe I shouldn’t worry about so few records?
- If it isn’t a good way, how do I do it right?
You can perform the actions from different threads manually (e.g. with a Queue and an executor pool), but note that Django’s ORM manages database connections in thread-local variables. So each new thread means a new connection to the database, which is not a good idea with 50-100 threads for one request: too many connections. On the other hand, you should check your database’s “bandwidth”.
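One way to keep the connection count down, as a rough sketch rather than a definitive solution: let the worker threads do only the slow network waits, and apply the ORM writes afterwards in the calling thread, so a single database connection is used. The names `fetch_remote` and `fetch_all`, the host names, and the simulated delay are all hypothetical placeholders; swap in your real request code and your model's update().

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_remote(host, timeout=3.0):
    """Hypothetical stand-in for the slow network call to one host."""
    time.sleep(0.05)  # simulated latency; a real call would use `timeout`
    if host.endswith(".offline"):
        raise TimeoutError(f"{host} did not respond within {timeout}s")
    return {"host": host, "status": "ok"}


def fetch_all(hosts, max_workers=50):
    """Fetch from every host in parallel; no database access happens here."""
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the host it was submitted for.
        futures = {pool.submit(fetch_remote, host): host for host in hosts}
        for future in as_completed(futures):
            host = futures[future]
            try:
                results[host] = future.result()
            except Exception as exc:  # offline host, timeout, bad response
                failures[host] = exc
    return results, failures


# Made-up host names for illustration only.
hosts = [f"host{i}.example" for i in range(20)] + ["host20.offline"]
results, failures = fetch_all(hosts)

# Back in the calling thread: this is where the ORM writes would go,
# e.g. looking up the object for each host and saving the fetched data.
for host, data in results.items():
    pass  # obj.update(data) in a real project (assumed signature)
```

With this split the number of threads only affects how many hosts are contacted at once, not how many database connections get opened. If the threads have to touch the ORM themselves, each one will get its own connection, so you would want a much smaller pool.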