Apologies if this has been covered before. I did search, but I may not have known the right terms to use.
This process is handled with PHP.
Here’s the situation:
I have a large array of file names. The script I have opens these files and enters their content into a database. Processing these files one at a time takes over 24 hours, and these files are updated on a daily basis.
Breaking the single large array into four smaller arrays and running concurrent processes finishes the job within the 24-hour window, but sometimes one or two processes finish hours before the others because file sizes vary from day to day.
Much like people who stock retail shelves (who else has worked that nightmare before?) pitch in to help out with what’s left after finishing their own tasks, I’d like to have a script in place where these “agents” do the same.
Here are the basics of what I have figured out. It could be wrong, and I’m not too proud to be told so 🙂
$files = array('file1','file2','file3','file4','file5');
//etc... on to over 4k elements
// Compare against null so a file name like '0' does not end the loop early.
while (($file = array_pop($files)) !== null) {
    // Something in here... I have no idea what.
}
Ideas? Something like four function calls or four loops within that overarching ‘while’ has crossed my mind, but I’m pretty sure it’s going to wait on executing subsequent calls until the previous one(s) finish.
Any help is appreciated. I’m seriously stuck on this one!
Thanks!
A database-backed message queue seems the obvious solution but I think that’s overkill in this case. I would simply put the files to be processed into a single dedicated queue directory, then use the DirectoryIterator class to scan it. Something like this:
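A minimal sketch of such a worker, under a few assumptions that are mine rather than anything from the question: the queue and "processing" directories sit on the same POSIX filesystem (so `rename()` is atomic and only one worker can claim a given file), and `processFile()` stands in for your existing "open file, insert into database" code. The function names and the two directory arguments are hypothetical.

```php
<?php
// worker.php (hypothetical name): one "agent". Run several copies at once.

/**
 * Claim the next file in $queueDir by moving it into $claimDir.
 * Returns the claimed path, or null when nothing is left to claim.
 */
function claimNextFile(string $queueDir, string $claimDir): ?string
{
    foreach (new DirectoryIterator($queueDir) as $entry) {
        if (!$entry->isFile()) {
            continue; // skip ".", ".." and subdirectories
        }
        $target = $claimDir . '/' . $entry->getFilename();
        // Assumption: both directories are on one filesystem, so rename()
        // is atomic. If another worker already took the file, rename()
        // returns false and we simply try the next entry.
        if (@rename($entry->getPathname(), $target)) {
            return $target;
        }
    }
    return null;
}

/** Placeholder for the existing import logic. */
function processFile(string $path): void
{
    // open $path and insert its content into the database
}

/** Keep claiming and processing files until the queue is empty. */
function runWorker(string $queueDir, string $claimDir, callable $process): int
{
    $count = 0;
    while (($path = claimNextFile($queueDir, $claimDir)) !== null) {
        $process($path);
        $count++;
    }
    return $count;
}

// Run as: php worker.php <queueDir> <claimDir>
if (PHP_SAPI === 'cli' && isset($argv[2])) {
    $done = runWorker($argv[1], $argv[2], 'processFile');
    echo "Processed $done file(s)\n";
}
```

Because every worker pulls from the same directory, a worker that lands on small files just goes back for more, which is the "pitch in" behaviour you described. Claimed files stay in the processing directory here, so anything left behind by a crashed worker would need to be moved back to the queue by hand.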
Edit:
Regarding launching the workers, you could use a simple shell script to spawn the PHP processes in the background:
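A rough sketch of such a launcher. The script name, the `WORKERS` variable and the `/path/to/...` values in the usage comment are placeholders of mine; the worker command is whatever you pass on the command line.

```shell
#!/bin/sh
# launch_workers.sh (hypothetical name)
# Usage: WORKERS=4 ./launch_workers.sh php /path/to/worker.php /path/to/queue /path/to/processing

# Start "$count" copies of the given command in the background.
launch_workers() {
    count="$1"
    shift
    i=1
    while [ "$i" -le "$count" ]; do
        "$@" &
        i=$((i + 1))
    done
    wait   # block until every worker has exited
}

# Only launch when a worker command was actually supplied.
if [ "$#" -gt 0 ]; then
    launch_workers "${WORKERS:-4}" "$@"
fi
```

The `wait` is optional; it just keeps the launcher alive until all workers are done, which makes it easier to see from the process list whether a run is still in progress.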
Then, create a cron entry to run this launcher, for example, at midnight:
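A possible crontab line, assuming the launcher is saved at the hypothetical path `/path/to/launch_workers.sh` and takes the worker command as its arguments. Every path below is a placeholder to replace with your own.

```
# m h dom mon dow  command
0 0 * * * /path/to/launch_workers.sh php /path/to/worker.php /path/to/queue /path/to/processing >> /path/to/launcher.log 2>&1
```

The five leading fields are minute, hour, day of month, month and day of week, so `0 0 * * *` fires once a day at 00:00.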