On the front-end, I have a PHP webapp that allows users to create a list of their websites (5 max).
On the back-end, a Python script runs daily, with ~10 iterations for each website that the user registers. Each iteration takes about 10 seconds to finish its scraping, and the script then writes a CSV file with its findings.
So, in total, that's up to (5 websites * 10 iterations =) 50 iterations, or about 500 seconds (8.3 minutes) per user.
Right now, the script works when I manually feed it a URL, so I’m wondering how to make it dynamically part of the webapp.
- How do I programmatically add and remove scripts that run daily depending on the number of users and the websites each user has each day?
- How would I schedule this script to run for each website of each user, passing in the appropriate parameters?
I'm somewhat acquainted with cron jobs, as cron is the only tool I know of that is made for routine processes.
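You can have the PHP app store the URLs in a database (MySQL, SQLite, etc.) or a text file. Then have your Python script loop through that database or text file, and use cron to run the script once a day. That way you never need to add or remove scheduled jobs: a single cron entry covers every user, and any website added or removed in the webapp is picked up on the next daily run.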
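As a rough sketch of that loop, assuming the PHP app writes to an SQLite file with a table called `websites` (columns `id`, `user_id`, `url`): the table layout, the file paths, and the `scrape` function are all placeholders I made up for illustration, so swap in your real schema and your existing scraping code.

```python
import csv
import sqlite3
from pathlib import Path

DB_PATH = "websites.db"   # hypothetical path to the database the PHP app writes to
OUT_DIR = "reports"       # hypothetical folder for the daily CSV files


def scrape(url):
    """Placeholder for the existing scraping script; returns rows for the CSV."""
    return [{"url": url, "finding": "example"}]


def run_daily(db_path, out_dir):
    """Scrape every registered website and write one CSV file per website."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Read the current list of websites; additions and removals made through
    # the webapp show up here automatically on the next run.
    conn = sqlite3.connect(db_path)
    try:
        sites = conn.execute(
            "SELECT id, user_id, url FROM websites ORDER BY id"
        ).fetchall()
    finally:
        conn.close()

    written = []
    for site_id, user_id, url in sites:
        rows = scrape(url)
        out_file = out_dir / f"user{user_id}_site{site_id}.csv"
        with open(out_file, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=["url", "finding"])
            writer.writeheader()
            writer.writerows(rows)
        written.append(out_file)
    return written


# Only run when called as a script and the database already exists.
if __name__ == "__main__" and Path(DB_PATH).exists():
    run_daily(DB_PATH, OUT_DIR)
```

A crontab entry such as `0 3 * * * /usr/bin/python3 /path/to/run_daily.py` would then run it every day at 03:00 (the interpreter and script paths are examples). Since the sites are processed one after another, total run time grows with the number of registered websites.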
There are lots of resources for learning the cron syntax:
http://google.com/search?q=cron+tutorial