I wrote a script that downloads a list of pages from a website. From time to time I receive the following error (the number of seconds varies):
The bwshare module will refuse your requests for the next 7 seconds.
You have downloaded data too rapidly.
I found that calling sleep(2) in the loop works much better, but the delay costs too much time.
What’s the best way to deal with this module? Should I scrape without any delay and, if the response looks like the message above, simply sleep for the requested number of seconds?
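The approach described in the question could be sketched roughly as follows. This is only an illustration under assumptions: the regular expression is built from the message quoted above, and `fetch` is a hypothetical callable that takes a URL and returns the response body as text. If the server sends the message with an error status, that wrapper would need to return the error body instead of raising.

```python
import re
import time

# Pattern built from the error message quoted above; the exact wording
# returned by a given server is an assumption.
BWSHARE_PATTERN = re.compile(r"refuse your requests for the next (\d+) seconds")


def parse_bwshare_delay(body):
    """Return the number of seconds bwshare asks us to wait, or None."""
    match = BWSHARE_PATTERN.search(body)
    return int(match.group(1)) if match else None


def fetch_with_bwshare_retry(url, fetch, sleep=time.sleep, max_retries=5):
    """Fetch url with no fixed delay; sleep only when bwshare refuses.

    `fetch` is a hypothetical callable: it takes a URL and returns the
    response body as text.
    """
    for _ in range(max_retries + 1):
        body = fetch(url)
        delay = parse_bwshare_delay(body)
        if delay is None:
            return body  # a normal page, no refusal message found
        # Wait for the requested time plus one second of margin
        # (the margin is an arbitrary choice, not something bwshare requires).
        sleep(delay + 1)
    raise RuntimeError("still refused after %d retries: %s" % (max_retries, url))
```

Passing `sleep` as a parameter keeps the function easy to try out without actually waiting.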
It all depends on how many pages you can get before the error message.
Try to measure how many pages you can get on average.
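The break-even point is 4 pages before the bwshare message: with a 7-second refusal, that averages 7 / 4 = 1.75 seconds of waiting per page, just under the 2 seconds of sleep(2).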
If you get the error message before reaching 4 page downloads, then it would be faster to sleep(2) after each download.
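The comparison can be written out as a small calculation. This is a sketch under assumptions: the refusal lasts 7 seconds as in the example message (the real value varies), a fixed sleep(2) avoids the refusal entirely, and download time is ignored because it is the same in both cases. The function name is made up for illustration.

```python
def cheaper_strategy(pages_before_block, block_seconds=7, fixed_delay=2):
    """Compare the average waiting time per page of the two strategies.

    pages_before_block: measured number of pages fetched before bwshare refuses.
    block_seconds: how long bwshare refuses requests (7 in the example message).
    fixed_delay: seconds slept after every page in the sleep(2) approach.
    """
    # Sleeping only on error spreads one refusal over all pages fetched before it.
    reactive_wait_per_page = block_seconds / pages_before_block
    if reactive_wait_per_page < fixed_delay:
        return "sleep on error"
    return "fixed sleep"


for pages in (2, 3, 4, 5):
    print(pages, cheaper_strategy(pages))
```

With these numbers, 3 pages gives 7 / 3 ≈ 2.33 seconds per page (worse than 2), while 4 pages gives 1.75 (better), which is where the threshold of 4 comes from.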