What HTTP response code should be used to throttle a badly behaved web crawler?
Should any explanation be returned in the headers or in the body?
Actually, the recommended (RFC 6585) HTTP status is 429 Too Many Requests. It is used, for example, by the Twitter REST API rate limiter.
However, GSA itself returns 503 Service Unavailable if you flood it with requests, so IMO it's a safe assumption that it expects external sites to behave the same way.
I went with 503 Service Unavailable for my throttling solution.
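On the headers-or-body part of the question: RFC 6585 says a 429 response should include a representation explaining the condition and may include a Retry-After header, and Retry-After is also valid on a 503. So doing both seems reasonable. Below is a minimal sketch of that idea, not a production rate limiter. The class name `FixedWindowThrottle`, its parameters, and the per-client fixed window are all assumptions made for illustration; the status is configurable so it covers either the 429 or the 503 choice discussed above.

```python
import math
import time
from typing import Callable, Dict, Tuple

# (status code, headers, body) -- a hypothetical, framework-neutral response shape.
Response = Tuple[int, Dict[str, str], str]


class FixedWindowThrottle:
    """Illustrative fixed-window limiter; all names here are hypothetical."""

    def __init__(
        self,
        limit: int,
        window_seconds: int,
        status: int = 429,  # pass 503 to mimic the Service Unavailable approach
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.status = status
        self.clock = clock
        # client id -> (window start time, requests seen in that window)
        self._windows: Dict[str, Tuple[float, int]] = {}

    def check(self, client_id: str) -> Response:
        now = self.clock()
        start, count = self._windows.get(client_id, (now, 0))

        # Start a fresh window once the previous one has expired.
        if now - start >= self.window_seconds:
            start, count = now, 0

        if count >= self.limit:
            self._windows[client_id] = (start, count)
            # Whole seconds until the current window ends, at least 1.
            retry_after = max(1, math.ceil(start + self.window_seconds - now))
            headers = {
                "Retry-After": str(retry_after),
                "Content-Type": "text/plain; charset=utf-8",
            }
            body = (
                f"Request limit of {self.limit} per {self.window_seconds}s "
                f"exceeded. Retry after {retry_after} seconds.\n"
            )
            return self.status, headers, body

        self._windows[client_id] = (start, count + 1)
        return 200, {}, "OK\n"
```

The machine-readable hint goes in the Retry-After header for crawlers that honor it, and the body carries a short human-readable explanation for whoever ends up reading the crawler's logs. Whether a given crawler actually respects Retry-After is something you would need to verify against that crawler.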