The company I work for has recently installed a Apache staging server which uses Apache’s mod_access module to prevent unwanted access to our staging environment.
One of the downsides of this is that Facebook, when trying to scrape the page for the opengraph metatags, comes up empty with the following error.
Error Scraping Page Bad response code
Which is to be expected since the scraper bumps into the authentication dialog.
My question now: is there a specific IP range that we can allow access
to the website?
We’ve looked at allowing certain headers, but that seems a little prone to header manipulation in order to bypass the security layer.
The access log did show one IP address, but I assume that Facebook uses multiple servers to scrape all these pages and I seem to remember reading that these IP addresses tend to change over time.
Any ideas?
Facebook has published their IP range here.