I am trying to build a script that posts information into the RoyalMail tracking system and extracts the output.
What I currently have gets an error from their server (see the link). Somehow it is detecting that I am not using their website as normal and is throwing an error.
Things I think I have taken into account:
- Using an exact copy of their form by parsing it beforehand (the post parameters)
- Saving the cookies between each request
- Accepting redirect headers
- Providing a Referer header that is actually valid (the previously visited page)
Does anyone know of anything else I need to check, or can anyone figure out what I am doing wrong?
A full copy of the source is at EDIT: please see my answer below
I have now fixed it. The problem was with PHP curl and following redirects: it seems that it doesn't always re-post the request data, and sends a GET request instead when following a redirect.
To deal with this I disabled curl's follow-location option with
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
and then built a follow-location system myself that works recursively. Essentially it extracts the Location header from the response, checks for a 301 or a 302 status, and then runs the method again as required. This means the information will definitely be POSTed again.
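To illustrate the idea, here is a minimal sketch of manual redirect following. It is not the actual class: the function names parseRedirect and postFollowingRedirects, the $cookieFile parameter and the $maxHops limit are made up for illustration, and it assumes the Location header holds an absolute URL.

```php
<?php
// Hypothetical helper: returns the redirect target if the raw header block
// (as returned with CURLOPT_HEADER enabled) is a 301/302, otherwise null.
function parseRedirect(string $rawHeaders): ?string
{
    // A "100 Continue" block may precede the real response, so use the last block.
    $blocks = preg_split('/\r?\n\r?\n/', trim($rawHeaders));
    $last = end($blocks);

    // Status line looks like "HTTP/1.1 302 Found".
    if (!preg_match('#^HTTP/\S+\s+(\d{3})#', $last, $status)) {
        return null;
    }
    if (!in_array((int) $status[1], [301, 302], true)) {
        return null;
    }
    // Header names are case-insensitive.
    if (!preg_match('/^Location:\s*(\S+)/mi', $last, $loc)) {
        return null;
    }
    return $loc[1];
}

// Hypothetical helper: POST the fields, and POST them again on every 301/302.
function postFollowingRedirects(string $url, array $fields, string $cookieFile, int $maxHops = 5): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($fields));
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);   // redirects are handled below
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);    // return the response as a string
    curl_setopt($ch, CURLOPT_HEADER, true);            // include headers in that string
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);  // save cookies between requests
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile); // send previously saved cookies

    $response = curl_exec($ch);
    if ($response === false) {
        throw new RuntimeException('curl error: ' . curl_error($ch));
    }
    $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);

    $headers = substr($response, 0, $headerSize);
    $body = substr($response, $headerSize);

    // Recurse with the same POST fields while the server keeps redirecting.
    $next = parseRedirect($headers);
    if ($next !== null && $maxHops > 0) {
        return postFollowingRedirects($next, $fields, $cookieFile, $maxHops - 1);
    }
    return $body;
}
```

The hop limit is only there to stop a redirect loop from recursing forever; a relative Location value would need to be resolved against the current URL first.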
I also improved the user agent string by simply copying my current browser's one, on the basis that it won't be blocked for a long while since, as of 2012, it is in active use.
Here is a final copy of the curl class, which is working (included in case the link dies; I have been downvoted for that in the past):