I would be very grateful if anyone could help me out with this problem. I’ve got some C# code which reads in the contents of a web page for parsing later on. The code is:
    private StringReader ReadInUrl(string url)
    {
        string result = string.Empty;
        System.Net.HttpWebRequest request = (System.Net.HttpWebRequest)System.Net.WebRequest.Create(url);
        request.Method = "GET";
        using (var stream = request.GetResponse().GetResponseStream())
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            result = reader.ReadToEnd();
        }
        return new StringReader(result);
    }
The code works fine with most pages, but throws ‘The remote server returned an error: (500) Internal Server Error.’ for some pages. An example of a page that throws the error is: http://www.thehut.com/blu-ray/harry-potter-collection-years-1-6/10061821.html
The thing that confuses me is that I can view the page fine in a web browser, and I can also grab the contents of the page using PHP fopen and fread, and then parse it in PHP.
I really need to be able to do this in C# and I’m stumped as to why it is happening. Could anyone let me know why I can read the page using PHP but not C#, and whether there is a setting in C# that would get round this issue? Any answers gratefully received!
The web site rejects requests that don’t specify a user agent, so you need to specify one. I would also recommend using WebClient instead of HttpWebRequest, HttpWebResponse, StreamReader, StringReader and company: it’s shorter and works.
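As a rough sketch of both suggestions, something along these lines should work. The class name, the CreateRequest helper and the user agent string are made up for illustration; any reasonable user agent value will probably do, and I haven't checked what this particular site accepts.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class UserAgentExample
{
    // Hypothetical user agent value, chosen only for illustration.
    public const string UserAgent = "Mozilla/5.0 (compatible; ExampleFetcher/1.0)";

    // Option 1: keep HttpWebRequest and set its UserAgent property.
    // Nothing is sent until GetResponse() is called on the returned request.
    public static HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        request.UserAgent = UserAgent;
        return request;
    }

    // Option 2: use WebClient, adding the User-Agent header before downloading.
    // A fresh client is created per call, so the header is set for this request.
    public static StringReader ReadInUrl(string url)
    {
        using (var client = new WebClient())
        {
            client.Encoding = Encoding.UTF8;
            client.Headers[HttpRequestHeader.UserAgent] = UserAgent;
            return new StringReader(client.DownloadString(url));
        }
    }

    public static void Main()
    {
        // URL taken from the question; the page may no longer exist.
        var reader = ReadInUrl("http://www.thehut.com/blu-ray/harry-potter-collection-years-1-6/10061821.html");
        Console.WriteLine(reader.ReadLine());
    }
}
```

ReadInUrl keeps the same signature as in the question, so it should drop into the existing parsing code unchanged. If the server still returns an error, DownloadString throws a WebException, same as GetResponse does.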