I’m trying to get HTML code from a specific webpage, but when I do it using
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(pageURL);
string htmlCode;
// Dispose the response and reader even if reading throws.
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (StreamReader streamReader = new StreamReader(response.GetResponseStream(), Encoding.GetEncoding("windows-1251")))
{
    htmlCode = streamReader.ReadToEnd();
}
or using WebClient, I get redirected to a login page and receive that page's HTML instead.
Is there any other way to get the HTML code?
I read some information here: How to get HTML from a current request, in a postback, but I didn’t understand what I should do, or how and where to specify the URL.
P.S.:
I’m logged in to the site in my browser. Notepad++ shows exactly what I need via “right click – view source code”.
Thanks.
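If you get redirected to a login page, then presumably you must be logged in before you can get the content. Being logged in in your browser doesn't help here: HttpWebRequest and WebClient don't share the browser's cookies, so the server sees your code as an anonymous visitor.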
So you need to make a request, with suitable credentials, to the login page. Get whatever tokens are sent (usually in the form of cookies) to maintain the login. Then request the page you want (sending the cookies with the request).
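Those two steps might look roughly like the sketch below. It is only an illustration under several assumptions: the login page is assumed to accept a plain form POST, and the field names "username" and "password", the class name, and the method names are all made up here. Check the site's actual login form for the real field names, and note that many sites also require a hidden anti-forgery token that this sketch does not handle.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

static class AuthenticatedFetch
{
    // Creates a request that reads and writes cookies in the shared container.
    static HttpWebRequest CreateRequest(string url, CookieContainer cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = cookies;
        return request;
    }

    // Sends the request and returns the response body as text.
    static string ReadBody(HttpWebRequest request, Encoding encoding)
    {
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream(), encoding))
        {
            return reader.ReadToEnd();
        }
    }

    public static string GetPageAfterLogin(
        string loginUrl, string pageUrl, string user, string password)
    {
        var cookies = new CookieContainer();
        // Same encoding as in the question; on .NET Core and later this
        // code page may need CodePagesEncodingProvider to be registered.
        var encoding = Encoding.GetEncoding("windows-1251");

        // Step 1: POST the credentials. The field names are assumptions.
        var login = CreateRequest(loginUrl, cookies);
        login.Method = "POST";
        login.ContentType = "application/x-www-form-urlencoded";
        byte[] form = Encoding.ASCII.GetBytes(
            "username=" + Uri.EscapeDataString(user) +
            "&password=" + Uri.EscapeDataString(password));
        login.ContentLength = form.Length;
        using (var body = login.GetRequestStream())
        {
            body.Write(form, 0, form.Length);
        }
        // Any cookies the server sets are stored in the container.
        ReadBody(login, encoding);

        // Step 2: request the real page, sending those cookies back.
        return ReadBody(CreateRequest(pageUrl, cookies), encoding);
    }
}
```

The important part is that both requests use the same CookieContainer. Without one, HttpWebRequest discards the cookies from the login response and the second request gets redirected again.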
Alternatively (and this is the preferred approach), most major sites that expect automated systems to interact with them provide an API (often using OAuth for authentication). Consult their documentation to see how their API works.
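If the site does offer an API, the request usually carries a token in a header rather than cookies. A minimal sketch, assuming a bearer-style access token that you have already obtained through the site's documented OAuth flow (the class name, method name, and parameters are placeholders, not any real site's API):

```csharp
using System.IO;
using System.Net;

static class ApiFetch
{
    // Fetches an API resource, authenticating with a bearer token.
    public static string Get(string apiUrl, string accessToken)
    {
        var request = (HttpWebRequest)WebRequest.Create(apiUrl);
        // Many OAuth 2.0 APIs expect the token in the Authorization header.
        request.Headers[HttpRequestHeader.Authorization] = "Bearer " + accessToken;

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

How you obtain the token, and what the endpoints return, depends entirely on the site, so treat its documentation as the authority.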