Please note: I do not want to read the HTML content of a page,

Question

Asked: May 25, 2026

Please note: I do not want to read the HTML content of a page, rather, I am looking to read the text from a web page. Imagine the following example, if you will –

A PHP script echoes “Hello User X” back onto the current page, so that the user is now looking at a (mainly blank) page with the words “Hello User X” printed in the top left corner. From my C# application, I would like to read that text into a string.

String strPageData = functionToReadPageData("http://www.myURL.com/file.php");

Console.WriteLine(strPageData); // Outputs "Hello User X" to the Console.

In VB6 I was able to do this by using the following API:

InternetOpen
InternetOpenURL
InternetReadFile
InternetCloseHandle

I attempted to port my VB6 code to C# but I am having no luck – so I would very much appreciate a C# method for completing the above task.

1 Answer

Editorial Team · Answer 1 · 2026-05-25T06:55:54+00:00

I am not aware of any part of the .NET Framework that lets you automagically extract all the text from an HTML file. I very much doubt one exists.
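That said, for the simple case in the question, where the script prints only plain text and no markup, the response body already is the text, so downloading it as a string may be all that is needed. A minimal sketch, assuming a .NET version that ships HttpClient; the class name PageReader and the method ReadPageDataAsync are made up for illustration:

```csharp
using System.Net.Http;
using System.Threading.Tasks;

public static class PageReader
{
    // One shared client for the whole application; creating a new
    // HttpClient per request can exhaust sockets under load.
    private static readonly HttpClient Client = new HttpClient();

    // Hypothetical helper: downloads the response body at `url` as a string.
    // GetStringAsync throws HttpRequestException on a non-success status code.
    public static Task<string> ReadPageDataAsync(string url)
    {
        return Client.GetStringAsync(url);
    }
}
```

From an async method, string strPageData = await PageReader.ReadPageDataAsync("http://www.myURL.com/file.php"); should then hold the echoed text, provided the script outputs nothing else. If the page contains any markup, the string will contain it too.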

You can try the HtmlAgilityPack (3rd party) for accessing text elements etc. in an HTML document.

You will still need to write logic to find the correct HTML element though. Take an HTML page like this:

<html>
     <body>Some text</body>
</html>

You would then need to locate the body tag with an XPath expression and read its content.

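// doc is an already loaded HtmlAgilityPack HtmlDocument.
// SelectSingleNode returns one HtmlNode; SelectNodes returns a collection.
HtmlNode body = doc.DocumentNode.SelectSingleNode("//body");
string bodyContent = body.InnerText;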

Following that pattern you can read every element on the page. You might need to do some post-processing to remove line breaks, comments etc.
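As a rough idea of what that post-processing could look like, here is a sketch that uses only the standard library: it decodes HTML entities that may still be present in the extracted text and collapses runs of whitespace (including line breaks) into single spaces. The class name TextCleanup and the method Normalize are hypothetical, and this does not attempt to strip comments.

```csharp
using System.Net;
using System.Text.RegularExpressions;

public static class TextCleanup
{
    // Hypothetical helper: tidies text pulled out of an HTML node.
    public static string Normalize(string innerText)
    {
        // Turn entities such as &amp; back into the characters they stand for.
        string decoded = WebUtility.HtmlDecode(innerText);

        // Collapse every run of whitespace (spaces, tabs, line breaks)
        // into one space, then drop leading and trailing whitespace.
        return Regex.Replace(decoded, @"\s+", " ").Trim();
    }
}
```

You could then pass bodyContent through TextCleanup.Normalize before printing it.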

http://htmlagilitypack.codeplex.com/wikipage?title=Examples
