I am working on some scraping app, i wanted to try to get it

Asked: May 14, 2026

I am working on a scraping app and ran into a problem while trying to get it to work. For testing, I have replaced the original scraping destination in the code below with Google's web page. It seems that my download doesn't get everything: the body and html tags are missing their closing tags. How do I get it to download everything? What's wrong with my sample code?

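using System.IO;
using System.Net;
using System.Text;
using System.Web;

string filename = "test.html";

// WebClient is disposable, so release it once the download is done.
using (WebClient client = new WebClient())
{
    // Build the query string: ?q=<search term>&hl=en
    string searchTerm = HttpUtility.UrlEncode(textBox2.Text);
    client.QueryString.Add("q", searchTerm);
    client.QueryString.Add("hl", "en");
    string data = client.DownloadString("http://www.google.com/search");

    // Save the response as UTF-16; disposing the writer flushes and closes it.
    using (StreamWriter writer = new StreamWriter(filename, false, Encoding.Unicode))
    {
        writer.Write(data);
    }
}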

1 Answer

Editorial Team · Answer 1 · 2026-05-14T05:35:39+00:00

Nothing is wrong with your code: the download is complete. Google's web pages are now written in HTML5, where the closing BODY and HTML tags are optional and may be left out, which is why Google omits them (believe it or not, it saves them bandwidth).

See this article.

You can write HTML5 in either "HTML/SGML" syntax (which, like HTML before XHTML, allows certain closing tags to be omitted) or "XHTML" syntax, which follows the rules of XML and requires all tags to be closed.
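If the scraper needs to distinguish a page that merely omits optional closing tags from one that was cut short, a rough check along these lines might be a starting point. This is only a sketch: `HtmlEndTags` and `HasEndTag` are hypothetical names introduced here, not part of .NET, and the substring search is deliberately naive (it would also match text inside comments or scripts).

```csharp
using System;

// Hypothetical helper for illustration; not part of .NET or any library.
public static class HtmlEndTags
{
    // Returns true if the markup contains a closing tag such as </body>,
    // ignoring case. This is a plain substring search, not an HTML parser.
    public static bool HasEndTag(string html, string tagName)
    {
        string needle = "</" + tagName + ">";
        return html.IndexOf(needle, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

For the `data` string from the question, `HtmlEndTags.HasEndTag(data, "body")` returning false would not by itself mean the download failed, since an HTML5 page served as text/html may legitimately leave that tag out.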

Which parser the browser uses depends on the Content-Type header you send: text/html for HTML/SGML syntax, or application/xhtml+xml for XHTML syntax. (Source: HTML5 syntax – HTML vs XHTML)
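A scraper can make the same decision by looking at the Content-Type header of the response. The sketch below assumes the header value has already been read into a string; `MarkupSyntax` and `IsXhtml` are hypothetical names used for illustration, while `System.Net.Mime.ContentType` is the standard .NET class for parsing such values.

```csharp
using System;
using System.Net.Mime;

// Hypothetical helper for illustration; not part of .NET.
public static class MarkupSyntax
{
    // Returns true when a Content-Type header value names the XHTML (XML) syntax.
    public static bool IsXhtml(string contentTypeHeader)
    {
        // ContentType parses values such as "text/html; charset=UTF-8"
        // and exposes the media type without its parameters.
        var contentType = new ContentType(contentTypeHeader);
        return string.Equals(contentType.MediaType, "application/xhtml+xml",
                             StringComparison.OrdinalIgnoreCase);
    }
}
```

With the `WebClient` from the question, the header value should be available as `client.ResponseHeaders["Content-Type"]` after `DownloadString` returns. It can be null if the server sent no such header, so a real scraper would want to check for that before calling `IsXhtml`.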
