Is there any way I can parse a website by just viewing the content as


Asked: May 16, 2026 (2026-05-16T20:03:51+00:00)


Is there any way I can parse a website by just viewing the content as displayed to the user in their browser? That is, instead of downloading "page.html" and parsing the whole page with all its HTML/JavaScript tags, I would like to retrieve the version as displayed to users in their browsers. I want to "crawl" websites and rank them according to keyword popularity (working from the HTML source is problematic for that purpose).

Thanks!

Joel


1 Answer

Editorial Team · Answer 1 · 2026-05-16T20:03:52+00:00


A browser also downloads page.html and then renders it, so you should work the same way. Use an HTML parser such as lxml.html or BeautifulSoup; with either one you can ask for only the text enclosed within tags, plus any attributes you want, such as title and alt.
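As a rough illustration of that idea, here is a minimal sketch that pulls out the readable text and counts keywords. To stay dependency-free it uses Python's standard-library `html.parser` rather than lxml.html or BeautifulSoup, so it is a stand-in for those libraries, not their API. The names `VisibleTextParser`, `visible_text`, `keyword_counts` and the sample markup are made up for this example. It also does not execute JavaScript, so text that a script injects into the page will not show up.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Tags whose contents are not shown as page text (an assumed, non-exhaustive list).
HIDDEN_TAGS = {"script", "style", "noscript", "template"}
# Attributes that carry readable text, as suggested in the answer.
TEXT_ATTRS = {"title", "alt"}


class VisibleTextParser(HTMLParser):
    """Collects text nodes and title/alt attribute values, skipping hidden tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._hidden_depth = 0  # > 0 while inside a hidden tag

    def handle_starttag(self, tag, attrs):
        if tag in HIDDEN_TAGS:
            self._hidden_depth += 1
            return
        if self._hidden_depth == 0:
            for name, value in attrs:
                if name in TEXT_ATTRS and value and value.strip():
                    self.chunks.append(value.strip())

    def handle_endtag(self, tag):
        if tag in HIDDEN_TAGS and self._hidden_depth > 0:
            self._hidden_depth -= 1

    def handle_data(self, data):
        if self._hidden_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the readable text of an HTML string, joined by single spaces."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def keyword_counts(html):
    """Count lowercase words in the readable text of an HTML string."""
    words = re.findall(r"[a-z0-9']+", visible_text(html).lower())
    return Counter(words)


if __name__ == "__main__":
    sample = (
        "<html><head><title>Demo</title><style>p {color: red}</style></head>"
        '<body><p>Hello <b>world</b></p><img src="a.png" alt="A cat">'
        '<script>var x = "hidden";</script><p>hello again</p></body></html>'
    )
    print(visible_text(sample))
    print(keyword_counts(sample).most_common(3))
```

For a crawler, the downloaded page body would be passed to `keyword_counts` and the resulting counts compared across sites. With BeautifulSoup installed, `BeautifulSoup(html, "html.parser").get_text()` gives a comparable text dump, though script and style elements should be removed first if their contents are unwanted.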
