Yes that sounds overly complicated. I am trying to mine data from pages on

Question

Asked: May 28, 2026 (2026-05-28T17:39:54+00:00)

Yes that sounds overly complicated.

I am trying to mine data from pages on our intranet. The pages are secure. The connection is refused when I try to get the contents with urllib.urlopen().

So I would like to use Python to open a web browser, load the site, and then click some links that trigger JavaScript pop-ups containing tables of info that I want to collect.

Any suggestions on where to begin?

I know the format of the page. It is something like this:

<div id="list">
    <ul id="list item">
        <li><a onclick="Openpopup('1');">blah</a></li>
    </ul>
    <ul></ul>
    etc

Then a hidden frame becomes visible and the fields in the table within are filled.

<div>
    <table>
       <tr><td><span id="info_i_want">...

1 Answer

Editorial Team · Answer 1 · 2026-05-28T17:39:55+00:00

First off, I suggest working out what the JavaScript supplies to the page and faking that – you’ll have an easier time scraping the page if a browser isn’t involved.

If it’s just JavaScript making an XMLHttpRequest, you can find the URL from which the JavaScript fetches the iframe data and connect directly to that.
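As a rough illustration of connecting directly, here is a minimal sketch using only the standard library. The endpoint URL, the `id` query parameter, and the idea of passing a session cookie are all assumptions for the example; the real values would have to be read from the browser's network inspector while clicking one of the links.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_popup_url(base_url, item_id):
    """Build the URL the page's JavaScript is assumed to request.

    The parameter name "id" is hypothetical; use whatever the real
    request shows in the browser's network inspector.
    """
    return base_url + "?" + urlencode({"id": item_id})


def fetch_popup(base_url, item_id, cookie=None, timeout=10):
    """Fetch the popup data directly, without a browser."""
    request = Request(build_popup_url(base_url, item_id))
    if cookie:
        # Assumption: the intranet accepts a session cookie copied
        # from a logged-in browser session.
        request.add_header("Cookie", cookie)
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)
```

Calling `fetch_popup("https://intranet.example/popup", "1", cookie="...")` would then return the raw response body, provided the server accepts the request; if the connection is still refused, the site probably requires authentication that this sketch does not cover.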

Even so, you may need a library that executes JavaScript (if the reverse engineering is too hard or the page uses challenge tokens). A web-rendering engine such as Gecko or WebKit might be appropriate.

Take a good look at Selenium if you insist on using a true web browser or cannot get the programmatic methods to work.
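If you do go the Selenium route, a sketch along these lines might be a starting point. It assumes the Selenium Python package, Firefox, and a matching driver are installed, that the links sit under `#list`, and that the span with id `info_i_want` lives in the main document rather than in a separate iframe (in which case `driver.switch_to.frame(...)` would be needed first). The function name and its parameters are made up for the example.

```python
def collect_popup_info(url, link_selector="#list a",
                       info_id="info_i_want", timeout=10):
    """Click each matching link and read the text that appears."""
    # Imported lazily so this module can be loaded without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()  # assumption: Firefox and its driver are available
    results = []
    try:
        driver.get(url)
        links = driver.find_elements(By.CSS_SELECTOR, link_selector)
        for link in links:
            link.click()  # runs the onclick handler, e.g. Openpopup('1')
            # Wait until the target element is visible before reading it.
            span = WebDriverWait(driver, timeout).until(
                EC.visibility_of_element_located((By.ID, info_id))
            )
            results.append(span.text)
    finally:
        driver.quit()  # always close the browser, even on errors
    return results
```

One caveat: if the popup stays open between clicks, the wait returns immediately and may read stale text, or the popup may cover the next link. In that case you would need to close the popup, or wait for its contents to change, before moving on.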

Once you’ve gotten the page contents via whatever method, you need an HTML parser (such as sgmllib or [almost] xml.dom; note that sgmllib exists only in Python 2, and html.parser is its standard-library counterpart in Python 3). I suggest a DOM library. Parse the DOM and extract the contents from the appropriate node in the resulting tree.
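For the extraction step, here is a small sketch using the standard library's `html.parser` rather than a full DOM library. It collects the text inside the first element with a given id; the class and function names are invented for the example, and the sample markup is modelled on the snippet in the question.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class TextById(HTMLParser):
    """Collect the text inside the first element with the given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # greater than 0 while inside the target element
        self.done = False   # True once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_text(html, target_id):
    parser = TextById(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


sample = ('<div><table><tr><td><span id="info_i_want">42 widgets</span>'
          '</td></tr></table></div>')
print(extract_text(sample, "info_i_want"))  # prints "42 widgets"
```

This relies on the target element's tags being properly closed; for messy real-world markup, a more forgiving third-party parser may be a better fit.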
