I’m trying to scrape some data off of the FEC.gov website using python for


Asked: May 30, 2026 (2026-05-30T10:15:02+00:00)


I’m trying to scrape some data off of the FEC.gov website using Python for a project of mine. Normally I use Python's mechanize and BeautifulSoup to do the scraping.

I’ve been able to figure out most of the issues, but there is one problem I can’t seem to get around. It seems like the data is streamed into the table and mechanize.Browser() just stops listening.

So here’s the issue:
If you visit http://query.nictusa.com/cgi-bin/can_ind/2011_P80003338/1/A … you get the first 500 contributors whose last name starts with A and who have given money to candidate P80003338 … however, if you use browser.open() on that URL, all you get is the first ~5 rows.

I’m guessing it’s because mechanize isn’t letting the page fully load before the .read() is executed. I tried putting a time.sleep(10) between the .open() and the .read(), but that didn’t make much difference.

And I checked: there’s no JavaScript or AJAX in the website (or at least none is visible when you use ‘view-source’). So I don’t think it’s a JavaScript issue.

Any thoughts or suggestions? I could use Selenium or something similar, but that’s something I’m trying to avoid.

-Will


1 Answer

Editorial Team · Answer 1 · 2026-05-30T10:15:03+00:00

Why not use an HTML parser like lxml with XPath expressions?

I tried

>>> import lxml.html as lh
>>> data = lh.parse('http://query.nictusa.com/cgi-bin/can_ind/2011_P80003338/1/A')
>>> name = data.xpath('/html/body/table[2]/tr[5]/td[1]/a/text()')
>>> name
[' AABY, TRYGVE']
>>> name = data.xpath('//table[2]/*/td[1]/a/text()')
>>> len(name)
500
>>> name[499]
' AHMED, ASHFAQ'
>>>

Similarly, you can write whatever XPath expressions you need to pull out the other columns.
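For anyone who wants to try the XPath approach without hitting the live site, here is a minimal sketch that runs the same kind of expression against a small inline HTML string. The sample markup, the `SAMPLE_HTML` name, and the `contributor_names` helper are made up for illustration; the real page's layout may differ, so treat the XPath as a starting point rather than a guaranteed match.

```python
import lxml.html as lh

# Hypothetical stand-in for the real page: a header table followed by
# a data table whose first column holds a link with the contributor name.
SAMPLE_HTML = """
<html><body>
<table><tr><td>Header table</td></tr></table>
<table>
<tr><td><a href="#"> AABY, TRYGVE</a></td><td>100</td></tr>
<tr><td><a href="#"> AHMED, ASHFAQ</a></td><td>250</td></tr>
</table>
</body></html>
"""


def contributor_names(html_text):
    """Return the link text from the first cell of each row in the second table."""
    doc = lh.fromstring(html_text)
    # Same idea as the expression in the session above, with the row
    # element spelled out as tr instead of the * wildcard.
    raw = doc.xpath('//table[2]/tr/td[1]/a/text()')
    # The names in the session carry a leading space, so strip whitespace.
    return [name.strip() for name in raw]


names = contributor_names(SAMPLE_HTML)
print(names)
```

To run it against the real page, you would pass the downloaded HTML text to `contributor_names` instead of the sample string, or use `lh.parse(url)` as in the session above.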
