I’m using python+mechanize, attempting to scrape a site. If I visit this site with

Question

Asked: June 2, 2026 (2026-06-02T04:21:36+00:00)

I’m using Python and mechanize to scrape a site. When I visit the site with the Links browser, a text-only version of the login page appears, and that is what I’d like my scraper to see. So:

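import mechanize

# Identify as the same Links build that gets the text-only page.
USER_AGENT = "Links (2.3pre1; Linux 2.6.32-5-xen-amd64 x86_64; 80x24)"
mech = mechanize.Browser(factory=mechanize.RobustFactory())
mech.addheaders = [('User-agent', USER_AGENT)]  # replaces mechanize's default header list
mech.set_handle_robots(False)  # do not fetch or obey robots.txt

# First and only request: fetch the login page and save the body.
resp = mech.open(URLS['start'])
fnout("001-login.html", resp.read())
resp.close()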

fnout just dumps the string to a file. Yet, when I open 001-login.html, the entirety of the page is the word “Robot”. Nothing else.

I haven’t made any other requests, so it’s not as if I loaded the page and then failed to load the images or anything like that. This was the first request I made, and I set the User-Agent to exactly what the version of Links that worked with the site sends. What am I doing wrong (besides trying to scrape a site that doesn’t want to be scraped, that is)?

1 Answer

Editorial Team · Answer 1 · 2026-06-02T04:21:39+00:00

Links is probably sending other headers that mechanize is not, or vice versa. Request http://www.reliply.org/tools/requestheaders.php with both Links and mechanize and compare the headers each one sends.
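
Once both header lists are captured, the comparison can be scripted. The following is only a minimal sketch: `diff_headers` is a hypothetical helper, and the two header lists are made-up placeholders (apart from the User-Agent string from the question), not what Links or mechanize actually send. Substitute the values reported by the header-echo page.

```python
def diff_headers(reference, candidate):
    """Compare two lists of (name, value) request headers, ignoring name case.

    Returns a tuple (missing, extra, changed):
      missing - headers present only in `reference`
      extra   - headers present only in `candidate`
      changed - headers in both with different values, name -> (ref, cand)
    """
    ref = {name.lower(): value for name, value in reference}
    cand = {name.lower(): value for name, value in candidate}
    missing = {k: ref[k] for k in ref.keys() - cand.keys()}
    extra = {k: cand[k] for k in cand.keys() - ref.keys()}
    changed = {
        k: (ref[k], cand[k])
        for k in ref.keys() & cand.keys()
        if ref[k] != cand[k]
    }
    return missing, extra, changed


USER_AGENT = "Links (2.3pre1; Linux 2.6.32-5-xen-amd64 x86_64; 80x24)"

# Placeholder captures for illustration only (assumed, not measured).
links_headers = [
    ("User-Agent", USER_AGENT),
    ("Accept", "*/*"),
    ("Accept-Language", "en,*;q=0.1"),
    ("Connection", "keep-alive"),
]
mechanize_headers = [
    ("User-agent", USER_AGENT),
    ("Accept-Encoding", "identity"),
    ("Connection", "close"),
]

missing, extra, changed = diff_headers(links_headers, mechanize_headers)
print("only sent by Links:    ", sorted(missing.items()))
print("only sent by mechanize:", sorted(extra.items()))
print("sent by both, differ:  ", sorted(changed.items()))
```

Each entry reported as missing is a candidate to append to `mech.addheaders` as a `(name, value)` tuple, alongside the User-agent entry already set in the question. If the response is still “Robot” after the headers match, the site is presumably keying on something other than headers.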
