I would like to scrape a web site. It has the following in its

Question

Asked: June 17, 2026 (2026-06-17T09:16:46+00:00)

I would like to scrape a web site. It has the following in its robots.txt file, but I’m not exactly sure what it is they don’t want me to do:

User-agent: *
Disallow: /click

There is no click subdirectory. Or do they not want me to access anything that would normally require clicking (like submitting data via a form)? They sure aren’t making it easy in any case: the main page’s form sends a GET request to a page that sets a cookie, which is then read by a third page.

1 Answer

Editorial Team · Answer 1 · 2026-06-17T09:16:47+00:00

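It means that no bot should crawl any URL whose path starts with the string /click. The rule is a plain prefix match on the URL path, so it has nothing to do with whether a click subdirectory exists or with clicking in a browser.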

For example, the following URLs should be blocked:

example.com/click
example.com/click.html
example.com/click/
example.com/click/foo/bar
example.com/clicker

The following URLs would still be allowed:

example.com/foo/click
example.com/fooclick
example.com/clic
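
As a rough check of the prefix rule above, here is a minimal sketch using Python's standard-library urllib.robotparser. The helper name is_allowed, the user agent string "my-scraper", and the https://example.com host are assumptions made for illustration; only the two robots.txt lines come from the question.

```python
from urllib.robotparser import RobotFileParser

# The two rule lines quoted in the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /click
"""

parser = RobotFileParser()
# parse() accepts the file's lines directly, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, user_agent="my-scraper"):
    """Return True if the rules permit fetching the given path.

    Hypothetical helper: the host and user agent are placeholders.
    """
    return parser.can_fetch(user_agent, "https://example.com" + path)


if __name__ == "__main__":
    paths = [
        "/click", "/click.html", "/click/", "/click/foo/bar", "/clicker",
        "/foo/click", "/fooclick", "/clic",
    ]
    for path in paths:
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Against a live site you would normally call set_url() and read() on the parser instead of passing the lines by hand, and check each URL with can_fetch() before requesting it.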

You can find the original robots.txt specification at http://www.robotstxt.org/wc/robots.html.
