Working on a python scraper/spider and encountered a URL that exceeds the char limit

Asked: June 18, 2026 (2026-06-18T21:55:20+00:00)

While working on a Python scraper/spider, I encountered a URL that exceeds the character limit and raises an IOError. I am using httplib2, and when I attempt to retrieve the URL I receive a "file name too long" error. I prefer to keep all of my projects within my home directory since I am using Dropbox. Is there any way around this issue, or should I just set up my working directory outside of home?

1 Answer

Editorial Team · Answer 1 · 2026-06-18T21:55:21+00:00

The fact that the filename that’s too long starts with '.cache/www.example.com' explains the problem.

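httplib2 can optionally cache the responses to the requests you make. You've enabled caching, and you've given it .cache as the cache directory, so each response is stored in a file whose name is derived from the URL. A very long URL therefore produces a very long filename.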

The easy solution is to put the cache directory somewhere else.

Without seeing your code, I can't say exactly which line to change, but the fix should be trivial: the documentation for FileCache shows that it takes a dir_name as the first parameter.
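As a rough sketch of moving the cache, assuming the scraper builds its own `httplib2.Http` object: the helper names `cache_dir_outside_home` and `make_http` below are hypothetical, and the system temp directory is only one possible location outside the home directory.

```python
import os
import tempfile


def cache_dir_outside_home(name="httplib2-cache"):
    """Create (if needed) and return a cache directory under the system temp dir.

    Hypothetical helper: any directory on a filesystem without the short
    filename limit would do just as well.
    """
    path = os.path.join(tempfile.gettempdir(), name)
    os.makedirs(path, exist_ok=True)
    return path


def make_http(cache_dir=None):
    """Build an Http object that caches in cache_dir, or not at all if None."""
    import httplib2  # third-party; imported lazily so the helper above stays stdlib-only

    # Http accepts a directory name (used as a flat file cache) or None (no caching).
    return httplib2.Http(cache=cache_dir)


# Example use (performs a real request, so it is left commented out):
# http = make_http(cache_dir_outside_home())
# response, content = http.request("http://www.example.com/")
```

Calling `make_http()` with no argument gives an `Http` object with caching turned off entirely.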

Alternatively, you can pass a safe function that generates a filename from the URI, overriding the default. That would allow you to generate filenames that fit within the filename length limit of Ubuntu's encrypted filesystem (about 143 characters with eCryptfs).
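A minimal sketch of such a safe function, using only the standard library. The name `short_safename` and the `MAX_NAME_LEN` value are assumptions for illustration (the value is deliberately a little below the limit mentioned above); the idea is to keep a short readable prefix and append a hash so that distinct URLs still map to distinct files.

```python
import hashlib
import re

# Assumed conservative limit, in characters; adjust for your filesystem.
MAX_NAME_LEN = 140


def short_safename(uri, max_len=MAX_NAME_LEN):
    """Map a URI to a filesystem-safe name no longer than max_len characters."""
    # The SHA-256 hex digest is always 64 ASCII characters and keeps names unique.
    digest = hashlib.sha256(uri.encode("utf-8")).hexdigest()
    # Readable part: drop the scheme, then replace anything that is not an
    # ASCII letter, digit, underscore, dot, or hyphen.
    readable = re.sub(r"^\w+://", "", uri)
    readable = re.sub(r"[^\w.-]", "_", readable, flags=re.ASCII)
    keep = max_len - len(digest) - 1  # leave room for the separator and digest
    return readable[:keep] + "," + digest


# Passing it to httplib2 (requires the third-party package):
# import httplib2
# http = httplib2.Http(cache=httplib2.FileCache(".cache", safe=short_safename))
```

Because the result contains only ASCII characters, its length in bytes equals its length in characters, which matters if the filesystem counts bytes.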

Another option is to create your own object with the same interface as FileCache and pass that to the Http object to use as a cache. For example, you could use tempfile to create random filenames, and store a mapping of URLs to filenames in an anydbm (dbm in Python 3) or sqlite3 database.
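Here is one way that idea might look, assuming the cache interface is the `get`, `set`, and `delete` methods that FileCache exposes, with string keys and bytes values. The class name `MappedFileCache` and the `index.sqlite3` filename are made up for this sketch.

```python
import os
import sqlite3
import tempfile


class MappedFileCache:
    """File cache with short random filenames and a sqlite3 key-to-file index."""

    def __init__(self, dir_name):
        os.makedirs(dir_name, exist_ok=True)
        self.dir_name = dir_name
        self.db = sqlite3.connect(os.path.join(dir_name, "index.sqlite3"))
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS entries "
            "(key TEXT PRIMARY KEY, filename TEXT NOT NULL)"
        )
        self.db.commit()

    def _lookup(self, key):
        row = self.db.execute(
            "SELECT filename FROM entries WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None

    def get(self, key):
        """Return the cached bytes for key, or None if nothing is cached."""
        filename = self._lookup(key)
        if filename is None:
            return None
        try:
            with open(os.path.join(self.dir_name, filename), "rb") as f:
                return f.read()
        except OSError:
            return None

    def set(self, key, value):
        """Store value (bytes) under key, reusing the file if key already exists."""
        filename = self._lookup(key)
        if filename is None:
            # mkstemp picks a short random name, independent of the URL length.
            fd, path = tempfile.mkstemp(dir=self.dir_name, suffix=".cache")
            os.close(fd)
            filename = os.path.basename(path)
            self.db.execute(
                "INSERT INTO entries (key, filename) VALUES (?, ?)", (key, filename)
            )
            self.db.commit()
        with open(os.path.join(self.dir_name, filename), "wb") as f:
            f.write(value)

    def delete(self, key):
        """Remove key and its file; do nothing if key is not cached."""
        filename = self._lookup(key)
        if filename is None:
            return
        self.db.execute("DELETE FROM entries WHERE key = ?", (key,))
        self.db.commit()
        try:
            os.remove(os.path.join(self.dir_name, filename))
        except OSError:
            pass
```

An instance would then be handed to httplib2 as `httplib2.Http(cache=MappedFileCache(".cache"))`. Note that a sqlite3 connection is tied to the thread that created it by default, so a multi-threaded spider would need one cache object per thread or a different index.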

A final alternative is to just turn off caching, of course.
