The Archive Base


Editorial Team
Asked: May 15, 2026

I have a scraper, which queries different websites. Some of them varyingly use Content-Encoding.

I have a scraper that queries different websites, some of which use varying Content-Encodings. Since I’m trying to simulate an AJAX query and need to mimic Mozilla, I need full support. There are multiple HTTP libraries for Python, but none seems complete:

httplib seems pretty low level, more like an HTTP packet sniffer really.

urllib2 is some sort of elaborate hoax. There are a dozen handlers for various web client functions, but mandatory HTTP features like Content-Encoding apparently aren’t among them.

mechanize is nice, though already somewhat overkill for my tasks, but it only supports Content-Encoding ‘gzip’.

httplib2: sounded most promising, but actually fails on ‘deflate’ encoding, because of the disparity of raw deflate and zlib streams.
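The disparity between raw deflate and zlib streams mentioned above can be illustrated with a minimal sketch using only the standard `gzip` and `zlib` modules (Python 3 assumed). The helper name `decode_body` is hypothetical and not part of any of the libraries listed; it tries the zlib-wrapped form first and falls back to raw deflate, which is one common workaround rather than the method any particular library uses.

```python
import gzip
import zlib


def decode_body(body: bytes, content_encoding: str) -> bytes:
    """Decode an HTTP response body according to its Content-Encoding.

    Hypothetical helper for illustration; handles identity, gzip and deflate.
    """
    encoding = (content_encoding or "identity").strip().lower()
    if encoding in ("", "identity"):
        return body
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    if encoding == "deflate":
        try:
            # The HTTP specification defines "deflate" as a zlib-wrapped stream.
            return zlib.decompress(body)
        except zlib.error:
            # Some servers send raw deflate data without the zlib header;
            # a negative wbits value tells zlib to expect no header.
            return zlib.decompress(body, -zlib.MAX_WBITS)
    raise ValueError(f"unsupported Content-Encoding: {content_encoding!r}")
```

The function would be called with the raw response bytes and the value of the `Content-Encoding` response header, whichever HTTP library produced them.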

So are there any other options? I can’t believe I’m expected to reimplement workarounds for the above libraries. And it’s not a good idea to distribute patched versions alongside my application, because packagers might remove them again if the corresponding library is available as a separate distribution package.

I almost don’t dare to say it, but the HTTP functions API in PHP is much nicer. And besides Content-Encoding:*, I might at some point need multipart/form-data too. So, is there a comprehensive third-party library for HTTP retrieval?

1 Answer

  1. Editorial Team
    Added an answer on May 15, 2026 at 8:17 pm

    I would consider either invoking cURL as a child process or using the Python bindings for libcurl.

    From this description cURL seems to support gzip and deflate.
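The child-process option could look roughly like the sketch below. It assumes Python 3.7 or later and a `curl` executable on the `PATH`; the helper names `build_curl_command` and `fetch` are hypothetical. The `--compressed` option asks curl to request a compressed response and decompress it itself, so the script receives the decoded body.

```python
import subprocess


def build_curl_command(url, extra_headers=None):
    """Build the argument list for a curl call (hypothetical helper)."""
    cmd = ["curl", "--silent", "--show-error", "--location", "--compressed"]
    for name, value in (extra_headers or {}).items():
        # Each header is passed as a separate --header option.
        cmd += ["--header", f"{name}: {value}"]
    cmd.append(url)
    return cmd


def fetch(url, extra_headers=None, timeout=30):
    """Run curl as a child process and return the decoded body as bytes."""
    result = subprocess.run(
        build_curl_command(url, extra_headers),
        capture_output=True,  # collect stdout and stderr
        check=True,           # raise CalledProcessError on a non-zero exit
        timeout=timeout,
    )
    return result.stdout
```

To mimic an AJAX request, headers such as `X-Requested-With: XMLHttpRequest` or a browser-like `User-Agent` could be passed through `extra_headers`; which headers a given site expects is something to determine per site.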

