The Archive Base

Editorial Team
Asked: May 13, 2026 (2026-05-13T23:24:52+00:00)

I have a web scraping script that fetches new data once a minute, but over the course of a couple of days its memory usage grows to 200 MB or more. I found out this is because mechanize keeps an unbounded browser history for the .back() function to use.

I looked through the docstrings and found the clear_history() method of the Browser class, which I now call on every refresh, but memory usage still grows by 2-3 MB on each page refresh.

Edit: Memory kept growing the same way after I started calling clear_history(), until it reached about 30 MB; then it dropped back down to 10 MB or so (the baseline my program starts up with). Is there any way to force this behavior on a more regular basis?

How do I keep mechanize from storing all of this info? I don't need to keep any of it, and I'd like to keep my Python script below 15 MB of memory usage.


1 Answer

  1. Editorial Team
    Added an answer on May 13, 2026 at 11:24 pm

    You can pass a history=... argument when you instantiate the Browser. The default value is None, which means the browser instantiates the History class itself (to allow back and reload). The simplest approach (which will raise an AttributeError if you ever do call back or reload) is:

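    import mechanize

    class NoHistory(object):
      """History stand-in that records nothing."""
      def add(self, *a, **k): pass  # called after each page load; discard it
      def clear(self): pass         # nothing is stored, so nothing to clear
      def close(self): pass         # no-op; Browser.close() may call this

    b = mechanize.Browser(history=NoHistory())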
    

    A cleaner approach would implement the other methods in NoHistory so that erroneous use of the browser's back or reload raises a clearer exception, but this simple one should suffice otherwise.
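A minimal sketch of that cleaner variant might look like the following. The names StrictNoHistory and HistoryDisabledError are made up for illustration, and the sketch assumes the browser reaches the history object through a back(...) method; the exact arguments mechanize passes are not relied on here, which is why the method accepts anything.

```python
class HistoryDisabledError(RuntimeError):
    """Hypothetical exception: raised when code tries to use the disabled history."""


class StrictNoHistory(object):
    """History stand-in that stores nothing and fails loudly on navigation."""

    def add(self, *args, **kwargs):
        # Called after each page load; drop the request/response pair.
        pass

    def clear(self):
        # Nothing is ever stored, so there is nothing to clear.
        pass

    def close(self):
        # No resources are held; present only so a close() call succeeds.
        pass

    def back(self, *args, **kwargs):
        # Assumption: the browser delegates back() to the history object.
        raise HistoryDisabledError(
            "history is disabled for this browser; back() is not available"
        )
```

It would be passed in the same way as the simple version, e.g. mechanize.Browser(history=StrictNoHistory()), so a stray call to back() fails with a message that says what went wrong instead of a bare AttributeError.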

    Note that this is an elegant (though not well documented ;-) use of the dependency injection design pattern. In a (bleah) "monkeypatching" world, the client code would be expected to overwrite b._history after the browser is instantiated, but with dependency injection you just pass in the "history" object you want to use. I've often maintained that Dependency Injection may be the most important design pattern that wasn't in the "Gang of Four" book! ;-)
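To make the contrast concrete, here is a toy sketch of the two styles. Client, RecordingHistory and NullHistory are hypothetical stand-ins written for this example, not mechanize classes; only the shape of the constructor (a history argument defaulting to None) mirrors what the answer describes.

```python
class RecordingHistory(object):
    """Default collaborator: remembers every visited URL."""

    def __init__(self):
        self.items = []

    def add(self, url):
        self.items.append(url)


class NullHistory(object):
    """Replacement collaborator: remembers nothing."""

    def add(self, url):
        pass


class Client(object):
    """Toy browser-like class that accepts its history object as an argument."""

    def __init__(self, history=None):
        # None means "build the default", as the answer describes for Browser.
        self._history = RecordingHistory() if history is None else history

    def visit(self, url):
        self._history.add(url)
        return url


# Dependency injection: choose the collaborator through the public constructor.
injected = Client(history=NullHistory())

# Monkeypatching: reach into a private attribute after construction.
patched = Client()
patched._history = NullHistory()
```

Both objects end up behaving the same, but the injected one only uses the documented constructor, so it keeps working if the private attribute is ever renamed.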

© 2021 The Archive Base. All Rights Reserved