Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 8959843
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 15, 20262026-06-15T15:32:12+00:00 2026-06-15T15:32:12+00:00

I’m implementing the Viola Jones algorithm for face detection. I’m having issues with the

  • 0

I’m implementing the Viola Jones algorithm for face detection. I’m having issues with the first part of the AdaBoost learning part of the algorithm.

The original paper states

The weak classifier selection algorithm proceeds as follows. For each feature, the examples are sorted based on feature value.

I’m currently working with a relatively small training set of 2000 positive images and 1000 negative images. The paper describes having data sets as large as 10,000.

The main purpose of AdaBoost is to decrease the number of features in a 24×24 window, which totals 160,000+. The algorithm works on these features and selects the best ones.

The paper describes that for each feature, it calculates its value on each image, and then sorts them based on value. What this means is I need to make a container for each feature and store the values of all the samples.

My problem is my program runs out of memory after evaluating only 10,000 of the features (only 6% of them). The overall size of all the containers will end up being 160,000*3000, which is in the billions. How am I supposed to implement this algorithm without running out of memory? I’ve increased the heap size, and it got me from 3% to 6%, I don’t think increasing it much more will work.

The paper implies that these sorted values are needed throughout the algorithm, so I can’t discard them after each feature.

Here’s my code so far

public static List<WeakClassifier> train(List<Image> positiveSamples, List<Image> negativeSamples, List<Feature> allFeatures, int T) {
    List<WeakClassifier> solution = new LinkedList<WeakClassifier>();

    // Initialize Weights for each sample, whether positive or negative
    float[] positiveWeights = new float[positiveSamples.size()];
    float[] negativeWeights = new float[negativeSamples.size()];

    float initialPositiveWeight = 0.5f / positiveWeights.length;
    float initialNegativeWeight = 0.5f / negativeWeights.length;

    for (int i = 0; i < positiveWeights.length; ++i) {
        positiveWeights[i] = initialPositiveWeight;
    }
    for (int i = 0; i < negativeWeights.length; ++i) {
        negativeWeights[i] = initialNegativeWeight;
    }

    // Each feature's value for each image
    List<List<FeatureValue>> featureValues = new LinkedList<List<FeatureValue>>();

    // For each feature get the values for each image, and sort them based off the value
    for (Feature feature : allFeatures) {
        List<FeatureValue> thisFeaturesValues = new LinkedList<FeatureValue>();

        int index = 0;
        for (Image positive : positiveSamples) {
            int value = positive.applyFeature(feature);
            thisFeaturesValues.add(new FeatureValue(index, value, true));
            ++index;
        }
        index = 0;
        for (Image negative : negativeSamples) {
            int value = negative.applyFeature(feature);
            thisFeaturesValues.add(new FeatureValue(index, value, false));
            ++index;
        }

        Collections.sort(thisFeaturesValues);

        // Add this feature to the list
        featureValues.add(thisFeaturesValues);
        ++currentFeature;
    }

    ... rest of code
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-15T15:32:14+00:00Added an answer on June 15, 2026 at 3:32 pm

    This should be the pseudocode for the selection of one of the weak classifiers:

    normalize the per-example weights  // one float per example
    
    for feature j from 1 to 45,396:
      // Training a weak classifier based on feature j.
      - Extract the feature's response from each training image (1 float per example)
      // This threshold selection and error computation is where sorting the examples
      // by feature response comes in.
      - Choose a threshold to best separate the positive from negative examples
      - Record the threshold and weighted error for this weak classifier
    
    choose the best feature j and threshold (lowest error)
    
    update the per-example weights
    

    Nowhere do you need to store billions of features. Just extract the feature responses on the fly on each iteration. You’re using integral images, so extraction is fast. That is the main memory bottleneck, and it’s not that much, just one integer for every pixel in every image… basically the same amount of storage as your images required.

    Even if you did just compute all the feature responses for all images and save them all so you don’t have to do that every iteration, that still only:

    • 45396 * 3000 * 4 bytes =~ 520 MB, or if you’re convinced there are 160000 possible features,
    • 160000 * 3000 * 4 bytes =~ 1.78 GB, or if you use 10000 training images,
    • 160000 * 10000 * 4 bytes =~ 5.96 GB

    Basically, you shouldn’t be running out of memory even if you do store all the feature values.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

link Im having trouble converting the html entites into html characters, (&# 8217;) i
We're building an app, our first using Rails 3, and we're having to build
I have a string like this: La Torre Eiffel paragonata all&#8217;Everest What PHP function
I'm parsing an RSS feed that has an &#8217; in it. SimpleXML turns this
I'm trying to convert HTML to plain text. I get many &\#8217; &\#8220; etc.
I'm having trouble keeping the paragraph square between the quote marks. In firefox the
I'm making a simple page using Google Maps API 3. My first. One marker
Let's say I'm outputting a post title and in our database, it's Hello Y&#8217;all
I'm interested in microtypography issues on the web. I want a tool to fix:
I have a .ini file as follows: [playlist] numberofentries=2 File1=http://87.230.82.17:80 Title1=(#1 - 365/1400) Example

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.