The Archive Base

Editorial Team
Asked: May 31, 2026


I am implementing a feed-forward neural network that trains using backpropagation. When I output the error after each test case it learns, I notice that after a number of epochs it learns certain test cases very well but others very badly; that is, certain test cases have very low error while others have very high error.

Essentially, after a few epochs the mean squared error (MSE) stagnates in the following pattern (each line is the MSE after a single test case):

0.6666666657496451
0.6666666657514261
1.5039854423139616E-10
1.4871467103001578E-10
1.5192940136144856E-10
1.4951558809679557E-10
0.6666521719715195
1.514803547256445E-10
1.5231135866323182E-10
0.6666666657507451
1.539071732985272E-10

What could be causing this?

Initially I thought the cases with high error might just be outliers, but as the pattern shows, there are too many of them. Could it be that my learner has reached a local minimum and needs some momentum to get out of it?

1 Answer

  1. Editorial Team
    Added an answer on May 31, 2026 at 6:20 pm

    My answer is directed at a possible solution to the “uneven” progress in training your classifier. As for why you are seeing that behavior, I defer. In particular, I’m reluctant to attribute causes to artifacts I observe mid-training: is it the data? The MLP implementation? The tunable configuration I selected? The fact is that it is the interaction of your classifier with the data that caused this observation, rather than some inherent feature of either one.

    It’s not uncommon for a classifier to learn certain input vectors quite well and quite quickly (i.e., (observed - predicted)^2 becomes very small after only a few cycles/epochs) and for the same classifier to repeatedly fail, and fail to improve, on other input vectors.

    To successfully complete the training of your Classifier, Boosting is the textbook answer for the problem described in your Question.

    Before going any further though, a small mistake in your configuration/setup could also account for the behavior you observed.

    In particular, verify these items in your config:

    • Are your input vectors properly coded, e.g., so that their range is [-1, 1]?

    • Have you correctly coded your response variables (i.e., 1-of-C coding)?

    • Have you selected a reasonable initial learning rate and momentum term? And have you attempted training with learning rate values adjusted on either side of that initial learning rate?
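The first two checks can be illustrated with a minimal sketch. The helper names `scale_to_unit_range` and `one_of_c` are hypothetical and not from any library, and plain min-max scaling is only one of several reasonable ways to reach the [-1, 1] range.

```python
def scale_to_unit_range(column):
    """Linearly map a list of numbers onto [-1, 1] (min-max scaling)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant feature carries no information; map it to 0.
        return [0.0 for _ in column]
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in column]


def one_of_c(label, classes):
    """Return the 1-of-C (one-hot) target vector for `label`."""
    if label not in classes:
        raise ValueError(f"unknown class label: {label!r}")
    return [1.0 if c == label else 0.0 for c in classes]


scaled = scale_to_unit_range([2.0, 4.0, 6.0])   # [-1.0, 0.0, 1.0]
target = one_of_c("b", ["a", "b", "c"])         # [0.0, 1.0, 0.0]
```

Each feature column would be scaled independently, using the minimum and maximum taken from the training set only.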

    In any event, assuming those configuration and setup issues are OK, here are the relevant implementation details. Boosting (which is, strictly, a technique in which multiple classifiers are combined) works like this:

    After some number of epochs, examine the results (as you have been doing). Those data vectors that the classifier has failed to learn are assigned a weighting factor greater than 1 to increase their error; similarly, those data vectors that the classifier learned well are also assigned a weighting factor, but here the value is less than 1, so that the importance of their training error is reduced.

    So, for instance, suppose that at the end of the first epoch (one iteration through all data vectors comprising your training data set) your total error is 100; in other words, the squared error, (observed value - predicted value)^2, summed over all data vectors in the training set, is 100.

    These are two MSE values from among those listed in your question:

    0.667        # poorly learned input vector => assign error multiplier > 1 
    1.5e-10      # well-learned input vector => assign error multiplier < 1  
    

    In Boosting, you would find the input vectors that correspond to these two error measurements and associate each with an error weight; this weight will be greater than one in the first case and less than one in the second. Let’s suppose you assign error weights of 1.3 and 0.7, respectively. Further suppose that after the next epoch, your classifier has not improved with respect to learning the first of these two input vectors, i.e., it returns the same predicted values as it did in the last epoch. For this iteration/epoch, however, the contribution to total error from that input vector is not 0.67 but 1.3 × 0.67, or approximately 0.87.
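The arithmetic above can be sketched as follows. The multipliers 1.3 and 0.7 come from the example in the text; the cut-off `threshold` that separates "poorly learned" from "well learned" is an assumption made here for illustration, as is the helper name `assign_error_weights`.

```python
def assign_error_weights(per_vector_error, threshold=0.01, up=1.3, down=0.7):
    """Give poorly learned vectors a multiplier > 1 and well-learned ones < 1.

    `threshold` is a hypothetical cut-off; choose it for your own data.
    """
    return [up if e > threshold else down for e in per_vector_error]


errors = [0.667, 1.5e-10]                 # the two MSE values quoted above
weights = assign_error_weights(errors)    # [1.3, 0.7]

# Contribution of each vector to the total error in the next epoch,
# assuming the classifier's predictions have not changed.
weighted = [w * e for w, e in zip(weights, errors)]
print(round(weighted[0], 2))              # 0.87
```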

    What is the effect of this increased error on training progress?

    Larger error means a steeper gradient and therefore for the next iteration, a larger adjustment to the appropriate weights comprising the weight matrices–in other words, more rapid training focused at this particular input vector.
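To make the gradient claim concrete, here is a minimal sketch using a single linear unit rather than a full MLP (a simplifying assumption). Because the error weight is a constant factor in the loss, it scales the gradient by the same factor, and the same holds for the output deltas in backpropagation. The function name `gradient_step` and the learning rate are illustrative choices.

```python
def gradient_step(coeffs, x, target, error_weight=1.0, learning_rate=0.1):
    """One gradient-descent step on error_weight * (target - y)**2,
    where y is the dot product of `coeffs` and `x` (a single linear unit)."""
    y = sum(c * xi for c, xi in zip(coeffs, x))
    residual = target - y
    # d/dc_j of error_weight * (target - y)**2 = -2 * error_weight * residual * x_j
    grads = [-2.0 * error_weight * residual * xi for xi in x]
    return [c - learning_rate * g for c, g in zip(coeffs, grads)]


start = [0.0, 0.0]
x, target = [1.0, 0.5], 1.0

plain = gradient_step(start, x, target)                      # about [0.2, 0.1]
boosted = gradient_step(start, x, target, error_weight=1.3)  # about [0.26, 0.13]
```

With an error weight of 1.3, each coefficient moves 1.3 times as far in the same direction as it would with the implicit weight of 1.0.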

    You might imagine that each of these data vectors has an implicit error weight of 1.0. Boosting just increases that error weight (for vectors that the classifier is unable to learn) and decreases this weight for vectors that it learns well.

    What I have just described is the idea behind AdaBoost, which is probably the best-known implementation of Boosting. For guidance and even code for language-specific implementations, have a look at boosting.com (seriously). That site is no longer maintained, though, so here are a couple more excellent resources that I have relied on and can recommend highly. The first is an academic site in the form of an annotated bibliography (including links to the papers discussed on the site). The first paper listed on that site (with a link to the PDF), The boosting approach to machine learning: An overview, is an excellent overview and an efficient way to acquire a working knowledge of this family of techniques.

    There is also an excellent video tutorial on Boosting and AdaBoost at videolectures.net.
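For comparison with the hand-picked multipliers used in the example above, this is a sketch of the example-weight update from the standard discrete (binary) AdaBoost formulation, where the multipliers are derived from the current classifier's weighted error rather than chosen manually. The function name `adaboost_reweight` is illustrative, and the sketch assumes the weights sum to 1 and that the weighted error lies strictly between 0 and 0.5.

```python
import math


def adaboost_reweight(weights, misclassified):
    """One round of discrete AdaBoost example reweighting.

    weights       -- current example weights, assumed to sum to 1
    misclassified -- booleans, True where the current classifier was wrong
    Returns (alpha, new_weights), with new_weights renormalized to sum to 1.
    """
    eps = sum(w for w, m in zip(weights, misclassified) if m)  # weighted error
    if not 0.0 < eps < 0.5:
        raise ValueError("this sketch assumes a weighted error in (0, 0.5)")
    alpha = 0.5 * math.log((1.0 - eps) / eps)  # vote of the current classifier
    raw = [w * math.exp(alpha if m else -alpha)
           for w, m in zip(weights, misclassified)]
    total = sum(raw)
    return alpha, [r / total for r in raw]


alpha, new_weights = adaboost_reweight([0.25] * 4, [True, False, False, False])
# The misclassified example now carries half of the total weight.
```

In full AdaBoost, a new classifier is then trained on the reweighted examples, and the final prediction combines all classifiers, each weighted by its own alpha.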

© 2021 The Archive Base. All Rights Reserved