I need to train a network to multiply or add 2 inputs, but it doesn’t approximate well for all points, even after 20000
iterations. More specifically, I train it on the whole dataset and it approximates well for the last points, but it
doesn’t seem to get any better for the first points. I normalize the data so that it lies between -0.8 and 0.8. The
network itself consists of 2 inputs, 3 hidden neurons and 1 output neuron. I also set the network’s learning rate to 0.25,
and use tanh(x) as the activation function.
It approximates really well for the points that are trained last in the dataset, but it can’t seem to approximate the
first points well. I wonder what is keeping it from adjusting well: is it the topology I am using, or
something else?
Also, how many neurons are appropriate in the hidden layer for this network?
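For reference, the scaling to the range -0.8 to 0.8 mentioned above might look like the following min-max rescale. The question does not say which normalization is actually used, so this is only an assumed sketch, and `scale_to_range` is a hypothetical helper name.

```python
def scale_to_range(values, lo=-0.8, hi=0.8):
    """Min-max rescale a list of numbers into [lo, hi].

    Assumption: the question does not specify the method; a plain
    min-max rescale is used here for illustration only.
    """
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    if span == 0:
        # All values identical: map everything to the middle of the range.
        return [(lo + hi) / 2 for _ in values]
    return [lo + (hi - lo) * (v - vmin) / span for v in values]


if __name__ == "__main__":
    print(scale_to_range([0, 5, 10]))
```

The same `vmin` and `vmax` would have to be reused to map the network's outputs back to the original scale.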
Think about what would happen if you replaced your tanh(x) threshold function with a linear function of x (call it a.x) and treat a as the sole learning parameter in each neuron. That’s effectively what your network will be optimising towards; it’s an approximation of the zero-crossing of the tanh function.

Now, what happens when you layer neurons of this linear type? You multiply the output of each neuron as the pulse goes from input to output. You’re trying to approximate addition with a set of multiplications. That, as they say, does not compute.
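A minimal sketch of the argument above, under stated assumptions: each neuron is reduced to a single slope a with no bias, and a signal follows a single path through the layers. The names `linear_neuron` and `chain` and the slope values are illustrative only. It shows that tanh(x) stays close to x near zero, and that stacking neurons of the form a.x collapses into one multiplication by the product of the slopes.

```python
import math

# Near its zero-crossing, tanh(x) is close to the line y = x.
for x in (0.01, 0.1, 0.5):
    print(f"x={x}: tanh(x)={math.tanh(x):.6f}")


def linear_neuron(a, x):
    """Hypothetical linearised neuron: one learnable slope a, no bias."""
    return a * x


def chain(slopes, x):
    """Pass x through a stack of linear neurons, one per layer."""
    for a in slopes:
        x = linear_neuron(a, x)
    return x


slopes = [0.5, 2.0, 3.0]  # illustrative values, one slope per layer
x = 0.4

layered = chain(slopes, x)
collapsed = math.prod(slopes) * x  # a single multiplication by the product

print(layered, collapsed)
```

Both printed values agree up to floating-point rounding, which is the sense in which layering these neurons only multiplies the signal along its path.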