I wrote a neural network object in Python that has a cost function and computes its gradients via backpropagation. I see a bunch of optimization functions here, but I have no idea how to use them. I’m also having a hard time finding any example code to learn from.
Clearly I need to somehow tell it what parameters I’m trying to change, the cost function I’m trying to minimize, and then the gradient calculated by backprop. How do I tell, say, fmin_cg what’s what?
Bonus question: where can I learn about the differences between the various algorithms and when to use each one?
===== OK, update =====
This is what I have atm:
def train(self, x, y_vals, iters = 400):
    t0 = concatenate((self.hid_t.reshape(-1), self.out_t.reshape(-1)), 1)
    self.forward_prop(x, t0)
    c = lambda v: self.cost(x, y_vals, v)
    g = lambda v: self.back_prop(y_vals, v)
    t_best = fmin_cg(c, t0, g, disp=True, maxiter=iters)
    self.hid_t = reshape(t_best[:,:(hid_n * (in_n+1))], (hid_n, in_n+1))
    self.out_t = reshape(t_best[:,(hid_n * (in_n+1)):], (out_n, hid_n+1))
And, this is the error it’s throwing:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "netset.py", line 27, in <module>
    net.train(x,y)
  File "neuralnet.py", line 60, in train
    t_best = fmin_cg(c, t0, g, disp=True, maxiter=iters)
  File "/usr/local/lib/python2.7/dist-packages/scipy/optimize/optimize.py", line 952, in fmin_cg
    res = _minimize_cg(f, x0, args, fprime, callback=callback, **opts)
  File "/usr/local/lib/python2.7/dist-packages/scipy/optimize/optimize.py", line 1017, in _minimize_cg
    deltak = numpy.dot(gfk, gfk)
ValueError: matrices are not aligned
…Halp!
I have never used fmin_cg, but I assume v is your weight vector. Reading the documentation, I could not find an error in your code. Searching for the error message, however, turned up this question: "matrices are not aligned Error: Python SciPy fmin_bfgs".
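One plausible reading of the traceback: it fails at numpy.dot(gfk, gfk), and that call raises this error when the gradient is a 2-D row or column (for example a 1×N numpy matrix) rather than a flat 1-D array. The sketch below is only an illustration under that assumption. The layer sizes, the pack/unpack helpers and the quadratic stand-in objective are all hypothetical, not the asker's network; the point is that x0 and the gradient are both 1-D and that slicing uses a single index.

```python
import numpy as np
from scipy.optimize import fmin_cg

# Hypothetical layer sizes, chosen only for illustration.
in_n, hid_n, out_n = 3, 4, 2

def pack(hid_t, out_t):
    """Flatten both weight matrices into one 1-D array (also works for np.matrix)."""
    return np.concatenate((np.asarray(hid_t).ravel(), np.asarray(out_t).ravel()))

def unpack(v):
    """Split a 1-D parameter vector back into the two weight matrices."""
    split = hid_n * (in_n + 1)
    hid_t = v[:split].reshape(hid_n, in_n + 1)
    out_t = v[split:].reshape(out_n, hid_n + 1)
    return hid_t, out_t

# Stand-in objective: a quadratic bowl whose minimum is at `target`.
n_params = hid_n * (in_n + 1) + out_n * (hid_n + 1)
target = np.linspace(-1.0, 1.0, n_params)

def cost(v):
    return 0.5 * np.sum((v - target) ** 2)

def grad(v):
    # Must be 1-D with the same shape as v, otherwise numpy.dot(g, g) fails.
    return v - target

rng = np.random.default_rng(0)
t0 = pack(rng.normal(size=(hid_n, in_n + 1)), rng.normal(size=(out_n, hid_n + 1)))

t_best = fmin_cg(cost, t0, fprime=grad, maxiter=400, disp=False)
hid_t, out_t = unpack(t_best)
```

In the asker's train method, the equivalent change would be to build t0 from flattened arrays and to index t_best with one slice instead of t_best[:, ...], assuming the 2-D shape really is the cause.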
In addition, I don't think it is guaranteed that g(v) is always evaluated right after c(v) for the same v. Your backpropagation function should therefore forward-propagate x again for the v it receives before computing the gradient.
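A minimal sketch of that idea, using a hypothetical one-layer sigmoid model in place of the asker's network: both cost and grad run their own forward pass from the v they are given, so neither depends on state cached by the other. The function names and the toy data are assumptions for illustration; check_grad is used only to compare the analytic gradient against finite differences.

```python
import numpy as np
from scipy.optimize import fmin_cg, check_grad

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, v):
    """Forward pass of the stand-in model; depends only on its arguments."""
    return sigmoid(x @ v)

def cost(v, x, y):
    a = forward(x, v)
    return 0.5 * np.mean((a - y) ** 2)

def grad(v, x, y):
    a = forward(x, v)                 # recompute activations for *this* v
    delta = (a - y) * a * (1.0 - a)   # error times sigmoid derivative
    return x.T @ delta / len(y)       # 1-D gradient, same shape as v

# Toy data, made up for the example.
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 3))
y = (x[:, 0] + 0.5 * rng.normal(size=50) > 0).astype(float)
v0 = np.zeros(3)

c = lambda v: cost(v, x, y)
g = lambda v: grad(v, x, y)

# Difference between the analytic and finite-difference gradients at a random point.
grad_error = check_grad(c, g, rng.normal(size=3))

v_best = fmin_cg(c, v0, fprime=g, maxiter=400, disp=False)
```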
Alternatively, you can pass a single function that returns the cost and the gradient as a tuple, which avoids the second forward propagation, as Issam Laradji mentioned.
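One way to do this in SciPy is scipy.optimize.minimize with jac=True, which tells it that the objective returns a (cost, gradient) pair; fmin_cg itself takes the two functions separately. The sketch below again uses a hypothetical one-layer sigmoid model and made-up data, so treat cost_and_grad as an illustration rather than the asker's actual backprop.

```python
import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_grad(v, x, y):
    """One forward pass, then both the cost and its gradient."""
    a = sigmoid(x @ v)
    err = a - y
    cost = 0.5 * np.mean(err ** 2)
    grad = x.T @ (err * a * (1.0 - a)) / len(y)
    return cost, grad

# Toy data, made up for the example.
rng = np.random.default_rng(1)
x = rng.normal(size=(50, 3))
y = (x[:, 0] + 0.5 * rng.normal(size=50) > 0).astype(float)
v0 = np.zeros(3)

# jac=True: the objective returns (cost, gradient) in a single call.
res = minimize(cost_and_grad, v0, args=(x, y), jac=True,
               method="CG", options={"maxiter": 400})
v_best = res.x
```

The optimized weights are in res.x and the final cost in res.fun.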
Good articles about optimization algorithms for artificial neural networks are:
I can recommend Levenberg-Marquardt. This algorithm works really well. Unfortunately, every iteration step has cubic complexity, O(n^3).
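For trying Levenberg-Marquardt from Python, SciPy exposes it through scipy.optimize.least_squares with method="lm". Note that it minimizes a sum of squared residuals, so it needs a function returning the per-sample residual vector rather than a scalar cost, and it requires at least as many residuals as parameters. The model and data below are a hypothetical stand-in, not a full neural network.

```python
import numpy as np
from scipy.optimize import least_squares

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def residuals(v, x, y):
    """Per-sample residuals; least_squares minimizes half their sum of squares."""
    return sigmoid(x @ v) - y

# Noise-free toy data generated from known weights, made up for the example.
rng = np.random.default_rng(2)
x = rng.normal(size=(50, 3))
v_true = np.array([1.0, -2.0, 0.5])
y = sigmoid(x @ v_true)

# method="lm" selects Levenberg-Marquardt; the Jacobian is estimated
# by finite differences here because none is supplied.
res = least_squares(residuals, np.zeros(3), args=(x, y), method="lm")
v_best = res.x
```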