I’m currently iterating through a very large data set (~85 GB, ~600M lines) and using Newton-Raphson to compute a new parameter for each line. Right now my code is extremely slow. Any tips on how to speed it up? The methods from BSCallClass & BSPutClass are closed-form, so there’s nothing really to speed up there. Thanks.
class NewtonRaphson:
    def __init__(self, theObject):
        self.theObject = theObject

    def solve(self, Target, Start, Tolerance, maxiter=500):
        # Newton-Raphson update: x <- x + (Target - Price(x)) / Vega(x)
        y = self.theObject.Price(Start)
        x = Start
        i = 0
        while abs(y - Target) > Tolerance:
            i += 1
            d = self.theObject.Vega(x)
            x += (Target - y) / d
            y = self.theObject.Price(x)
            if i > maxiter:
                # No convergence: return NaN (the bare name `nan` was undefined)
                x = float("nan")
                break
        return x
def main():
    # `a` is the pandas DataFrame holding the option data
    for row in a.iterrows():
        # row is an (index, Series) pair
        print(row[1]["X.1"])
        T = (row[1]["X.7"] - row[1]["X.8"]).days
        Spot = row[1]["X.2"]
        Strike = row[1]["X.9"]
        MktPrice = abs(row[1]["X.10"] - row[1]["X.11"]) / 2
        CPflag = row[1]["X.6"]
        if CPflag == 'call':
            option = BSCallClass(0, 0, T, Spot, Strike)
        elif CPflag == 'put':
            option = BSPutClass(0, 0, T, Spot, Strike)
        # Assign through .loc; chained indexing (a["X.15"][...]) can write to a copy
        a.loc[row[0], "X.15"] = NewtonRaphson(option).solve(MktPrice, .05, .0001)
EDIT:
For those curious, I ended up speeding this entire process significantly by using the scipy suggestion, as well as using the multiprocessing module.
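For anyone wanting to try the multiprocessing part, here is a minimal sketch of the general pattern, not my exact code. It assumes the rows can be split into independent chunks; `solve_row`, `solve_chunk`, `chunked` and `solve_parallel` are hypothetical names, and the toy Newton iteration inside `solve_row` is only a placeholder for the real root finder.

```python
from multiprocessing import Pool


def solve_row(row):
    """Hypothetical per-row solver: Newton steps on f(x) = x**2 - target.

    Replace the body with the real root finder for one data row.
    """
    target, start = row
    x = start
    for _ in range(50):
        x -= (x * x - target) / (2.0 * x)
    return x


def solve_chunk(rows):
    """Solve a whole chunk in one task to keep inter-process overhead low."""
    return [solve_row(r) for r in rows]


def chunked(rows, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def solve_parallel(rows, processes=4, chunk_size=1000):
    """Distribute chunks over worker processes; results keep the input order."""
    pool = Pool(processes)
    try:
        parts = pool.map(solve_chunk, chunked(rows, chunk_size))
    finally:
        pool.close()
        pool.join()
    return [x for part in parts for x in part]
```

Calling `solve_parallel(rows)` from under an `if __name__ == "__main__":` guard runs the chunks in worker processes. Sending whole chunks rather than single rows matters, because each task has a pickling cost that would otherwise dominate a cheap solve.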
Don’t code your own Newton-Raphson method in Python. You’ll get better performance using one of the root finders in scipy.optimize such as brentq or newton.
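As a rough sketch of what that could look like for a call option: the pricing function below is a plain Black-Scholes formula that I wrote as a stand-in, since I don't know how BSCallClass is implemented. The names `norm_cdf`, `bs_call_price` and `implied_vol`, and the bracket [1e-6, 5.0] for the volatility, are assumptions.

```python
from math import erf, exp, log, sqrt

from scipy.optimize import brentq


def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def bs_call_price(sigma, spot, strike, t, r=0.0):
    """Black-Scholes price of a European call.

    Hypothetical stand-in for BSCallClass.Price; t is in years.
    """
    d1 = (log(spot / strike) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return spot * norm_cdf(d1) - strike * exp(-r * t) * norm_cdf(d2)


def implied_vol(target, spot, strike, t, r=0.0, lo=1e-6, hi=5.0):
    """Find sigma in [lo, hi] with bs_call_price(sigma) == target.

    Returns NaN when the target price is not bracketed, because brentq
    requires a sign change between the two endpoints.
    """
    def f(sigma):
        return bs_call_price(sigma, spot, strike, t, r) - target

    if f(lo) * f(hi) > 0:
        return float("nan")
    return brentq(f, lo, hi, xtol=1e-8)
```

Unlike Newton-Raphson, brentq does not need Vega and cannot diverge once the root is bracketed, which is convenient when some rows hold bad prices.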
(Presumably, if you have pandas, you’d also install scipy.)
Back of the envelope calculation:
Making 600M calls to brentq should be manageable on standard hardware:
So if each call to optimize.brentq takes 4.86 microseconds, 600M calls will take about 4.86 * 600 ~ 3000 seconds ~ 1 hour.
newton may be slower, but still manageable:
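To redo this kind of estimate on your own machine, a sketch along these lines should work. The objective `f` is a toy function with a root at x = 2, not the option pricer, and `per_call_seconds` is a hypothetical helper. The timings depend on hardware and on how expensive your real function is, so treat the printed numbers as rough guides only.

```python
import timeit

from scipy import optimize


def f(x):
    """Toy objective with a root at x = 2 (an assumption, not the pricer)."""
    return x ** 2 - 4.0


def fprime(x):
    """Derivative of f, used by the Newton-Raphson variant of newton."""
    return 2.0 * x


def per_call_seconds(func, number=2000):
    """Average wall-clock time of one call to func, in seconds."""
    return timeit.timeit(func, number=number) / number


brentq_time = per_call_seconds(lambda: optimize.brentq(f, 0.0, 5.0))
newton_time = per_call_seconds(lambda: optimize.newton(f, 1.0, fprime=fprime))

n_rows = 600e6  # row count from the question
for name, secs in (("brentq", brentq_time), ("newton", newton_time)):
    # Scale the per-call time up to the full data set
    print("%s: %.2f us per call, about %.1f hours for 600M rows"
          % (name, secs * 1e6, secs * n_rows / 3600.0))
```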