I’m currently iterating through a very large set of data ~85GB (~600M lines) and

Question

Asked: June 17, 2026 (2026-06-17T12:17:18+00:00)

I’m currently iterating through a very large data set, ~85 GB (~600M lines), and using Newton-Raphson to compute a new parameter. Right now my code is extremely slow. Any tips on how to speed it up? The methods of BSCallClass and BSPutClass are closed-form, so there’s nothing really to speed up there. Thanks.

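from math import nan  # value returned when the solver fails to converge

# BSCallClass and BSPutClass are the closed-form pricers mentioned above;
# `a` is the pandas DataFrame holding the data. Neither is shown here.

class NewtonRaphson:

    def __init__(self, theObject):
        self.theObject = theObject

    def solve(self, Target, Start, Tolerance, maxiter=500):
        # Newton-Raphson on Price(x) = Target, with Vega(x) as the derivative.
        y = self.theObject.Price(Start)
        x = Start
        i = 0
        while abs(y - Target) > Tolerance:
            i += 1
            d = self.theObject.Vega(x)
            x += (Target - y) / d
            y = self.theObject.Price(x)
            if i > maxiter:
                x = nan  # did not converge within maxiter steps
                break
        return x

def main():
    for idx, row in a.iterrows():
        print(row["X.1"])
        T = (row["X.7"] - row["X.8"]).days
        Spot = row["X.2"]
        Strike = row["X.9"]
        MktPrice = abs(row["X.10"] - row["X.11"]) / 2
        CPflag = row["X.6"]

        if CPflag == 'call':
            option = BSCallClass(0, 0, T, Spot, Strike)
        elif CPflag == 'put':
            option = BSPutClass(0, 0, T, Spot, Strike)
        else:
            continue  # skip rows that are neither a call nor a put

        # .loc avoids pandas chained assignment, which may write to a copy.
        a.loc[idx, "X.15"] = NewtonRaphson(option).solve(MktPrice, .05, .0001)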

EDIT:

For those curious, I ended up speeding this entire process significantly by using the scipy suggestion, as well as using the multiprocessing module.
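The exact code behind this edit isn't shown, so the following is only a rough sketch of how root-finding calls might be spread over worker processes with multiprocessing.Pool. The names solve_one and solve_all are hypothetical, and the objective x**2 - target is a toy stand-in for the real pricing function.

```python
from multiprocessing import Pool

from scipy.optimize import brentq


def solve_one(target):
    """Toy stand-in for one row's solve: root of x**2 - target on [0, hi].

    Assumes target >= 0 so that the bracket [0, hi] contains a sign change.
    """
    hi = max(1.0, target)
    return brentq(lambda x: x * x - target, 0.0, hi)


def solve_all(targets, processes=4, chunksize=10_000):
    """Solve every target in parallel; results come back in input order."""
    with Pool(processes=processes) as pool:
        # A large chunksize keeps inter-process overhead small relative to
        # the microsecond-scale cost of each individual solve.
        return pool.map(solve_one, targets, chunksize=chunksize)
```

On platforms that start workers by spawning a fresh interpreter, the call to solve_all should sit under an `if __name__ == "__main__":` guard. With 85 GB of input, the targets would presumably be fed in chunks rather than loaded at once.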

1 Answer

Editorial Team · Answer 1 · 2026-06-17T12:17:19+00:00

Don’t code your own Newton-Raphson method in Python. You’ll get better performance using one of the root finders in scipy.optimize such as brentq or newton.
(Presumably, if you have pandas, you’d also install scipy.)
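As a minimal sketch of what swapping in brentq could look like for this problem: the internals of BSCallClass aren't shown, so bs_call_price and implied_vol below are hypothetical stand-ins. They assume a standard Black-Scholes call with no dividends, a zero rate by default, time measured in years, and a volatility bracket of [1e-6, 5.0].

```python
from math import erf, exp, log, sqrt

from scipy.optimize import brentq


def norm_cdf(x):
    """Standard normal CDF via math.erf (cheap for scalar inputs)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bs_call_price(sigma, spot, strike, t_years, rate=0.0):
    """Black-Scholes price of a European call without dividends."""
    vol_sqrt_t = sigma * sqrt(t_years)
    d1 = (log(spot / strike) + (rate + 0.5 * sigma ** 2) * t_years) / vol_sqrt_t
    d2 = d1 - vol_sqrt_t
    return spot * norm_cdf(d1) - strike * exp(-rate * t_years) * norm_cdf(d2)


def implied_vol(target, spot, strike, t_years, lo=1e-6, hi=5.0):
    """Volatility at which the model price equals target, or nan if none."""
    def objective(sigma):
        return bs_call_price(sigma, spot, strike, t_years) - target

    try:
        return brentq(objective, lo, hi, xtol=1e-8)
    except ValueError:
        # brentq raises ValueError when the bracket has no sign change,
        # i.e. no volatility in [lo, hi] reproduces the target price.
        return float("nan")
```

Unlike the hand-written loop, brentq needs a bracket rather than a starting guess, and it does not use Vega at all.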

Back of the envelope calculation:

Making 600M calls to brentq should be manageable on standard hardware:

import scipy.optimize as optimize
def f(x):
    return x**2 - 2

In [28]: %timeit optimize.brentq(f, 0, 10)
100000 loops, best of 3: 4.86 us per loop

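So if each call to optimize.brentq takes 4.86 microseconds, 600M calls will take about 4.86e-6 s * 600e6 ≈ 2900 seconds, a little under 1 hour. The real objective costs more per evaluation than x**2 - 2, so treat this as a rough lower bound.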

newton may be slower, but still manageable:

def f(x):
    return x**2 - 2
def fprime(x):
    return 2*x

In [40]: %timeit optimize.newton(f, 10, fprime)
100000 loops, best of 3: 8.22 us per loop
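The timings above come from an IPython session. A rough equivalent for a plain script, using the standard timeit module and extrapolating to 600M rows, might look like the sketch below. The helper names seconds_per_call and estimate_total_seconds are made up for illustration, and the measured numbers will differ from machine to machine.

```python
import timeit

from scipy import optimize


def f(x):
    return x ** 2 - 2


def fprime(x):
    return 2 * x


def seconds_per_call(func, number=2000, repeat=3):
    """Best-of-`repeat` average time per call of a zero-argument callable."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number


def estimate_total_seconds(per_call, n_rows=600_000_000):
    """Extrapolate a per-call cost to the full data set."""
    return per_call * n_rows


brentq_time = seconds_per_call(lambda: optimize.brentq(f, 0, 10))
newton_time = seconds_per_call(lambda: optimize.newton(f, 10, fprime))

for name, per_call in [("brentq", brentq_time), ("newton", newton_time)]:
    total = estimate_total_seconds(per_call)
    print(f"{name}: {per_call * 1e6:.2f} us per call, ~{total / 60:.0f} min for 600M rows")
```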
