I am implementing some satellite image filters, starting with one known as the Enhanced

Question

0

Asked: May 20, 20262026-05-20T05:24:48+00:00 2026-05-20T05:24:48+00:00

I am implementing some satellite image filters, starting with one known as the Enhanced

0

I am implementing some satellite image filters, starting with one known as the Enhanced Lee filter. The images are easily up to 5000×5000 pixels and more. My current implementation is running out of memory trying to compute the filters on those large arrays (note that the moving average and moving stddev filters can be run in one shot). The main difficulty is the number of arrays that must be kept in memory in order to return the final filtered array. In this question, I asked for help on refining a block-processing function, but my question is: is there a way of improving this code so that I don’t need to use block-processing?

    def moving_average(Ic, filtsize):
        Im = numpy.empty(Ic.shape, dtype='Float32')
        scipy.ndimage.filters.uniform_filter(Ic, filtsize, output=Im)
        return Im

    def moving_stddev(Ic, filtsize):
        Im = numpy.empty(Ic.shape, dtype='Float32')
        scipy.ndimage.filters.uniform_filter(Ic, filtsize, output=Im)
        S = numpy.empty(Ic.shape, dtype='Float32')
        scipy.ndimage.filters.uniform_filter(((Ic-Im) ** 2), filtsize, output=S)
        return numpy.sqrt(S)

    def enh_lee(Ic, filtsize, nlooks, dfactor):
        # Implementation based on PCI Geomatica's FELEE function documentation
        Ci = moving_stddev(Ic, filtsize) / moving_average(Ic, filtsize) #1st array in memory
        Cu = numpy.sqrt(1 / nlooks) #scalar
        Cmax = numpy.sqrt(1 + (2 * nlooks)) #scalar
        W = numpy.exp(-dfactor * (Ci - Cu) / (Cmax - Ci)) #2nd array in memory
        Im = moving_average(Ic, filtsize) #3rd array in memory
        If = Im * W + Ic * (1 - W) #4th array in memory
        W = None # Back to 3 arrays in memory
        return numpy.select([Ci <= Cu, (Cu < Ci) * (Ci < Cmax), Ci >= Cmax], [Im, If, Ic])

where nlooks and dfactor are scalars and Ic is the unfiltered array.

EDIT based on your suggestions (I am also looking at numexpr), my improved code for enh_lee is as follows, but is still not enough to get past the last step without running out of memory:

def enh_lee(Ic, filtsize, nlooks, dfactor):
    Im = moving_average(Ic, filtsize)
    Ci = moving_stddev(Ic, filtsize)
    Ci /= Im
    Cu = numpy.sqrt(1 / nlooks)
    Cmax = numpy.sqrt(1 + (2 * nlooks))

    W = Ci
    W -= Cu
    W /= Cmax - Ci
    W *= -dfactor
    numpy.exp(W, W)

    If = 1
    If -= W
    If *= Ic
    If += Im * W
    W = None
    return numpy.select([Ci <= Cu, (Cu < Ci) & (Ci < Cmax), Ci >= Cmax], [Im, If, Ic])

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-20T05:24:49+00:00

There are several (memory usage) optimizations you can make here… A few tricks to keep in mind are:

Most numpy functions take an out parameter than can be used to specify the output array instead of returning a copy. E.g. np.sqrt(x, x) will take the square root of an array in-place.
x += 1 uses half the memory that x = x + 1 does, as the latter makes a temporary copy. When it’s possible, try to split calculations up into *=, +=, /=, etc. When it’s not, use numexpr, as @eumiro suggested. (Or just used numexpr regardless… It’s quite handy in many cases.)

So, first off, here’s your original function’s performance with a 10000×10000 array of random data and a filtsize of 3:

Memory Usage Profile of Original Function
enter image description here

What’s interesting are the big spikes at the end. These occur during your numpy.select(...) bit. There a plenty of places where you’re inadvertently creating additional temporary arrays, but they’re mostly irrelevant, as they’re overwhelmed by what goes on during the call to select.

At any rate, if we replace your original (clean and compact) code with this rather verbose version, you can significantly optimize your memory usage:

import numpy
import scipy.ndimage

def main(x=None):
    if x is None:
        ni, nj = 10000, 10000
        x = numpy.arange(ni*nj, dtype=numpy.float32).reshape(ni,nj)
    filtsize = 3
    nlooks = 10.0
    dfactor = 10.0
    x = enh_lee(x, filtsize, nlooks, dfactor)
    return x

def moving_average(Ic, filtsize):
    Im = numpy.empty(Ic.shape, dtype='Float32')
    scipy.ndimage.filters.uniform_filter(Ic, filtsize, output=Im)
    return Im

def moving_stddev(Ic, filtsize):
    Im = numpy.empty(Ic.shape, dtype='Float32')
    scipy.ndimage.filters.uniform_filter(Ic, filtsize, output=Im)
    Im *= -1
    Im += Ic
    Im **= 2
    scipy.ndimage.filters.uniform_filter(Im, filtsize, output=Im)
    return numpy.sqrt(Im, Im)

def enh_lee(Ic, filtsize, nlooks, dfactor):
    # Implementation based on PCI Geomatica's FELEE function documentation
    Ci = moving_stddev(Ic, filtsize) 
    Im = moving_average(Ic, filtsize) 
    Ci /= Im

    Cu = numpy.sqrt(1 / nlooks).astype(numpy.float32) 
    Cmax = numpy.sqrt(1 + (2 * nlooks)).astype(numpy.float32) 

    W = Ci.copy()
    W -= Cu
    W *= -dfactor
    W /= Cmax - Ci
    W = numpy.exp(W, W)

    If = Im * W
    W *= -1
    W += 1
    W *= Ic
    If += W
    del W

    # Replace the call to numpy.select
    out = If
    filter = Ci <= Cu
    numpy.putmask(out, filter, Im)
    del Im

    filter = Ci >= Cmax
    numpy.putmask(out, filter, Ic)
    return out

if __name__ == '__main__':
    main()

Here’s the resulting memory profile for this code:

Memory Usage Profile of Numpy-based Optimized Version
enter image description here

So, we’ve greatly reduced memory usage, but the code is somewhat less readable (i.m.o.).

~~However, those last three peaks are the two numpy.where calls…~~

If numpy.where took an out parameter, we could further reduce the peak memory usage by another ~300Mb or so. Unfortunately, it doesn’t, and I don’t know of a more memory-efficient way to do it…

We can use numpy.putmask to replace the call to numpy.select and do the operation in-place (Thanks to @eumiro for mentioning this in an entirely different question.)

If we optimize things with numexpr, we get considerably cleaner code (compared to the pure-numpy version above, not the original). You could probably whittle the memory usage down a bit in this version… I’m not terribly familiar with numexpr, beyond having used it a few times.

import numpy
import scipy.ndimage
import numexpr as ne

def main(x=None):
    if x is None:
        ni, nj = 10000, 10000
        x = numpy.arange(ni*nj, dtype=numpy.float32).reshape(ni,nj)
    filtsize = 3
    nlooks = 10.0
    dfactor = 10.0
    x = enh_lee(x, filtsize, nlooks, dfactor)
    return x

def moving_average(Ic, filtsize):
    Im = numpy.empty(Ic.shape, dtype='Float32')
    scipy.ndimage.filters.uniform_filter(Ic, filtsize, output=Im)
    return Im

def moving_stddev(Ic, filtsize):
    Im = numpy.empty(Ic.shape, dtype='Float32')
    scipy.ndimage.filters.uniform_filter(Ic, filtsize, output=Im)
    Im = ne.evaluate('((Ic-Im) ** 2)')
    scipy.ndimage.filters.uniform_filter(Im, filtsize, output=Im)
    return ne.evaluate('sqrt(Im)')

def enh_lee(Ic, filtsize, nlooks, dfactor):
    # Implementation based on PCI Geomatica's FELEE function documentation
    Ci = moving_stddev(Ic, filtsize) 
    Im = moving_average(Ic, filtsize) 
    Ci /= Im

    Cu = numpy.sqrt(1 / nlooks).astype(numpy.float32) 
    Cmax = numpy.sqrt(1 + (2 * nlooks)).astype(numpy.float32) 

    W = ne.evaluate('exp(-dfactor * (Ci - Cu) / (Cmax - Ci))') 
    If = ne.evaluate('Im * W + Ic * (1 - W)') 
    del W

    out = ne.evaluate('where(Ci <= Cu, Im, If)') 
    del Im
    del If

    out = ne.evaluate('where(Ci >= Cmax, Ic, out)')
    return out

if __name__ == '__main__':
    main()

And here’s the memory-usage profile for the numexpr version: (Note that the execution time has been more than halved compared to the original!)

Memory Usage Profile of Numexpr-based Optimized Version*
enter image description here

The largest memory usage is still during the calls to where (replacing the call to select). However, the peak memory usage has been significantly cut. The easiest way to further reduce it would be to find some way to select operate in-place on one of the arrays. It would be fairly easy to do this with cython (nested loops would be rather slow in pure python, and any sort of boolean indexing in numpy will create an additional copy). You may be better off by simply chunking the input array as you’ve been doing, though…

Just as as side note, the updated versions produce the same output as the original code. There was a typo in the original numpy-based code…

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I am implementing some satellite image filters, starting with one known as the Enhanced

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply