Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 6130797
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 23, 20262026-05-23T16:54:00+00:00 2026-05-23T16:54:00+00:00

I am working on performing image processing using Numpy, specifically a running standard deviation

  • 0

I am working on performing image processing using Numpy, specifically a running standard deviation stretch. This reads in X number of columns, finds the Std. and performs a percentage linear stretch. It then iterates to the next “group” of columns and performs the same operations. The input image is a 1GB, 32-bit, single band raster which is taking quite a long time to process (hours). Below is the code.

I realize that I have 3 nested for loops which is, presumably where the bottleneck is occurring. If I process the image in “boxes”, that is to say loading an array that is [500,500] and iterating through the image processing time is quite short. Unfortunately, camera error requires that I iterate in extremely long strips (52,000 x 4) (y,x) to avoid banding.

Any suggestions on speeding this up would be appreciated:

def box(dataset, outdataset, sampleSize, n):

    quiet = 0
    sample = sampleSize
    #iterate over all of the bands
    for j in xrange(1, dataset.RasterCount + 1): #1 based counter

        band = dataset.GetRasterBand(j)
        NDV = band.GetNoDataValue()

        print "Processing band: " + str(j)       

        #define the interval at which blocks are created
        intervalY = int(band.YSize/1)    
        intervalX = int(band.XSize/2000) #to be changed to sampleSize when working

        #iterate through the rows
        scanBlockCounter = 0

        for i in xrange(0,band.YSize,intervalY):

            #If the next i is going to fail due to the edge of the image/array
            if i + (intervalY*2) < band.YSize:
                numberRows = intervalY
            else:
                numberRows = band.YSize - i

            for h in xrange(0,band.XSize, intervalX):

                if h + (intervalX*2) < band.XSize:
                    numberColumns = intervalX
                else:
                    numberColumns = band.XSize - h

                scanBlock = band.ReadAsArray(h,i,numberColumns, numberRows).astype(numpy.float)

                standardDeviation = numpy.std(scanBlock)
                mean = numpy.mean(scanBlock)

                newMin = mean - (standardDeviation * n)
                newMax = mean + (standardDeviation * n)

                outputBlock = ((scanBlock - newMin)/(newMax-newMin))*255
                outRaster = outdataset.GetRasterBand(j).WriteArray(outputBlock,h,i)#array, xOffset, yOffset


                scanBlockCounter = scanBlockCounter + 1
                #print str(scanBlockCounter) + ": " + str(scanBlock.shape) + str(h)+ ", " + str(intervalX)
                if numberColumns == band.XSize - h:
                    break

                #update progress line
                if not quiet:
                    gdal.TermProgress_nocb( (float(h+1) / band.YSize) )

Here is an update:
Without using the profile module, as I did not want to start wrapping small sections of the code into functions I used a mix of print and exit statements to get a really rough idea about which lines were taking the most time. Luckily (and I do understand how lucky I was) one line was dragging everything down.

    outRaster = outdataset.GetRasterBand(j).WriteArray(outputBlock,h,i)#array, xOffset, yOffset

It appears that GDAL is quite inefficient when opening the output file and writing out the array. With this in mind I decided to add my modified arrays “outBlock” to a python list, then write out chunks. Here is the segment that I changed:

The outputBlock was just modified …

         #Add the array to a list (tuple)
            outputArrayList.append(outputBlock)

            #Check the interval counter and if it is "time" write out the array
            if len(outputArrayList) >= (intervalX * writeSize) or finisher == 1:

                #Convert the tuple to a numpy array.  Here we horizontally stack the tuple of arrays.
                stacked = numpy.hstack(outputArrayList)

                #Write out the array
                outRaster = outdataset.GetRasterBand(j).WriteArray(stacked,xOffset,i)#array, xOffset, yOffset
                xOffset = xOffset + (intervalX*(intervalX * writeSize))

                #Cleanup to conserve memory
                outputArrayList = list()
                stacked = None
                finisher=0

Finisher is simply a flag that handles the edges. It took a bit of time to figure out how to build an array from the list. In that, using numpy.array was creating a 3-d array (anyone care to explain why?) and write array requires a 2d array. Total processing time is now varying from just under 2 minutes to 5 minutes. Any idea why the range of times might exist?

Many thanks to everyone who posted! The next step is to really get into Numpy and learn about vectorization for additional optimization.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-23T16:54:00+00:00Added an answer on May 23, 2026 at 4:54 pm

    Answered as requested.

    If you are IO bound, you should chunk your reads/writes. Try dumping ~500 MB of data to an ndarray, process it all, write it out and then grab the next ~500 MB. Make sure to reuse the ndarray.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

ASP.NET application performing oAuth with Facebook. This code was working fine until a few
I want to tackle some image-processing problems in Haskell. I'm working with both bitonal
I've been using CGPostMouseEvent() for performing mouse clicks since mac os 10.4. While working
Working in Eclipse on a Dynamic Web Project (using Tomcat (v5.5) as the app
Working with an Oracle 9i database from an ASP.NET 2.0 (VB) application using OLEDB.
I have a weird problem with ExecuteQuery in that it isn't working when performing
I am working in an application development. On that application i am performing files
I am working in a desktop application. In my application i am performing login
I'm working on a simulation project using c++ on both windows and linux. There
I'm not sure why this code isn't working. Its not going to the copy

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.