The Archive Base Latest Questions

Editorial Team
Asked: May 28, 2026 (2026-05-28T22:48:38+00:00)

I have a sum that I’m trying to compute, and I’m having difficulty parallelizing


I have a sum that I’m trying to compute, and I’m having difficulty parallelizing the code. The calculation I’m trying to parallelize is kind of complex (it uses both numpy arrays and scipy sparse matrices). It spits out a numpy array, and I want to sum the output arrays from about 1000 calculations. Ideally, I would keep a running sum over all the iterations. However, I haven’t been able to figure out how to do this.

So far, I’ve tried joblib’s Parallel function and the pool.map function from Python’s multiprocessing package. For both of these, I use an inner function that returns a numpy array. Parallel and pool.map each return a list of those arrays, which I convert to a numpy array and then sum over.

However, after the joblib Parallel function completes all iterations, the main program never continues running (it looks like the original job is in a suspended state, using 0% CPU). When I use pool.map, I get memory errors after all the iterations are complete.

Is there a way to simply parallelize a running sum of arrays?

Edit: The goal is to do something like the following, except in parallel.

import numpy as np

def summers(num_iters):
    sumArr = np.zeros((1, 512*512))  # initialize the running sum
    for index in range(num_iters):
        sumArr = sumArr + computation(index)  # computation returns a 1 x 512^2 numpy array
    return sumArr

1 Answer

  1. Editorial Team
    Added an answer on May 28, 2026 at 10:48 pm

    I figured out how to parallelize a sum of arrays with multiprocessing, apply_async, and callbacks, so I’m posting it here for other people. The Sum callback class comes from the example page for Parallel Python, although I did not actually use that package for the implementation. It gave me the idea of using callbacks, though. Here’s the simplified code for what I ended up using, and it does what I wanted it to do.

    import multiprocessing
    import threading  # Python 3 replacement for the old "thread" module
    import numpy as np

    class Sum:  # again, this class is from ParallelPython's example code (modified for an array, comments added)
        def __init__(self):
            self.value = np.zeros((1, 512*512))  # this is the initialization of the sum
            self.lock = threading.Lock()
            self.count = 0

        def add(self, value):
            with self.lock:  # lock so the sum is correct if two results return at the same time
                self.count += 1
                self.value += value  # the actual summation

    def computation(index):
        array1 = np.ones((1, 512*512)) * index  # this is where the array-returning computation goes
        return array1

    def summers(num_iters):
        pool = multiprocessing.Pool(processes=8)

        sumArr = Sum()  # create an instance of the callback class and zero the sum
        for index in range(num_iters):
            pool.apply_async(computation, (index,), callback=sumArr.add)

        pool.close()
        pool.join()  # waits for all the processes to finish

        return sumArr.value
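
One thing the callback version does not show is what happens when a worker raises: without an error callback (or a call to get() on the result), the exception is dropped and the sum is silently short. The following is a minimal sketch of the same callback idea that records such errors, assuming Python 3. The names RunningSum, callback_sum and SHAPE are hypothetical, the pool is passed in as an argument, and the array shape is shrunk so the check runs quickly. No lock is used here because Pool runs its callbacks one at a time in the parent process.

```python
import multiprocessing
import numpy as np

SHAPE = (1, 8)  # hypothetical small shape for a quick check; the post uses (1, 512*512)


def computation(index):
    # Stand-in for the real array-returning computation.
    return np.ones(SHAPE) * index


class RunningSum:
    """Accumulates every array handed to add(); intended as a pool callback."""

    def __init__(self, shape):
        self.value = np.zeros(shape)
        self.count = 0

    def add(self, array):
        # Pool invokes callbacks one at a time in the parent process,
        # so this in-place update is not raced by the workers.
        self.value += array
        self.count += 1


def callback_sum(pool, num_iters, shape=SHAPE):
    acc = RunningSum(shape)
    errors = []  # exceptions raised inside workers end up here
    for index in range(num_iters):
        pool.apply_async(computation, (index,),
                         callback=acc.add, error_callback=errors.append)
    pool.close()  # no more tasks will be submitted
    pool.join()   # wait for all tasks and their callbacks to finish
    if errors:
        raise errors[0]  # surface the first worker failure instead of hiding it
    return acc.value


if __name__ == "__main__":
    total = callback_sum(multiprocessing.Pool(processes=2), 10)
    print(total[0, 0])  # every element holds 0 + 1 + ... + 9
```

Because no result handles are kept, each returned array can be freed as soon as it has been added to the sum.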
    

    I was also able to get this working using a parallelized map, which was suggested in another answer. I had tried this earlier, but I wasn’t implementing it correctly. Both ways work, and I think this answer explains the issue of which method to use (map or apply_async) pretty well. For the map version, you don’t need to define the class Sum, and the summers function becomes:

    def summers(num_iters):
        pool = multiprocessing.Pool(processes=8)

        outputArr = np.zeros((num_iters, 1, 512*512))  # you wouldn't have to initialize these
        sumArr = np.zeros((1, 512*512))                # but I do to make sure I have the memory

        # map blocks until every iteration is done and returns a list of all the arrays
        outputArr = np.array(pool.map(computation, range(num_iters)))
        sumArr = outputArr.sum(0)

        pool.close()  # map has already returned every result; close and join shut the workers down
        pool.join()

        return sumArr
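
The pool.map version holds every result in a list before summing. For 1000 arrays of 512*512 values that is about 2.1 GB if the arrays are float64 (an assumption about the dtype), and np.array(...) then makes a second copy, which may be what caused the memory errors described in the question. A minimal sketch of a lower-memory variant uses pool.imap_unordered, which yields results as workers finish so each array can be added and dropped. The names streaming_sum and SHAPE are hypothetical, and the shape is shrunk for a quick check.

```python
import multiprocessing
import numpy as np

SHAPE = (1, 8)  # hypothetical small shape for a quick check; the post uses (1, 512*512)


def computation(index):
    # Stand-in for the real array-returning computation.
    return np.ones(SHAPE) * index


def streaming_sum(pool, num_iters, shape=SHAPE):
    total = np.zeros(shape)
    # imap_unordered hands back each result as soon as a worker finishes it,
    # so results are consumed one by one instead of being collected in a list.
    for array in pool.imap_unordered(computation, range(num_iters)):
        total += array
    return total


if __name__ == "__main__":
    with multiprocessing.Pool(processes=2) as pool:
        print(streaming_sum(pool, 10)[0, 0])  # every element holds 0 + 1 + ... + 9
```

The results arrive in whatever order the workers finish, which is fine for a sum, although with floating-point data the last few bits can differ from the serial order.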
    
© 2021 The Archive Base. All Rights Reserved