The Archive Base Latest Questions

Editorial Team
Asked: June 2, 2026

Is it efficient to calculate many results in parallel with multiprocessing.Pool.map() in a situation

Is it efficient to calculate many results in parallel with multiprocessing.Pool.map() in a situation where each input value is large (say 500 MB), but where the input values generally contain the same large object? I am afraid that multiprocessing works by sending a pickled version of each input value to each worker process in the pool. If no optimization is performed, this would mean sending a lot of data for each input value in map(). Is this the case? I had a quick look at the multiprocessing code but did not find anything obvious.

More generally, what simple parallelization strategy would you recommend so as to do a map() on say 10,000 values, each of them being a tuple (vector, very_large_matrix), where the vectors are always different, but where there are say only 5 different very large matrices?

PS: the big input matrices actually appear “progressively”: 2,000 vectors are first sent along with the first matrix, then 2,000 vectors are sent with the second matrix, etc.

1 Answer

  1. Editorial Team
    Added an answer on June 2, 2026 at 8:01 pm

    I think the obvious solution is to send a reference to the very_large_matrix instead of a copy of the object itself. If there are only five big matrices, create them in the main process before the pool exists. When the multiprocessing.Pool is instantiated, it creates a number of child processes that each clone the parent process's address space (this holds for the fork start method on Unix). That means that with six processes in the pool, there can be up to (1 + 6) * 5 matrices in memory simultaneously, although with fork the operating system typically shares the underlying pages copy-on-write until a process writes to them.
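To see why shipping the matrix itself with every task is costly, here is a minimal sketch of what pickling does with a shared object. It only demonstrates the behaviour of the pickle module; `big_matrix` and `vectors` are small hypothetical stand-ins, and how much Pool.map() benefits in practice would depend on how it groups items into chunks (its `chunksize` argument).

```python
import pickle

# Hypothetical stand-in for a very large matrix, kept small so this runs fast.
big_matrix = [[float(i * j) for j in range(200)] for i in range(200)]
vectors = [[float(k)] * 3 for k in range(10)]

# Every task refers to the same matrix object.
tasks = [(v, big_matrix) for v in vectors]

# Pickling each task on its own serializes the matrix once per task.
separate = sum(len(pickle.dumps(t)) for t in tasks)

# Pickling the tasks in one call lets pickle's memo store the matrix once.
together = len(pickle.dumps(tasks))

# After a round trip, the tasks still share a single matrix object.
loaded = pickle.loads(pickle.dumps(tasks))
shared_after_roundtrip = loaded[0][1] is loaded[1][1]

print(separate, together, shared_after_roundtrip)
```

The shared object is only deduplicated within a single pickle call, so anything pickled separately pays for the matrix again.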

    So in the main process create a lookup of all unique matrices:

    matrix_lookup = {1: matrix(...), 2: matrix(...), ...}
    

    Then pass the index of each matrix in the matrix_lookup along with the vectors to the worker processes:

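    from multiprocessing import Pool

    pool = Pool(6)
    # Each task carries a vector and the key of its matrix, not the matrix itself.
    pool.map(func, [(vector, 1), (vector, 2), (vector, 1), ...])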
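Put together, the lookup approach might look like the sketch below. The names `multiply` and `run`, the tiny 2x2 matrices, and the explicit choice of the "fork" start method (Unix only) are assumptions made for illustration; they are not part of the original answer.

```python
import multiprocessing as mp

# Hypothetical stand-ins for the very large matrices. They are built once,
# at module level in the parent, before any pool is created.
matrix_lookup = {
    1: [[1.0, 0.0], [0.0, 1.0]],
    2: [[2.0, 0.0], [0.0, 2.0]],
}


def multiply(task):
    """Worker: receives (vector, key) and looks the matrix up locally."""
    vector, key = task
    matrix = matrix_lookup[key]  # only the small key was sent with the task
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def run(tasks, processes=2):
    # Assumption: the "fork" start method, so the children inherit
    # matrix_lookup from the parent instead of receiving it through a pipe.
    ctx = mp.get_context("fork")
    with ctx.Pool(processes) as pool:
        return pool.map(multiply, tasks)


if __name__ == "__main__":
    tasks = [([1.0, 2.0], 1), ([3.0, 4.0], 2), ([5.0, 6.0], 1)]
    print(run(tasks))
```

Because forked workers only see what existed in the parent when the pool was created, matrices that appear progressively would need to be added to the lookup before the pool that uses them is started, for example by creating a fresh pool per batch.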
    
