The Archive Base

The Archive Base Latest Questions

Editorial Team
Asked: June 13, 2026 (2026-06-13T22:42:56+00:00)


When you map an iterable to a multiprocessing.Pool, are the iterations divided into a queue for each process in the pool at the start, or is there a common queue from which a task is taken when a process becomes free?

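    import multiprocessing

    def generate_stuff():
        for foo in range(100):
            yield foo

    def process(moo):
        print(moo)

    if __name__ == "__main__":
        # The guard keeps worker processes from re-running this block on import.
        pool = multiprocessing.Pool()
        pool.map(func=process, iterable=generate_stuff())
        pool.close()
        pool.join()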

So given this untested code: if there are 4 processes in the pool, does each process get allocated 25 stuffs to do, or do the 100 stuffs get picked off one by one by processes looking for stuff to do, so that each process might do a different number of stuffs, e.g. 30, 26, 24, 20?

1 Answer

    Editorial Team
    Added an answer on June 13, 2026 at 10:42 pm

    So given this untested code: if there are 4 processes in the pool, does each process get allocated 25 stuffs to do, or do the 100 stuffs get picked off one by one by processes looking for stuff to do, so that each process might do a different number of stuffs, e.g. 30, 26, 24, 20?

    Well, the obvious answer is to test it.

    As-is, the test may not tell you much, because the jobs are going to finish ASAP, and it’s possible that things will end up evenly distributed even if pooled processes grab jobs as they become ready. But there’s an easy way to fix that:

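    import collections
    import multiprocessing
    import os
    import random
    import time

    def generate_stuff():
        for foo in range(100):
            yield foo

    def process(moo):
        # Sleep for a random 0-5 seconds so the jobs take uneven amounts of time.
        time.sleep(random.randint(0, 50) / 10.)
        # Report which worker process handled this job.
        return os.getpid()

    if __name__ == "__main__":
        pool = multiprocessing.Pool()
        # chunksize=1 hands the jobs out one at a time.
        pids = pool.map(func=process, iterable=generate_stuff(), chunksize=1)
        pool.close()
        pool.join()
        # Count how many jobs each worker PID handled.
        print(collections.Counter(pids))
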

    If the numbers are “jagged”, you know that pooled processes must be grabbing new jobs as they become ready. (I explicitly set chunksize to 1 so that the chunks aren’t so big that each process only gets one chunk in the first place.)

    When I run it on an 8-core machine:

    Counter({98935: 16, 98936: 16, 98939: 13, 98937: 12, 98942: 12, 98938: 11, 98940: 11, 98941: 9})
    

    So, it looks like the processes are getting new jobs on the fly.

    Since you specifically asked about 4 workers, I changed Pool() to Pool(4) and got this:

    Counter({98965: 31, 98962: 24, 98964: 23, 98963: 22})
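
The uneven counts above are what you would expect whenever each job goes to whichever worker frees up first. Here is a minimal simulation sketch of that idea, with no real processes involved. It is not how multiprocessing is implemented; the function name `simulate_dynamic_dispatch` and the fixed seed are assumptions made for illustration only.

```python
import heapq
import random
from collections import Counter

def simulate_dynamic_dispatch(durations, n_workers):
    """Give each job to whichever simulated worker becomes free first."""
    # Heap of (time_when_free, worker_id); ties go to the lowest worker id.
    free_at = [(0.0, worker) for worker in range(n_workers)]
    heapq.heapify(free_at)
    counts = Counter()
    for duration in durations:
        now, worker = heapq.heappop(free_at)
        counts[worker] += 1
        heapq.heappush(free_at, (now + duration, worker))
    return counts

# Random 0-5 second jobs, mirroring the sleep in the test script (hypothetical seed).
rng = random.Random(0)
durations = [rng.randint(0, 50) / 10.0 for _ in range(100)]
jagged = simulate_dynamic_dispatch(durations, 4)
print(sorted(jagged.values(), reverse=True))

# Identical job lengths degenerate into an even 25/25/25/25 split.
even = simulate_dynamic_dispatch([1.0] * 100, 4)
print(sorted(even.values()))
```

With equal-length jobs the split is even, which is why the original test (where every job finishes almost instantly) might not have told you much.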
    

    However, there’s an even better way to find out than by testing: read the source.

    As you can see, map just calls map_async, which creates a bunch of batches and puts them on a self._taskqueue object (a Queue.Queue instance). If you read further, this queue isn’t shared with the other processes directly. Instead, a task-handler thread in the parent process pops batches off it and feeds them to the workers over a shared inter-process queue, and whichever worker has finished its current batch picks up the next one.
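
The batching step described above can be sketched roughly as follows. This is only a sketch in the spirit of the pool's batching, not a copy of it; `make_batches` is a hypothetical name, and the real pool also attaches the function and bookkeeping information to each batch.

```python
import itertools

def make_batches(iterable, size):
    """Yield successive tuples of up to `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        batch = tuple(itertools.islice(it, size))
        if not batch:
            # The iterator is exhausted, so there are no more batches.
            return
        yield batch

print(list(make_batches(range(10), 4)))
# [(0, 1, 2, 3), (4, 5, 6, 7), (8, 9)]
```

With a size of 1, as in the test script, every item becomes its own batch, so 100 items produce 100 independently dispatched tasks.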

    This is also how you can find out what the default chunksize is for map. The 2.7 implementation linked above shows that it’s just len(iterable) / (len(self._pool) * 4) rounded up (slightly more verbose than that to avoid fractional arithmetic), or, put another way, just big enough for about 4 chunks per process. But you really shouldn’t rely on this; the documentation vaguely and indirectly implies that it’s going to use some kind of heuristic, but doesn’t give you any guarantees as to what that will be. So, if you really need “about 4 chunks per process”, calculate it explicitly. More realistically, if you ever need anything besides the default, you probably need a domain-specific value that you’re going to work out (by calculation, guessing, or profiling).
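
If you do want "about 4 chunks per process" without depending on the default, one way to calculate it explicitly is sketched below. The helper name `explicit_chunksize` and its `chunks_per_worker` parameter are assumptions for illustration, and the floor of 1 is my own addition to keep the value usable for an empty input.

```python
def explicit_chunksize(n_items, n_workers, chunks_per_worker=4):
    """Return n_items / (n_workers * chunks_per_worker), rounded up, at least 1."""
    # divmod keeps the arithmetic in integers, as described above.
    chunksize, extra = divmod(n_items, n_workers * chunks_per_worker)
    if extra:
        chunksize += 1
    return max(1, chunksize)

print(explicit_chunksize(100, 4))  # 7
print(explicit_chunksize(100, 8))  # 4
```

You would then pass the result along yourself, for example `pool.map(process, items, chunksize=explicit_chunksize(len(items), 4))`, where `items` is a list (a generator has no `len`, so it would need to be materialized first).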

© 2021 The Archive Base. All Rights Reserved