
The Archive Base

Editorial Team
Asked: June 14, 2026

I have a set of command line tools that I’d like to run in parallel on a series of files. I’ve written a Python function to wrap them that looks something like this:

import os
import shlex
import subprocess

def process_file(fn):
    print os.getpid()
    cmd1 = "echo "+fn
    p = subprocess.Popen(shlex.split(cmd1))

    # after cmd1 finishes
    other_python_function_to_do_something_to_file(fn)

    cmd2 = "echo "+fn
    p = subprocess.Popen(shlex.split(cmd2))
    print "finish"

if __name__=="__main__":
    import multiprocessing
    p = multiprocessing.Pool()
    for fn in files:
        RETURN = p.apply_async(process_file,args=(fn,),kwds={some_kwds})

While this works, it does not seem to be running multiple processes; it seems like it’s just running in serial (I’ve tried using Pool(5) with the same result). What am I missing? Are the calls to Popen “blocking”?

EDIT: Clarified a little. I need cmd1, then some Python command, then cmd2, to execute in sequence on each file.

EDIT2: The output from the above has the pattern:

pid
finish
pid
finish
pid
finish

whereas a similar call, using map in place of apply_async (but without any provision for passing kwds), looks more like:

pid
pid
pid
finish
finish
finish

However, the map call sometimes (always?) hangs after apparently succeeding.

1 Answer

  1. Editorial Team
    Added an answer on June 14, 2026 at 12:07 am

    Are the calls to Popen “blocking”?

    No. Just creating a subprocess.Popen returns immediately, giving you an object that you could wait on or otherwise use. If you want to block, that’s simple:

    subprocess.check_call(shlex.split(cmd1))
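
To see the difference for yourself, here is a minimal sketch. The command is a hypothetical stand-in for cmd1 (a child Python process that sleeps briefly), not one of your real tools. Popen hands back control while the child is still running, and it is wait() that actually blocks:

```python
import subprocess
import sys

# Hypothetical stand-in for cmd1: a child process that sleeps for half a second.
cmd = [sys.executable, "-c", "import time; time.sleep(0.5)"]

p = subprocess.Popen(cmd)
# Popen has already returned; poll() gives None while the child is still running.
still_running = p.poll() is None
# wait() is the call that blocks until the child exits, and returns its exit code.
exit_code = p.wait()

print("still running right after Popen:", still_running)
print("exit code after wait():", exit_code)
```

check_call does the same waiting for you and additionally raises CalledProcessError if the exit code is nonzero.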
    

    Meanwhile, I’m not sure why you’re putting your args together into a string and then using shlex.split to turn them back into a list. Why not just write the list? (That also keeps working when a filename contains spaces.)

    cmd1 = ["echo", fn]
    subprocess.check_call(cmd1)
    

    While this works, it does not seem to be running multiple processes; it seems like it’s just running in serial

    What makes you think this? Given that each process just kicks off two processes into the background as fast as possible, it’s going to be pretty hard to tell whether they’re running in parallel.

    If you want to verify that you’re getting work from multiple processes, you may want to add some prints or logging (and throw something like os.getpid() into the messages).
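
One way to do that, as a rough sketch: make each task slow enough to overlap, and return the worker's pid along with start and end times. The sleeping child process and the file names are made up for illustration, and this is Python 3 syntax rather than the Python 2 in your question:

```python
import multiprocessing
import os
import subprocess
import sys
import time


def process_file(fn):
    """Run a slow stand-in command and report which worker handled fn."""
    start = time.time()
    # Hypothetical stand-in for cmd1: a child process that sleeps briefly.
    subprocess.check_call([sys.executable, "-c", "import time; time.sleep(0.5)"])
    return fn, os.getpid(), start, time.time()


if __name__ == "__main__":
    files = ["a.txt", "b.txt", "c.txt", "d.txt"]  # hypothetical file names
    with multiprocessing.Pool(4) as pool:
        for fn, pid, start, end in pool.map(process_file, files):
            print("%s pid=%d start=%.2f end=%.2f" % (fn, pid, start, end))
```

If the pool is really working in parallel, you should see different pids and overlapping start/end intervals. If everything runs in serial, each start will come after the previous end.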

    Meanwhile, it looks like you’re trying to exactly duplicate the effects of multiprocessing.Pool.map_async out of a loop around multiprocessing.Pool.apply_async, except that instead of accumulating the results you’re stashing each one in a variable called RETURN and then throwing it away before you can use it. Why not just use map_async?
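
Since map_async has no kwds parameter, one common workaround is to bind the keyword arguments up front with functools.partial. A minimal sketch, in which the worker body, the suffix keyword and the file names are all hypothetical placeholders:

```python
import functools
import multiprocessing


def process_file(fn, suffix=""):
    """Hypothetical worker: returns the name it would have processed."""
    return fn + suffix


if __name__ == "__main__":
    files = ["a.txt", "b.txt", "c.txt"]  # hypothetical file names
    # Bind the keyword arguments once; the pool then only supplies fn.
    worker = functools.partial(process_file, suffix=".done")
    with multiprocessing.Pool() as pool:
        async_result = pool.map_async(worker, files)
        # get() blocks until every file is done and re-raises worker exceptions.
        results = async_result.get()
    print(results)
```

Calling get() on the result also surfaces any exception raised in a worker, which a discarded apply_async result would silently swallow.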

    Finally, you asked whether multiprocessing is the right tool for the job. Well, you clearly need something asynchronous: check_call(args(file1)) has to block other_python_function_to_do_something_to_file(file1), but at the same time not block check_call(args(file2)).

    I would probably have used threading, but really, it doesn’t make much difference. Even if you’re on a platform where process startup is expensive, you’re already paying that cost, because the whole point is running a bunch of N * M child processes, so another pool of 8 isn’t going to hurt anything. And there’s little risk of either accidentally creating races by sharing data between threads, or accidentally writing code that looks like it shares data between processes but doesn’t, since there’s nothing to share. So, whichever one you like more, go for it.
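
For comparison, a threaded version could look roughly like this. It is only a sketch: run_tool is a hypothetical stand-in for your command line tools, fn.upper() stands in for other_python_function_to_do_something_to_file, and concurrent.futures needs Python 3:

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_tool(label, fn):
    """Hypothetical stand-in for cmd1/cmd2: a child process that echoes its arguments."""
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.argv[1], sys.argv[2])", label, fn]
    )
    return out.decode().strip()


def process_file(fn):
    first = run_tool("cmd1", fn)   # blocks this thread only
    middle = fn.upper()            # stand-in for the Python step in between
    second = run_tool("cmd2", fn)  # starts only after the two steps above finish
    return first, middle, second


def process_all(files, workers=8):
    # Each thread spends its time waiting on a child process,
    # so the threads do not contend with each other for the CPU.
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(process_file, files))


if __name__ == "__main__":
    print(process_all(["a.txt", "b.txt"]))
```

The three steps stay in sequence for each file, while different files overlap across threads.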

    The other alternative would be to write an event loop. Which I might actually start doing myself for this problem, but I’d regret it, and you shouldn’t do it…
