The Archive Base

Editorial Team
Asked: June 1, 2026 (2026-06-01T01:43:08+00:00)

I have a large file that needs to be processed before feeding to another

I have a large file that needs to be processed before being fed to another command. I could save the processed data as a temporary file, but I would like to avoid that. I wrote a generator that processes one line at a time, and then the following script to feed its output to the external command as input. However, I got an “I/O operation on closed file” exception on the second iteration of the loop:

cmd = ['intersectBed', '-a', 'stdin', '-b', bedfile]
p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
for entry in my_entry_generator: # <- this is my generator
    output = p.communicate(input='\t'.join(entry) + '\n')[0]
    print output

I read another similar question that uses p.stdin.write, but I still had the same problem.

What did I do wrong?

[edit]
I replaced the last two statements with the following (thanks SpliFF):

    output = p.communicate(input='\t'.join(entry) + '\n')
    if output[1]: print "error:", output[1]
    else: print output[0]

to see whether the external program reported any error. It did not.
I still get the same exception at the p.communicate line.

1 Answer

  1. Editorial Team
    Added an answer on June 1, 2026 at 1:43 am

    The communicate method of subprocess.Popen objects can only be called once per process. It sends the input you give it to the process and reads all of the stdout and stderr output. By “all”, I mean that it waits for the process to exit, so that it knows it has received every byte of output. Once communicate returns, the process has terminated and its pipes are closed, which is why the second call in your loop fails with “I/O operation on closed file”.

    If you want to use communicate, you have to either restart the process in the loop, or give it a single string that contains all the input from the generator. If you want streaming communication, sending data bit by bit, you cannot use communicate. Instead, you need to write to p.stdin while reading from p.stdout and p.stderr. This is tricky, because you can’t tell which output is caused by which input, and because you can easily run into deadlocks when one pipe buffer fills up while you are blocked on another. There are third-party libraries that can help you with this, like Twisted.
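As a rough illustration of the "single string" option, the sketch below joins everything the generator yields and calls communicate exactly once. It is written for Python 3, unlike the Python 2 snippets in the question. The helper name run_with_all_input and the ECHO_CMD stand-in (a child Python process that copies stdin to stdout) are assumptions made for the example; in the asker's case cmd would be the intersectBed command line. Note that the whole input is held in memory, which may not suit a very large file.

```python
import subprocess
import sys


def run_with_all_input(cmd, entries):
    """Join every entry into one text block and call communicate() once."""
    # One tab-separated line per entry, as in the question.
    payload = "".join("\t".join(entry) + "\n" for entry in entries)
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # exchange str instead of bytes
    )
    # communicate() writes the payload, closes stdin, reads both output
    # pipes to the end, and waits for the process to exit.
    out, err = proc.communicate(input=payload)
    return proc.returncode, out, err


# Hypothetical stand-in for intersectBed: copies stdin to stdout unchanged.
ECHO_CMD = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]

if __name__ == "__main__":
    rows = (row for row in [("chr1", "10", "20"), ("chr2", "30", "40")])
    code, out, err = run_with_all_input(ECHO_CMD, rows)
    print(out, end="")
```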

    If you want to do this interactively, sending some data and then waiting for and processing the result before sending more data, things get even harder. You should probably use a third-party library like pexpect for that.
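To show why the interactive case is fragile, here is a minimal Python 3 sketch that uses only the standard library. It works only under a strong assumption: the child writes and flushes exactly one reply line for every input line. Many tools buffer their output when it goes to a pipe, and nothing here establishes that intersectBed behaves this way; that is the gap a pseudo-terminal library such as pexpect is meant to fill. UPPER_CMD and ask_line_by_line are hypothetical names invented for the example.

```python
import subprocess
import sys

# Hypothetical stand-in child: upper-cases each line and flushes at once.
UPPER_CMD = [
    sys.executable, "-u", "-c",
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n",
]


def ask_line_by_line(cmd, lines):
    """Send one line, wait for exactly one reply line, then send the next."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # text mode
        bufsize=1,                # line-buffered on the parent side
    )
    replies = []
    try:
        for line in lines:
            proc.stdin.write(line + "\n")
            proc.stdin.flush()
            # Blocks forever if the child does not answer with a full line.
            replies.append(proc.stdout.readline().rstrip("\n"))
    finally:
        proc.stdin.close()   # end of input lets the child exit
        proc.stdout.close()
        proc.wait()
    return replies


if __name__ == "__main__":
    print(ask_line_by_line(UPPER_CMD, ["chr1\t10\t20", "chr2\t30\t40"]))
```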

    Of course, if you can get away with just starting the process inside the loop, that would be a lot easier:

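    import subprocess

    cmd = ['intersectBed', '-a', 'stdin', '-b', bedfile]
    for entry in my_entry_generator:
        # Start a fresh process for every entry: communicate() works only once per process.
        p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        # communicate() returns (stdout, stderr); [0] keeps only stdout.
        output = p.communicate(input='\t'.join(entry) + '\n')[0]
        print output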
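If starting one process per entry is too slow, the streaming approach mentioned earlier can be sketched with a background writer thread. One process is started, the thread feeds it the generator's lines, and the main thread reads the output as it arrives. This is a Python 3 sketch under stated assumptions, not a tested recipe for intersectBed: stream_through and LINE_ECHO_CMD are hypothetical names, the stand-in child just copies lines, and stderr is deliberately left unpiped so that an unread stderr pipe cannot fill up and stall the child.

```python
import subprocess
import sys
import threading

# Hypothetical stand-in for intersectBed: copies stdin to stdout line by line.
LINE_ECHO_CMD = [
    sys.executable, "-c",
    "import sys\nfor line in sys.stdin:\n    sys.stdout.write(line)\n",
]


def stream_through(cmd, entries):
    """Feed entries to one child process and yield its output lines."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # exchange str instead of bytes
    )

    def feed():
        # Runs in its own thread, so a full stdin pipe blocks only the
        # writer while the main thread keeps draining stdout.
        try:
            for entry in entries:
                proc.stdin.write("\t".join(entry) + "\n")
        except OSError:
            pass  # the child closed its end of the pipe early
        finally:
            try:
                proc.stdin.close()  # signals end of input to the child
            except OSError:
                pass

    writer = threading.Thread(target=feed)
    writer.start()
    for line in proc.stdout:
        yield line
    writer.join()
    proc.stdout.close()
    if proc.wait() != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)


if __name__ == "__main__":
    rows = (("chr1", str(i), str(i + 10)) for i in range(3))
    for out_line in stream_through(LINE_ECHO_CMD, rows):
        print(out_line, end="")
```

The generator is consumed lazily, so neither the input nor the output has to fit in memory at once. As the answer warns, the output lines cannot be matched to individual input lines in general.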
    

© 2021 The Archive Base. All Rights Reserved