I’m attempting to use pipes to communicate between processes in Python. These processes will be called from different threads, and so may not have direct access to the Popen object for each process. I’ve written the script below as a simple proof of concept, but have found that my receiving process never terminates.
import os
import subprocess
import traceback
import shlex

if __name__ == '__main__':
    (fd_out, fd_in) = os.pipe()
    pipe_in = os.fdopen(fd_in, 'w')
    pipe_out = os.fdopen(fd_out, 'r')
    file_out = open('outfile.data', 'w+')
    cmd1 = 'cat ' + ' '.join('parts/%s' % x for x in sorted(os.listdir('parts')))
    cmd2 = 'pbzip2 -d -c'
    pobj1 = subprocess.Popen(shlex.split(cmd1), stdout=pipe_in)
    pobj2 = subprocess.Popen(shlex.split(cmd2), stdin=pipe_out,
                             stdout=file_out)
    print 'closing pipe in'
    pipe_in.close()
    print 'closing pipe out'
    pipe_out.close()
    print 'closing file out'
    file_out.close()
    print 'waiting on process 2'
    pobj2.wait()
    print 'done'
This runs correctly in many ways. The data chunks get piped to the 2nd process, and the 2nd process decompresses the stream and writes it to a file. I can watch the processes until they seem to be just waiting (and doing nothing), terminate the 2nd process, and the file seems to be completely written.
So, I’m wondering why the 2nd process never terminates. It seems that it never realizes that the input stream has been closed. How do I close the pipe properly, so that the process knows to terminate?
david_clymer@zapazoid:/home/tmp/db$ python test.py
closing pipe in
closing pipe out
closing file out
waiting on process 2
^Z
[1]+ Stopped python test.py
david_clymer@zapazoid:/home/tmp/db$ bg
[1]+ python test.py &
david_clymer@zapazoid:/home/tmp/db$ jobs -l
[1]+ 31533 Running python test.py &
david_clymer@zapazoid:/home/tmp/db$ ps -fp 31533
UID PID PPID C STIME TTY TIME CMD
1000 31533 22536 0 15:22 pts/2 00:00:00 python test.py
david_clymer@zapazoid:/home/tmp/db$ lsof |grep $(pwd)
bash 3432 david_clymer cwd DIR 253,3 483328 408117 /home/tmp/db
bash 22536 david_clymer cwd DIR 253,3 483328 408117 /home/tmp/db
python 31533 david_clymer cwd DIR 253,3 483328 408117 /home/tmp/db
pbzip2 31535 david_clymer cwd DIR 253,3 483328 408117 /home/tmp/db
pbzip2 31535 david_clymer 1u REG 253,3 12255300000 397270 /home/tmp/db/outfile.data
pbzip2 31535 31536 david_clymer cwd DIR 253,3 483328 408117 /home/tmp/db
pbzip2 31535 31536 david_clymer 1u REG 253,3 12255300000 397270 /home/tmp/db/outfile.data
pbzip2 31535 31537 david_clymer cwd DIR 253,3 483328 408117 /home/tmp/db
pbzip2 31535 31537 david_clymer 1u REG 253,3 12255300000 397270 /home/tmp/db/outfile.data
pbzip2 31535 31538 david_clymer cwd DIR 253,3 483328 408117 /home/tmp/db
pbzip2 31535 31538 david_clymer 1u REG 253,3 12255300000 397270 /home/tmp/db/outfile.data
pbzip2 31535 31539 david_clymer cwd DIR 253,3 483328 408117 /home/tmp/db
pbzip2 31535 31539 david_clymer 1u REG 253,3 12255300000 397270 /home/tmp/db/outfile.data
pbzip2 31535 31540 david_clymer cwd DIR 253,3 483328 408117 /home/tmp/db
pbzip2 31535 31540 david_clymer 1u REG 253,3 12255300000 397270 /home/tmp/db/outfile.data
pbzip2 31535 31541 david_clymer cwd DIR 253,3 483328 408117 /home/tmp/db
pbzip2 31535 31541 david_clymer 1u REG 253,3 12255300000 397270 /home/tmp/db/outfile.data
pbzip2 31535 31542 david_clymer cwd DIR 253,3 483328 408117 /home/tmp/db
pbzip2 31535 31542 david_clymer 1u REG 253,3 12255300000 397270 /home/tmp/db/outfile.data
pbzip2 31535 31543 david_clymer cwd DIR 253,3 483328 408117 /home/tmp/db
pbzip2 31535 31543 david_clymer 1u REG 253,3 12255300000 397270 /home/tmp/db/outfile.data
pbzip2 31535 31544 david_clymer cwd DIR 253,3 483328 408117 /home/tmp/db
pbzip2 31535 31544 david_clymer 1u REG 253,3 12255300000 397270 /home/tmp/db/outfile.data
lsof 31599 david_clymer cwd DIR 253,3 483328 408117 /home/tmp/db
grep 31600 david_clymer cwd DIR 253,3 483328 408117 /home/tmp/db
lsof 31602 david_clymer cwd DIR 253,3 483328 408117 /home/tmp/db
david_clymer@zapazoid:/home/tmp/db$ strace -p 31533
Process 31533 attached - interrupt to quit
wait4(31535, ^C <unfinished ...>
Process 31533 detached
I imagine I am doing something stupid. I’d like to know what, and why.
The second process is probably inheriting the input end of the pipe, which therefore never gets closed. I’m not a Python expert, but perhaps it’s possible to avoid this by Popening the second process first with a stdin=PIPE, then Popen the first process with the second process’s .stdin as its stdout. (Popen probably arranges for the process not to have a handle to the input end of the pipe that it creates internally.)

In order to work around the file descriptor inheritance, create the subprocesses with close_fds=True.
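Both suggestions above can be sketched roughly as follows. This is a minimal illustration and not the original poster's script. It assumes Python 3 on a POSIX system, and it replaces cat and pbzip2 with two small stand-in commands (PRODUCER_CMD and CONSUMER_CMD, both hypothetical) so that it runs anywhere. The function names are also made up for this example. On Python 3, close_fds already defaults to True on POSIX; on Python 2, which the question's script uses, it defaults to False and has to be passed explicitly.

```python
import os
import subprocess
import sys

# Hypothetical stand-ins for "cat parts/..." and "pbzip2 -d -c".
PRODUCER_CMD = [sys.executable, "-c",
                "import sys; sys.stdout.write('hello')"]
CONSUMER_CMD = [sys.executable, "-c",
                "import sys; sys.stdout.write(sys.stdin.read().upper())"]


def run_with_close_fds(producer_cmd, consumer_cmd):
    """Connect two processes with os.pipe(), passing close_fds=True."""
    read_fd, write_fd = os.pipe()
    # close_fds=True stops each child from inheriting descriptors it was
    # not explicitly given, so the consumer holds no copy of write_fd.
    producer = subprocess.Popen(producer_cmd, stdout=write_fd, close_fds=True)
    consumer = subprocess.Popen(consumer_cmd, stdin=read_fd,
                                stdout=subprocess.PIPE, close_fds=True)
    # The parent must drop its own copies as well; otherwise the consumer
    # never sees end-of-file on its stdin.
    os.close(write_fd)
    os.close(read_fd)
    output = consumer.stdout.read()
    consumer.stdout.close()
    consumer.wait()
    producer.wait()
    return output


def run_consumer_first(producer_cmd, consumer_cmd):
    """Start the consumer first and let Popen create the pipe."""
    consumer = subprocess.Popen(consumer_cmd, stdin=subprocess.PIPE,
                                stdout=subprocess.PIPE, close_fds=True)
    producer = subprocess.Popen(producer_cmd, stdout=consumer.stdin,
                                close_fds=True)
    # Close the parent's handle to the write end once the producer has it.
    consumer.stdin.close()
    output = consumer.stdout.read()
    consumer.stdout.close()
    consumer.wait()
    producer.wait()
    return output


if __name__ == "__main__":
    print(run_with_close_fds(PRODUCER_CMD, CONSUMER_CMD))
    print(run_consumer_first(PRODUCER_CMD, CONSUMER_CMD))
```

In both variants the key point is that every copy of the pipe's write end, including the parent's, is closed once the producer is running. The reader only gets end-of-file after the last writer has gone away.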