This Python code pipes data through a Perl script without problems:
import subprocess
kw = {}
kw['executable'] = None
kw['shell'] = True
kw['stdin'] = None
kw['stdout'] = subprocess.PIPE
kw['stderr'] = subprocess.PIPE
args = ' '.join(['/usr/bin/perl','-w','/path/script.perl','<','/path/mydata'])
subproc = subprocess.Popen(args,**kw)
for line in iter(subproc.stdout.readline, ''):
    print line.rstrip().decode('UTF-8')
However, it requires me to first save my buffers to a disk file (/path/mydata). It would be cleaner to loop through the data in Python and pass it line by line to the subprocess, like this:
import codecs
import subprocess
kw = {}
kw['executable'] = '/usr/bin/perl'
kw['shell'] = False
kw['stderr'] = subprocess.PIPE
kw['stdin'] = subprocess.PIPE
kw['stdout'] = subprocess.PIPE
args = ['-w','/path/script.perl',]
subproc = subprocess.Popen(args,**kw)
f = codecs.open('/path/mydata','r','UTF-8')
for line in f:
    subproc.stdin.write('%s\n'%(line.strip().encode('UTF-8')))
    print line.strip() ### code hangs after printing this ###
    for line in iter(subproc.stdout.readline, ''):
        print line.rstrip().decode('UTF-8')
subproc.terminate()
f.close()
The code hangs at the readline call after sending the first line to the subprocess. Other executables work perfectly with this exact same code.
My data files can be quite large (1.5 GB). Is there a way to pipe the data without saving it to a file? I don’t want to rewrite the Perl script, because it needs to stay compatible with other systems.
Thanks, srgerg. I had also tried the threading solution. On its own, however, it always hung. Both my previous code and srgerg’s code were missing the final piece. Your tip gave me one last idea.
The final solution writes enough dummy data to force the final valid lines out of the buffer. To support this, I added code that tracks how many valid lines were written to stdin. The threaded loop opens the output file, saves the data, and breaks when the number of lines read equals the number of valid input lines. This ensures that it reads and writes line by line for a file of any size.
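As a rough illustration of the reader-thread part of this, here is a minimal Python 3 sketch. It is not the exact code described above: `pipe_lines` and `CHILD_SOURCE` are hypothetical names, and a small Python filter stands in for the Perl script so the example is self-contained. It also closes stdin instead of padding with dummy data, which assumes the child writes out everything it still has buffered once it sees end-of-file.

```python
import subprocess
import sys
import threading

# Hypothetical stand-in for /path/script.perl: echoes each input line upper-cased.
CHILD_SOURCE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
)


def pipe_lines(lines, cmd):
    """Write `lines` to cmd's stdin while a reader thread collects its stdout.

    Returns (number of lines written, list of output lines).
    """
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    collected = []

    def drain():
        # Runs concurrently with the writes below, so the child's stdout pipe
        # is emptied while the main thread keeps feeding stdin.
        for raw in proc.stdout:
            collected.append(raw.decode("utf-8").rstrip("\r\n"))

    reader = threading.Thread(target=drain)
    reader.start()

    written = 0
    for line in lines:
        proc.stdin.write((line.strip() + "\n").encode("utf-8"))
        written += 1  # track how many valid lines were sent
    proc.stdin.close()  # assumption: EOF makes the child flush its buffer

    reader.join()
    proc.wait()
    return written, collected
```

Calling `pipe_lines(open_file, ['/usr/bin/perl', '-w', '/path/script.perl'])` would be the equivalent for the Perl script. Comparing the returned count with `len(collected)` gives the same check as the line counter described above, provided the script emits exactly one output line per input line.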