Platform: Windows
Grep: http://gnuwin32.sourceforge.net/packages/grep.htm
Python: 2.7.2
Windows command prompt used to execute the commands.
I am searching for the following pattern "2345$" in a file.
Contents of the file are as follows:
abcd 2345
2345
abcd 2345$
grep "2345$" file.txt
grep returns 2 lines (first and second) successfully.
When I try to run the above command through python I don’t see any output.
Python code snippet is as follows:
temp = open('file.txt', "r+")
grep_cmd = []
grep_cmd.extend([grep, '"2345$"' ,temp.name])
print grep_cmd
p = subprocess.Popen(grep_cmd,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE)
stdoutdata = p.communicate()[0]
print stdoutdata
If I have
grep_cmd.extend([grep, '2345$' ,temp.name])
in my python script, I get the correct answer.
The question is why the grep command with " in the pattern,
grep_cmd.extend([grep, '"2345$"' ,temp.name])
fails when executed from python. Isn’t python supposed to execute
the command as it is?
Thanks
Gudge.
Do not put double quotes around your pattern. They are only needed on the command line, to quote shell metacharacters. When calling a program from python with an argument list, you do not need them.
You also do not need to open the file yourself – grep will do that:
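A minimal sketch of such a call, assuming grep is reachable on the PATH as `grep` (the question does not show how its `grep` variable is set, so that name is a placeholder) and that the file is `file.txt` as in the question. The helper names `build_grep_cmd` and `run_grep` are made up for illustration:

```python
import subprocess

def build_grep_cmd(pattern, filename, grep='grep'):
    # Hypothetical helper: each list element reaches grep as one literal
    # argument, so the pattern carries no shell-style quotes.
    return [grep, pattern, filename]

def run_grep(pattern, filename, grep='grep'):
    # grep opens the file itself; we only hand it the file name.
    p = subprocess.Popen(build_grep_cmd(pattern, filename, grep),
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    return p.communicate()[0]

# Example call (needs grep installed):
#   print(run_grep('2345$', 'file.txt'))
```

Passing the file name as a plain string replaces the `open('file.txt', "r+")` and `temp.name` steps from the question.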
To understand the reason for the double quotes not being needed and causing your command to fail, you need to understand the purpose of the double quotes and how they are processed.
The shell uses double quotes to prevent special processing of some shell metacharacters. Shell metacharacters are those characters that the shell handles specially and does not pass literally to the programs it executes. The most commonly used shell metacharacter is “space”. The shell splits a command on space boundaries to build an argument vector to execute a program with. If you want to include a space in an argument, it must be quoted in some way (single or double quotes, backslash, etc). Another is the dollar sign ($), which is used to signify variable expansion.
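The splitting and quote removal described above can be imitated with Python's `shlex` module. This is only an approximation: `shlex.split` follows POSIX-shell-like quoting rules, not those of the Windows command prompt, and it performs no variable expansion. It does show that the quotes are consumed while the argument vector is built:

```python
import shlex

# Split a command line the way a POSIX-like shell would.
argv = shlex.split('grep "2345$" file.txt')
print(argv)

# A quoted or backslash-escaped space stays inside a single argument.
spaced = shlex.split('grep "abcd 2345" my\\ file.txt')
print(spaced)
```

In both results the pattern element contains no double quotes: they were instructions to the splitter, not part of the argument.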
When you are executing a program without the shell involved, all these rules about quoting and shell metacharacters are not relevant. In python, you are building the argument vector yourself, so the relevant quoting rules are python quoting rules (e.g. to include a double quote inside a double-quoted string, prefix the double quote with a backslash – the backslash will not be in the final string). The characters in each element of the argument vector when you have completed constructing it are the literal characters that will be passed to the program you are executing.
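To see which characters actually arrive in the child process, a small child can echo its argument back. In this sketch `sys.executable` (the running Python interpreter) stands in for grep so that it is self-contained. `child_sees` is a hypothetical helper name, and the result is what I would expect on a typical setup rather than a guarantee for every platform:

```python
import subprocess
import sys

def child_sees(arg):
    # Run a child Python process that writes its first argument back.
    # sys.executable stands in for grep here.
    cmd = [sys.executable, '-c',
           'import sys; sys.stdout.write(sys.argv[1])', arg]
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                         universal_newlines=True)
    return p.communicate()[0]

with_quotes = child_sees('"2345$"')   # the quotes are part of the string
without_quotes = child_sees('2345$')  # only the five pattern characters
print(repr(with_quotes))
print(repr(without_quotes))
```

The single quotes are python syntax and disappear. The double quotes inside them are data and are passed along.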
Grep does not treat double quotes as special characters, so if grep gets double quotes in its search pattern, it will attempt to match double quotes from its input.
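The effect on the file from the question can be sketched with Python's `re` module as a rough stand-in for grep. This is an assumption: the two regex dialects differ, for example in how a `$` in the middle of a pattern is treated. For these two patterns the outcome is the same either way, because no line in the file contains a double quote:

```python
import re

lines = ['abcd 2345', '2345', 'abcd 2345$']

def matching(pattern):
    # Rough stand-in for grep: keep lines where the regex matches anywhere.
    return [line for line in lines if re.search(pattern, line)]

print(matching('2345$'))    # pattern without the extra quotes
print(matching('"2345$"'))  # pattern with literal double quotes
```

The first call selects the first two lines, as in the question. The second selects nothing.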
My original answer’s reference to shell=True was incorrect: first, I did not notice that you had originally specified shell=True, and secondly, I was coming from the perspective of a Unix/Linux implementation, not Windows.
The python subprocess module page has this to say about shell=True and Windows:
That linked section on converting an argument sequence to a string on Windows does not make sense to me. First, a string is a sequence, and so is a list, yet the Frequently Used Arguments section says this about arguments:
This contradicts the conversion process described in the Python documentation, and given the behaviour you have observed, I’d say the documentation is wrong and only applies to an argument string, not an argument vector. I cannot verify this myself as I do not have Windows or the source code for Python lying around.
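One way to inspect the list-to-string conversion without a Windows machine is to call `subprocess.list2cmdline` directly. Treat this as a sketch under assumptions: the function is a helper inside the `subprocess` module rather than a clearly public API, and its output shows only the command line Python builds, not how a particular grep build later parses it:

```python
import subprocess

# list2cmdline is the helper subprocess uses on Windows to turn an
# argument list into a command-line string. It is pure Python, so it
# can be called on any platform just to look at the result.
quoted = subprocess.list2cmdline(['grep', '"2345$"', 'file.txt'])
plain = subprocess.list2cmdline(['grep', '2345$', 'file.txt'])
print(quoted)
print(plain)
```

In the versions I am aware of, embedded double quotes come out backslash-escaped. That would mean they are preserved for the child rather than stripped, which is consistent with the behaviour observed in the question.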
I suspect that if you call subprocess.Popen like:
you may find that the double quotes are stripped out as part of the documented argument conversion.