Receive a string from somewhere and the string is a sequence of params. Params are separated by whitespace. The task is parse the string to a param list, all params are type string.
For example:
input : "3 45 5.5 a bc"
output : ["3","45","5.5","a","bc"]
Things become little complicated if need to transfer a string include a whitespace, use " to quote.
input: "3 45 5.5 \"This is a sentence.\" bc"
output: ["3","45","5.5","This is a sentence.","bc"]
But what if the sentence happened to include a quote mark? Use the escape character: \" -> ", \\ -> \
input: "3 45 5.5 \"\\\"Yes\\\\No?\\\" it said.\" bc"
output: ['3','45','5.5','"Yes\\NO?" it said.','bc']
Does python have an elegant way to do this job?
PS. I don’t think regular expressions can work this out.
Use the
shlex.split()function:You can create a customizable parser with the
shlex.shlexfunction, then alter it’s behaviour by setting its attributes. For example, you can set the.whitespaceattribute to', \t\r\n'to allow commas to delimit words as well. Then simply convert theshlexinstance back to a list to split the input.