I would like to split a string in PHP containing quoted and unquoted substrings.
Let’s say I have the following string:
"this is a string" cat dog "cow"
The resulting array should look like this:
array (
[0] => "this is a string"
[1] => "cat"
[2] => "dog"
[3] => "cow"
)
I’m struggling a bit with regex and I’m wondering if it is even possible to achieve this with just one regex/preg_split call.
The first thing I tried was:
[[:blank:]]*(?=(?:[^"]*"[^"]*")*[^"]*$)[[:blank:]]*
But this splits only array[0] and array[3] correctly; the rest is split character by character.
Then I found this link:
PHP preg_split with two delimiters unless a delimiter is within quotes
(?=(?:[^"]*"[^"]*")*[^"]*$)
This seems like a good starting point to me. However, the result for my example is the same as with the first regex.
I tried combining both: first the one for quoted strings, and then a second sub-regex which should omit quoted strings (hence the [^"]):
(?=(?:[^"]*"[^"]*")*[^"]*$)|[[:blank:]]*([^"].*[^"])[[:blank:]]*
Therefore 2 questions:
- Is it even possible to achieve what I want with just one regex/preg_split call?
- If yes, I would appreciate a hint on how to assemble the regex correctly
Since matches cannot overlap, you could use preg_match_all with a pattern that alternates between a quoted string and a run of non-whitespace characters.
$matches[0] should then contain what you are looking for. The regex will first try to match a quoted string, and then stop. If that fails, it will just collect as many non-whitespace characters as possible. Since alternations are tried from left to right, the quoted version takes precedence.
EDIT: This will not get rid of the quotes, though. To do this, you could use capturing groups.
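The two preg_match_all variants described above might look like the following sketch. The exact patterns are an assumption reconstructed from the description (quoted string first, otherwise a run of non-whitespace), and they assume quoted strings never contain escaped quotes. The variable names $withQuotes and $withoutQuotes are only illustrative.

```php
<?php
$input = '"this is a string" cat dog "cow"';

// Variant 1: try a quoted string first, otherwise take a run of
// non-whitespace characters. The full matches keep their quotes.
preg_match_all('/"[^"]*"|\S+/', $input, $matches);
$withQuotes = $matches[0];
// ['"this is a string"', 'cat', 'dog', '"cow"']

// Variant 2: the same alternation, with a capturing group in each branch.
// The branch reset group (?| ... ) numbers both groups as group 1,
// so every token ends up in $matches[1] without its quotes.
preg_match_all('/(?|"([^"]*)"|(\S+))/', $input, $matches);
$withoutQuotes = $matches[1];
// ['this is a string', 'cat', 'dog', 'cow']
```

Without the branch reset, the second alternative would be group 2, and the tokens would be spread across $matches[1] and $matches[2].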
With the groups in place, $matches[1] will contain exactly what you are looking for. The (?| is there so that both capturing groups end up at the same index.
EDIT 2: Since you were asking for a preg_split solution, that is also possible. We can use a lookahead that asserts that the space is followed by an even number of quotes (up until the end of the string). Of course, this will not get rid of the quotes, but that can easily be done in a separate step.
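A minimal sketch of the preg_split approach, assuming the lookahead from the question is reused behind a \s+ and that the quotes in the input are balanced. The pattern and the variable names $parts and $clean are assumptions for illustration, and trim() is just one possible way to do the separate quote-stripping step.

```php
<?php
$input = '"this is a string" cat dog "cow"';

// Split on whitespace only where an even number of quotes remains
// up to the end of the string, i.e. where we are outside a quoted part.
$parts = preg_split('/\s+(?=(?:[^"]*"[^"]*")*[^"]*$)/', $input);
// ['"this is a string"', 'cat', 'dog', '"cow"']

// Separate step: strip the surrounding quotes from each piece.
$clean = array_map(function ($part) {
    return trim($part, '"');
}, $parts);
// ['this is a string', 'cat', 'dog', 'cow']
```

Requiring at least one whitespace character (\s+ rather than an optional run) keeps the pattern from matching the empty string, which is what causes a split at every character position outside the quotes.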