I have a file whose contents form a large nested structure, similar to JSON, but not close enough for me to use a JSON library.
The data looks something like this:
alpha {
beta {
charlie;
}
delta;
}
echo;
foxtrot {
golf;
hotel;
}
The regex I am trying to build (for a preg_match_all) should match each top-level parent (delimited by {} braces) so that I can recurse through the matches, building up a multidimensional PHP array that represents the data.
The first regex I tried is /(?<=\{).*(?=\})/s, which greedily matches content inside braces. However, this isn’t quite right: when the top level has more than one sibling, the match is too greedy and runs from the first { to the last }. Example below:
Using the regex /(?<=\{).*(?=\})/s, the match is given as:
Match 1:
beta {
charlie;
}
delta;
}
echo;
foxtrot {
golf;
hotel;
Instead the result should be:
Match 1:
beta {
charlie;
}
delta;
Match 2:
golf;
hotel;
So regex wizards, what function am I missing here, or do I need to solve this with PHP somehow? Any tips very welcome 🙂
You can’t [1] do this with regular expressions.
Alternatively, if you want to match deep-to-shallow blocks, you can use \{[^\{\}]*?\} and preg_replace_callback() to store the value, and return null to erase it from the string. The callback will need to take care of nesting the value accordingly.
Incomplete, not tested, and no warranty.
This approach requires that the string be wrapped in {} as well, otherwise the final match won’t happen and you’ll loop forever.
This is an awful lot of (inefficient) work for something that can just as easily be solved with a well known exchange/storage format such as JSON.
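For illustration only, here is a rough sketch of the deep-to-shallow idea, and not the code from the answer. It is a variation: instead of returning null to erase each innermost block, the callback returns a placeholder such as beta@0; so the parent can pick the stored child up again, and the loop stops as soon as a pass makes no replacement, so the input does not need to be wrapped in {}. The function name parseBlocks, the name@index placeholder format, and the assumptions that block names match \w+ and that values never contain @ are all mine. Unbalanced braces are not detected.

```php
<?php
// Hypothetical helper: turns "name { ... }" / "name;" text into a nested array.
// Assumes names match \w+ and that no value contains "@" (used for placeholders).
function parseBlocks(string $text): array
{
    $stored = []; // parsed bodies of blocks that have already been collapsed

    // Parse a brace-free body such as "beta@0; delta;" into an array.
    $parseBody = function (string $body) use (&$stored): array {
        $result = [];
        foreach (explode(';', $body) as $item) {
            $item = trim($item);
            if ($item === '') {
                continue;
            }
            if (preg_match('/^(\w+)@(\d+)$/', $item, $m)) {
                // Placeholder left by an earlier pass: attach the stored child.
                $result[$m[1]] = $stored[(int) $m[2]];
            } else {
                $result[] = $item;
            }
        }
        return $result;
    };

    // Repeatedly collapse the innermost named blocks (those with no braces inside).
    do {
        $text = preg_replace_callback(
            '/(\w+)\s*\{([^{}]*)\}/',
            function (array $m) use (&$stored, $parseBody): string {
                $stored[] = $parseBody($m[2]);
                return $m[1] . '@' . (count($stored) - 1) . ';';
            },
            $text,
            -1,
            $count
        );
    } while ($count > 0);

    // Whatever is left is the brace-free top level.
    return $parseBody($text);
}
```

Called on the sample data from the question, this should give an array in which alpha maps to an array holding a beta sub-array plus the plain value delta, echo is a plain value, and foxtrot maps to an array of golf and hotel.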
[1] I was going to put “you can, but…”, however I’ll just say once again, “You can’t” [2]
[2] Don’t