The input is a string representing a list of elements.
A list is defined as an open curly { followed by 0 or more elements separated by whitespace followed by a closed curly }.
An element is either a literal or a list of elements.
A literal is a succession of non-whitespace characters. If a literal contains a curly bracket, the bracket must be escaped with a backslash: \{ and \}. (Or you could assume curlies are not allowed inside literals, for simplicity.)
Example:
"{abc { def ghi } 7 { 1 {2} {3 4} } {5 6} x\{yz \}foo }"
No curlies inside literals:
"{abc { def ghi } 7 { 1 {2} {3 4} } {5 6} xyz foo }"
(This is a simplified definition of a Tcl list.)
What I want to know is: can the input be split into the elements of the outermost list using regex?
Expected output:
abc
{ def ghi }
7
{ 1 {2} {3 4} }
{5 6}
x{yz
}foo
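For reference, the expected output above can be reproduced without any regex by a single pass that tracks brace depth. The following is only a minimal sketch in Python, not a regex solution. It assumes the input is well formed, that escaped curlies are unescaped in top-level literals (as in the x{yz line) and left untouched inside nested lists, and the name split_outer is made up for illustration.

```python
def split_outer(text):
    """Split a brace-delimited list into its top-level elements."""
    s = text.strip()
    if not (s.startswith("{") and s.endswith("}")):
        raise ValueError("input must be wrapped in { ... }")
    body = s[1:-1]          # drop the outermost braces
    elements = []
    current = []            # characters of the element being built
    depth = 0               # nesting depth inside the outermost list
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body) and body[i + 1] in "{}":
            # Escaped curly: unescape at top level, keep verbatim when nested
            # (an assumption, chosen to match the expected output).
            current.append(body[i:i + 2] if depth else body[i + 1])
            i += 2
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced closing brace")
        if ch.isspace() and depth == 0:
            # Whitespace outside any nested list ends the current element.
            if current:
                elements.append("".join(current))
                current = []
        else:
            current.append(ch)
        i += 1
    if depth != 0:
        raise ValueError("unbalanced opening brace")
    if current:
        elements.append("".join(current))
    return elements


example = r"{abc { def ghi } 7 { 1 {2} {3 4} } {5 6} x\{yz \}foo }"
for element in split_outer(example):
    print(element)
```

Nested lists are returned as raw text, so the same function can be applied again to any element that starts with a curly.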
The real question is: can this be done with a Regex?
I’m most interested in the .NET flavour, but will accept any answers.
I’ll post my own assumption in an answer, and see if it’s validated or destroyed.
Well, the edit removes curly braces from literals, which takes the sting out of the question: it is now easily doable with .NET regexes, using balancing groups. What remains is simply matching balanced braces, which is the basic example for that feature.
Much like KennyTM’s answer, this will only work if you remove the top level braces, or it will match the whole input.
Again, this is better used for recreational purposes:
For much more detail, see this article: Regex Balancing Group in Depth
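Python's built-in re module has no balancing groups, so the following is only an analogy for what they do, not a .NET pattern: a simple regex produces the tokens, and an explicit counter plays the role of the balancing-group stack. It assumes no curlies inside literals and, as noted above, that the top-level braces have already been removed. The names TOKEN and top_level_elements are made up for illustration.

```python
import re

# One token per match: a single brace, or a run of non-space, non-brace characters.
TOKEN = re.compile(r"[{}]|[^\s{}]+")


def top_level_elements(body):
    """Return the top-level elements of a list body (outer braces removed)."""
    elements = []
    depth = 0       # stands in for the balancing-group stack
    start = None    # index where the current nested list began
    for match in TOKEN.finditer(body):
        token = match.group()
        if token == "{":
            if depth == 0:
                start = match.start()
            depth += 1                      # "push"
        elif token == "}":
            if depth == 0:
                raise ValueError("unbalanced closing brace")
            depth -= 1                      # "pop"
            if depth == 0:
                # The nested list is complete: keep its raw text.
                elements.append(body[start:match.end()])
        elif depth == 0:
            elements.append(token)          # top-level literal
    if depth != 0:
        raise ValueError("unbalanced opening brace")
    return elements


print(top_level_elements("abc { def ghi } 7 { 1 {2} {3 4} } {5 6} xyz foo "))
```

An element is emitted only when the counter returns to zero, which is the same condition a balancing-group pattern checks when it requires the stack to be empty at the end of a match.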