I made an article spinner that used regex to find words in this syntax:
{word1|word2}
It then splits them at the “|”, but I need a way to make it support a second level of nested brackets, such as:
{{word1|word2}|{word3|word4}}
When presented with such a line, my code matches “{{word1|word2}” and “{word3|word4}”, which is not what I intended.
What I want is for my code to break such a line up as “{word1|word2}|{word3|word4}”, so that I can pass the result to the original function and break it into the actual words.
I am using C#.
Here is pseudocode for how it might look:
Check string for regex match to "{{word1|word2}|{word3|word4}}" pattern
If found, store each one as "{word1|word2}|{word3|word4}" in MatchCollection (mc1)
Split it at the "|", but not at the ones inside the brackets, and select a random part (i.e. "{word1|word2}" or "{word3|word4}")
Store the new results, i.e. "{word1|word2}" and "{word3|word4}", in a new MatchCollection (mc2)
Now search the string again, this time looking for "{word1|word2}" only and ignore the double "{{" "}}"
Store these in mc2.
I cannot split these up with a plain split on "|", because that would also split at the "|" inside the inner brackets.
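For the "split at the "|" but not the one inside the brackets" step, one possible approach is to walk the string and track the brace depth. This is only a sketch: TopLevelSplitter and SplitTopLevel are made-up names, and it assumes the outer "{" and "}" have already been stripped and that the braces are balanced.

```csharp
using System.Collections.Generic;
using System.Text;

public static class TopLevelSplitter
{
    // Hypothetical helper: splits on '|' only where the brace depth is 0,
    // so pipes inside nested {...} groups are left alone.
    public static List<string> SplitTopLevel(string inner)
    {
        var parts = new List<string>();
        var current = new StringBuilder();
        int depth = 0;

        foreach (char c in inner)
        {
            if (c == '{') depth++;
            else if (c == '}') depth--;

            if (c == '|' && depth == 0)
            {
                // Top-level separator: close off the current option.
                parts.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        parts.Add(current.ToString());
        return parts;
    }
}
```

Calling TopLevelSplitter.SplitTopLevel("{word1|word2}|{word3|word4}") should give the two parts "{word1|word2}" and "{word3|word4}", each of which can then go through the original single-level function.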
Here is the regex I use to search for “{word1|word2}”:
Regex regexObj = new Regex(@"\{.*?\}", RegexOptions.Singleline);
MatchCollection m = regexObj.Matches(originalText); //How I store them
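To make the problem reproducible, here is a small sketch that runs that same pattern against the nested input. LazyMatchDemo and FindGroups are names I made up for illustration; the pattern and options are the ones above.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LazyMatchDemo
{
    // Returns every substring matched by the original lazy pattern.
    public static List<string> FindGroups(string text)
    {
        var regexObj = new Regex(@"\{.*?\}", RegexOptions.Singleline);
        var results = new List<string>();

        foreach (Match m in regexObj.Matches(text))
        {
            results.Add(m.Value);
        }

        return results;
    }
}
```

LazyMatchDemo.FindGroups("{{word1|word2}|{word3|word4}}") returns "{{word1|word2}" and "{word3|word4}": the lazy .*? stops at the first "}" it sees, so it has no notion of which "{" that "}" belongs to.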
Hopefully someone can help, thanks!
Edit: I solved this using a recursive method. I was building an article spinner btw.
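For anyone landing here later, a recursive version could look roughly like the following. This is not my exact code, just a minimal sketch: Spinner, Spin, FindMatchingBrace and SplitTopLevel are illustrative names, and it assumes the braces in the input are balanced (an unmatched "{" is copied through unchanged).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class Spinner
{
    // Replaces every {a|b|...} group with one randomly chosen option,
    // recursing into the chosen option so nested groups are spun too.
    public static string Spin(string text, Random rng)
    {
        var output = new StringBuilder();
        int i = 0;

        while (i < text.Length)
        {
            if (text[i] != '{')
            {
                output.Append(text[i]);
                i++;
                continue;
            }

            int close = FindMatchingBrace(text, i);
            if (close < 0)
            {
                // Unbalanced input: copy the rest through unchanged.
                output.Append(text.Substring(i));
                break;
            }

            string inner = text.Substring(i + 1, close - i - 1);
            List<string> options = SplitTopLevel(inner);
            string choice = options[rng.Next(options.Count)];
            output.Append(Spin(choice, rng));
            i = close + 1;
        }

        return output.ToString();
    }

    // Returns the index of the '}' that closes the '{' at openIndex, or -1.
    private static int FindMatchingBrace(string text, int openIndex)
    {
        int depth = 0;
        for (int i = openIndex; i < text.Length; i++)
        {
            if (text[i] == '{') depth++;
            else if (text[i] == '}' && --depth == 0) return i;
        }
        return -1;
    }

    // Splits on '|' only at brace depth 0.
    private static List<string> SplitTopLevel(string inner)
    {
        var parts = new List<string>();
        var current = new StringBuilder();
        int depth = 0;

        foreach (char c in inner)
        {
            if (c == '{') depth++;
            else if (c == '}') depth--;

            if (c == '|' && depth == 0)
            {
                parts.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        parts.Add(current.ToString());
        return parts;
    }
}
```

Spinner.Spin("{{word1|word2}|{word3|word4}}", new Random()) should return one of word1, word2, word3 or word4, and text outside any braces is copied through as-is.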
Nested brackets like that are not parsable with a plain regular expression; instead you have to use a recursive descent parser. Or map it to JSON by replacing:
{ with [
} with ]
| with ,
wordX with "wordX" (regex \w+)
Then your input becomes valid JSON and will map directly to PHP arrays when you call json_decode. In C#, the same should be possible with JavaScriptSerializer.
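A rough C# sketch of the JSON mapping is below. Note that it swaps JavaScriptSerializer for System.Text.Json, which assumes .NET Core 3.0 or later; SpinJson, ToJson and PickRandom are hypothetical names. It also assumes the whole input is one bracketed group whose options are single \w+ words or further groups; options containing spaces or punctuation would not produce valid JSON this way.

```csharp
using System;
using System.Text.Json;
using System.Text.RegularExpressions;

public static class SpinJson
{
    // Rewrites spin syntax as a JSON array of (possibly nested) arrays.
    public static string ToJson(string spinText)
    {
        // Quote the words first, while the text contains no quotes yet.
        string json = Regex.Replace(spinText, @"\w+", "\"$0\"");
        return json.Replace('{', '[').Replace('}', ']').Replace('|', ',');
    }

    // Parses the mapped JSON and walks down randomly until it hits a word.
    public static string PickRandom(string spinText, Random rng)
    {
        using (JsonDocument doc = JsonDocument.Parse(ToJson(spinText)))
        {
            JsonElement current = doc.RootElement;
            while (current.ValueKind == JsonValueKind.Array)
            {
                current = current[rng.Next(current.GetArrayLength())];
            }
            return current.GetString();
        }
    }
}
```

With this, SpinJson.ToJson("{{word1|word2}|{word3|word4}}") yields [["word1","word2"],["word3","word4"]], and SpinJson.PickRandom on the same input returns one of the four words.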