I have the following string:
$string = "The man has {NUM_DOGS} dogs.";
I’m parsing this by running it through the following function:
function parse_text($string)
{
    global $num_dogs;
    $string = str_replace('{NUM_DOGS}', $num_dogs, $string);
    return $string;
}
parse_text($string);
Where $num_dogs is a preset variable. Depending on $num_dogs, this could return any of the following strings:
- The man has 1 dogs.
- The man has 2 dogs.
- The man has 500 dogs.
The problem is that in the case that “the man has 1 dogs”, dog is pluralised, which is undesired. I know that this could be solved simply by not using the parse_text function and instead doing something like:
if ($num_dogs == 1) {
    $string = "The man has 1 dog.";
} else {
    $string = "The man has $num_dogs dogs.";
}
But in my application I’m parsing more than just {NUM_DOGS} and it’d take a lot of lines to write all the conditions.
I need a shorthand that I can write into the initial $string and run through a parser, and that ideally wouldn’t limit me to just two true/false possibilities.
For example, let
$string = 'The man has {NUM_DOGS} [{NUM_DOGS}|0=>"dogs",1=>"dog called fred",2=>"dogs called fred and harry",3=>"dogs called fred, harry and buster"].';
Is it clear what’s happened at the end? I’ve attempted to initiate the creation of an array using the part inside the square brackets that’s after the vertical bar, then compare the key of the new array with the parsed value of {NUM_DOGS} (which by now will be the $num_dogs variable at the left of the vertical bar), and return the value of the array entry with that key.
If that’s not totally confusing, is it possible using the preg_* functions?
The premise of your question is that you want to match a specific pattern and then replace it after performing additional processing on the matched text.
Seems like an ideal candidate for preg_replace_callback.
The regular expressions for capturing matched parentheses, quotes, braces etc. can become quite complicated, and doing it all with a regular expression is in fact quite inefficient. You’d need to write a proper parser if that’s what you require.
For this question I’m going to assume a limited level of complexity, and tackle it with a two stage parse using regex.
First of all, the simplest regex I can think of for capturing tokens between curly braces.
Let’s break that down.
When applied to a string with preg_match_all, the results look something like:
Looks good so far.
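One minimal sketch of such a pattern, assuming tokens never contain a closing brace. The sample string and the variable names are illustrative only, and PHP 5.4+ is assumed for the short array syntax used when checking the result.

```php
<?php
// Hypothetical pattern, broken down:
//   \{        a literal opening brace
//   ([^}]+)   group 1: one or more characters that are not a closing brace
//   \}        a literal closing brace
$pattern = '/\{([^}]+)\}/';

// Illustrative input with one plain token and one token carrying options.
$subject = 'The man has {NUM_DOGS} {NUM_DOGS|0=>"dogs",1=>"dog"}.';

preg_match_all($pattern, $subject, $matches);

// $matches[0] holds the full matches including the braces,
// $matches[1] holds only the text between the braces.
print_r($matches[1]);
// Array
// (
//     [0] => NUM_DOGS
//     [1] => NUM_DOGS|0=>"dogs",1=>"dog"
// )
```

Because the character class stops at the first closing brace, this only behaves while the braces are not nested.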
Please note that if you have nested braces in your strings, e.g. {TOK_TWO|0=>"hi {x} y"}, this regex will not work. If this won’t be a problem, skip down to the next section.
It is possible to do top-level matching, but the only way I have ever been able to do it is via recursion. Most regex veterans will tell you that as soon as you add recursion to a regex, it stops being a regex.
This is where the additional processing complexity kicks in, and with long complicated strings it’s very easy to run out of stack space and crash your program. Use it carefully if you need to use it at all.
The recursive regex is taken from one of my other answers and modified a little.
Broken down.
And this time the output only matches top-level braces.
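A sketch of what a recursive pattern of this kind can look like, assuming a PCRE build that supports (?R) recursion and possessive quantifiers. It is not necessarily the exact pattern referred to above, and the sample string is illustrative.

```php
<?php
// Hypothetical recursive pattern, written in extended (x) mode so the
// breakdown can sit next to each part.
$pattern = '~
    \{                # opening brace
    (                 # group 1: everything inside the top-level braces
        (?:
            [^{}]++   # a run of non-brace characters (possessive, to limit backtracking)
          |
            (?R)      # or a nested brace pair, matched by recursing into the whole pattern
        )*
    )
    \}                # the matching closing brace
~x';

$subject = 'A {TOK_ONE} and {TOK_TWO|0=>"hi {x} y"}.';

preg_match_all($pattern, $subject, $matches);

// Only the top-level tokens are reported; the nested {x} stays
// inside the second match instead of becoming a match of its own.
print_r($matches[1]);
// Array
// (
//     [0] => TOK_ONE
//     [1] => TOK_TWO|0=>"hi {x} y"
// )
```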
Again, don’t use the recursive regex unless you have to. (Your system may not even support it if it has an old PCRE library.)
With that out of the way we need to work out if the token has options associated with it. Instead of having two fragments to be matched as per your question, I’d recommend keeping the options with the token as per my examples.
{TOKEN|0=>"option"}
Let’s assume $match contains a matched token. If we check for a pipe (|) and take the substring of everything after it, we’ll be left with your list of options, and again we can use regex to parse them out. (Don’t worry, I’ll bring everything together at the end.)
/(\d+)\s*=>\s*"([^"]*)",?/
Broken down.
And an example match
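As a rough sketch of this option-parsing step: the value of $match and the names $options and $pairs are made up for illustration. The digits are grouped as (\d+) so that multi-digit keys are captured whole.

```php
<?php
// Hypothetical matched token, as it might arrive from the brace-matching step.
$match = 'NUM_DOGS|0=>"dogs",1=>"dog called fred",2=>"dogs called fred and harry"';

$options = [];
$pipe = strpos($match, '|');
if ($pipe !== false) {
    // Everything after the first pipe is the option list.
    $optionString = substr($match, $pipe + 1);

    // (\d+)     group 1: the numeric key
    // \s*=>\s*  the arrow, with optional whitespace around it
    // "([^"]*)" group 2: the quoted value (no embedded quotes)
    // ,?        an optional trailing comma
    preg_match_all('/(\d+)\s*=>\s*"([^"]*)",?/', $optionString, $pairs, PREG_SET_ORDER);

    // PREG_SET_ORDER yields one array per key/value pair.
    foreach ($pairs as $pair) {
        $options[(int) $pair[1]] = $pair[2];
    }
}

print_r($options);
// Array
// (
//     [0] => dogs
//     [1] => dog called fred
//     [2] => dogs called fred and harry
// )
```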
If you want to use quotes inside your quotes, you’ll have to make your own recursive regex for it.
Wrapping up, here’s a working example.
Some initialisation code.
And everything together.
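A minimal sketch of how the pieces could fit together with preg_replace_callback, assuming PHP 7+ and the simple non-nested brace pattern. The $tokens array, the two-argument parse_text signature, and the fallback of leaving unrecognised text untouched are choices made for this sketch, not part of the original code.

```php
<?php
// Hypothetical token table standing in for the initialisation code.
$tokens = ['NUM_DOGS' => 1];

function parse_text(string $string, array $tokens): string
{
    $result = preg_replace_callback(
        '/\{([^}]+)\}/',
        function (array $m) use ($tokens): string {
            // Split "NAME|options" into the token name and its option list.
            $parts = explode('|', $m[1], 2);
            $name = $parts[0];
            if (!array_key_exists($name, $tokens)) {
                return $m[0]; // unknown token: leave the text as it was
            }
            $value = $tokens[$name];
            if (!isset($parts[1])) {
                return (string) $value; // plain {TOKEN}: substitute the value
            }
            // Parse 0=>"...",1=>"..." into an array keyed by number.
            preg_match_all('/(\d+)\s*=>\s*"([^"]*)",?/', $parts[1], $pairs, PREG_SET_ORDER);
            $options = [];
            foreach ($pairs as $pair) {
                $options[(int) $pair[1]] = $pair[2];
            }
            // No option for this value: fall back to the untouched text.
            return $options[$value] ?? $m[0];
        },
        $string
    );

    // preg_replace_callback returns null on a regex error.
    return $result ?? $string;
}

$string = 'The man has {NUM_DOGS} {NUM_DOGS|0=>"dogs",1=>"dog called fred",'
        . '2=>"dogs called fred and harry",3=>"dogs called fred, harry and buster"}.';

echo parse_text($string, $tokens), "\n";
// The man has 1 dog called fred.
```

Passing the token values in as an argument rather than reading globals keeps the function easy to test; swapping in the recursive pattern would be needed only if option text can itself contain braces.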
Please note that the error checking is minimal; there will be unexpected results if you pick options that don’t exist.
There’s probably a much simpler way to do all of this, but I just took the idea and ran with it.