Just wondering if there is a set of design patterns for complex string manipulation?
Basically the problem I am trying to solve is I need to be able to read in a string, like the following:
“[name_of_kicker] looks to make a clearance kick, but is under some real pressure from the [name_of_defending_team] players. He gets a [length_of_kick] kick away, but it drifts into touch on the full.”
or
“[name_of_kicker] receives the ball from [name_of_passer] and launches the bomb. [name_of_kicker] has really made good contact, it’s given a couple of [name_of_attacking_team] chasers ample time to get under the ball as it comes down.”
And replace each "tag" with a possible value and check if the string is equal to another string.
So, for example, any tag that represents a player needs to be replaceable with any one of 22 string values that represent a player. But I also need to make sure I have looped through every combination of players for the various tags that I may find in a string. NOTE: the tags listed in the two samples above are not the only tags possible; there are countless other ones that could come up in any sentence.
I tried to create a load of nested for loops to go through the collection of players, etc., and replace the tags each time. But with so many possible tags I was just creating one nested for loop within another, and it has become unmanageable. I also suspect it is inefficient, since I need to loop through over 1,000 base strings like the samples above and replace different tags with players, etc., for each one.
So are there any string manipulation patterns I could look into, or does anyone have any possible solutions to a problem like this?
Firstly, to answer your question.
Not really. There are some techniques, but they hardly qualify as design patterns. The two techniques that spring to mind are template expansion and pattern matching.
What you are currently doing / proposing to do is a form of template expansion. However, typical templating engines don’t support the combinatorial expansion that you are trying to do, and as you anticipate, it would appear to be an inefficient way to solve your problem.
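To make the cost of combinatorial expansion concrete, here is a minimal sketch in Java of doing it recursively rather than with hand-written nested loops. The class name TemplateExpander, the method expand, and the values map (tag name to allowed strings) are hypothetical names chosen for illustration, and the sketch assumes that a tag appearing more than once in a template must take the same value each time.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TemplateExpander {
    // Matches a tag such as [name_of_kicker].
    private static final Pattern TAG = Pattern.compile("\\[(\\w+)\\]");

    /** Returns every string obtainable by substituting allowed values for the tags. */
    public static List<String> expand(String template, Map<String, List<String>> values) {
        // Collect the distinct tags in order of first appearance.
        Set<String> tags = new LinkedHashSet<>();
        Matcher m = TAG.matcher(template);
        while (m.find()) {
            tags.add(m.group(1));
        }
        List<String> out = new ArrayList<>();
        expand(template, new ArrayList<>(tags), 0, values, out);
        return out;
    }

    // One level of recursion per distinct tag replaces one level of nested loop.
    private static void expand(String partial, List<String> tags, int index,
                               Map<String, List<String>> values, List<String> out) {
        if (index == tags.size()) {
            out.add(partial);
            return;
        }
        String tag = tags.get(index);
        List<String> options = values.get(tag);
        if (options == null) {
            throw new IllegalArgumentException("No values for tag: " + tag);
        }
        for (String value : options) {
            // Assumption: replacement values never themselves contain a tag.
            expand(partial.replace("[" + tag + "]", value), tags, index + 1, values, out);
        }
    }
}
```

The number of strings produced is the product of the option counts of the distinct tags. With 22 possible players, a template with two player tags expands to 484 strings and one with three player tags to 10,648, and that is per template, which is why this approach scales poorly across 1,000 or more base strings.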
A better technique would appear to be pattern matching. Let’s take your first example, and turn it into a pattern:
What I’ve done is insert all of the possible alternatives into the pseudo-template, to turn it into a regex. I can now compile this regex to a java.util.regex.Pattern, and use it to match against your list of other strings.
Having said that, if you are trying to do this to “analyse” text, I don’t rate your chances of success. I think you would be better off going down the NLP route.
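As a rough illustration of the pattern-matching idea, the conversion from template to regex could be automated along the following lines. This is a sketch under stated assumptions, not a definitive implementation: the class name TemplateMatcher, the method toPattern, and the values map (tag name to allowed strings) are hypothetical, and it assumes that a tag repeated within one template must match the same value each time, which it enforces with a backreference.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class TemplateMatcher {
    // Matches a tag such as [name_of_kicker].
    private static final Pattern TAG = Pattern.compile("\\[(\\w+)\\]");

    /** Builds a regex in which each tag becomes an alternation of its allowed values. */
    public static Pattern toPattern(String template, Map<String, List<String>> values) {
        StringBuilder regex = new StringBuilder();
        // Capturing-group number assigned to each tag on its first occurrence.
        Map<String, Integer> groupOf = new HashMap<>();
        Matcher m = TAG.matcher(template);
        int last = 0;
        while (m.find()) {
            // Quote the literal text so '.', '?' and similar are not read as regex syntax.
            regex.append(Pattern.quote(template.substring(last, m.start())));
            String tag = m.group(1);
            Integer group = groupOf.get(tag);
            if (group != null) {
                // Repeated tag: require the same text as the first occurrence.
                regex.append('\\').append(group.intValue());
            } else {
                List<String> options = values.get(tag);
                if (options == null) {
                    throw new IllegalArgumentException("No values for tag: " + tag);
                }
                groupOf.put(tag, groupOf.size() + 1);
                regex.append(options.stream()
                        .map(Pattern::quote)
                        .collect(Collectors.joining("|", "(", ")")));
            }
            last = m.end();
        }
        regex.append(Pattern.quote(template.substring(last)));
        return Pattern.compile(regex.toString());
    }
}
```

Each base string is compiled once, and every candidate string is then checked with toPattern(template, values).matcher(candidate).matches(). The regex engine does the searching through alternatives, so no expanded strings are ever materialised.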