I’m looking for the most elegant way to parse this. I’m hitting a wall with my regex knowledge, and maybe a regex is not even the best answer?
I have three example sentences that show what I want to do. I want to parse each of them into four parts: attacker, attack-type, damage and target.
Gandalfs’s heavenly wrath DISMEMBERS you!
The Holy Prelate’s slash wounds Frodo.
Your divine power decimates the evil Warlock!
Attacker: One or several words that always come first. They can be identified by either being "Your" or ending in ’s.
Attack-type: One or several words that can only be identified by their position between the attacker and the damage.
Damage: One or more words (more than one is rare, but it happens) taken from a limited set of unique words. I have a list of the possible words, e.g. {"wounds", "decimates", etc.}. They do not exist anywhere else, so there is no risk that the attacker is named "wounds" or something like that.
Target: One or several words, identified as everything that comes after the damage.
The following regex will return a match with four captures for each line:
Note that you need to use the following regex options for it to work:
You can then query the value of the groups (attacker, type, damage, target) for each match.
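To illustrate querying the named groups, here is a minimal Python sketch. It is not the regex from this answer: the pattern, the `parse_line` helper and the three-entry damage list are assumptions made up for the example, and `re.VERBOSE` and `re.IGNORECASE` are simply the flags this sketch happens to need.

```python
import re

# Assumed pattern for illustration only; the damage alternation is incomplete.
LINE_RE = re.compile(
    r"""
    ^
    (?P<attacker> Your | .+? ['’]s )     # literal Your, or words ending in 's
    \s+
    (?P<type> .+? )                      # whatever sits before the damage word
    \s+
    (?P<damage> wounds | decimates | dismembers )
    \s+
    (?P<target> .+? )                    # everything after the damage word
    [.!]? $                              # optional closing punctuation
    """,
    re.VERBOSE | re.IGNORECASE,
)


def parse_line(line):
    """Return the four named groups as a dict, or None if the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


SENTENCES = [
    "Gandalfs’s heavenly wrath DISMEMBERS you!",
    "The Holy Prelate’s slash wounds Frodo.",
    "Your divine power decimates the evil Warlock!",
]

for sentence in SENTENCES:
    parts = parse_line(sentence)
    print(parts["attacker"], "|", parts["type"], "|", parts["damage"], "|", parts["target"])
```

The lazy `.+?` quantifiers make the attack-type stop at the first damage word, which relies on the guarantee that damage words never appear anywhere else in the sentence.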
Note that you need to complete the list of damages.
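One way to keep the damage list maintainable is to generate the alternation from the list itself. The following Python sketch assumes a hypothetical `build_line_regex` helper and a made-up multi-word entry ("barely scratches"); neither comes from the original post.

```python
import re


def build_line_regex(damage_words):
    """Compile a line pattern whose damage group is built from damage_words."""
    # Longest entries first, so a longer entry wins over a shorter one that is its prefix.
    ordered = sorted(damage_words, key=len, reverse=True)
    # re.escape keeps spaces and any metacharacters in the entries literal.
    damage = "|".join(re.escape(word) for word in ordered)
    return re.compile(
        r"^(?P<attacker>Your|.+?['’]s)\s+"
        r"(?P<type>.+?)\s+"
        r"(?P<damage>" + damage + r")\s+"
        r"(?P<target>.+?)[.!]?$",
        re.IGNORECASE,
    )


# Hypothetical list; "barely scratches" is an invented multi-word example.
DAMAGE_WORDS = ["wounds", "decimates", "dismembers", "scratches", "barely scratches"]
LINE_RE = build_line_regex(DAMAGE_WORDS)

match = LINE_RE.match("Your punch barely scratches the orc.")
if match:
    print(match.group("type"), "|", match.group("damage"), "|", match.group("target"))
```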
My regex test application, set to process all matches, returns the following for your test data and my regex: