I’m working on a web app that uses scraping to harvest its data. I have run into a roadblock: I’m unsure how to write a regular expression to extract the data I need.
I need to extract the distance and grade from a string like the following.
"The Bet with the Tote 525 (A6) 525y"
The grade is the “A6” and the distance is the “525y”.
Every now and again, the string contains another set of brackets that needs to be ignored, for example:
"The Bet with the Tote (Starter race) Some more info (A6) 525y"
Here I need the second set of brackets. The grade and distance are always appended to the description, so they are always at the end of the string.
I have tried simply using substr() to take a fixed number of characters from the end of the string, but every now and again the distance is something like “525yH”, which throws the offsets out completely. For that reason, I would guess that a regular expression is the best option.
Any help greatly appreciated.
Dan
Extended Information
- The grade is always a minimum of 2 characters and a maximum of 3.
- The grade does not always consist of a letter and a number.
- Examples of grades:
  - “A1” through to “A10”
  - “T1” through to “T10”
  - “OR”
  - A number of other letter/number combinations
- Distance can be in either metres or yards.
- Distance is always a 3-digit integer followed by either “y” or “m”, except:
  - Sometimes the distance has an “H” on the end, which should be omitted.
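The rules above can be written as a single regular expression anchored to the end of the string. What follows is a minimal sketch in Python, not a definitive solution. The function name `parse_grade_distance` is hypothetical, and the pattern assumes that grades contain only letters and digits and that whitespace separates the grade from the distance, neither of which is guaranteed by the description above.

```python
import re

# Assumed format: "(GRADE) DISTANCE" at the very end of the description.
# GRADE is 2-3 letters/digits (assumption); DISTANCE is 3 digits plus
# "y" or "m", optionally followed by an "H" that is left out of the capture.
GRADE_DISTANCE = re.compile(r"\(([A-Za-z0-9]{2,3})\)\s+(\d{3}[ym])H?\s*$")


def parse_grade_distance(description):
    """Return (grade, distance), or None if the string does not match."""
    match = GRADE_DISTANCE.search(description)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_grade_distance("The Bet with the Tote 525 (A6) 525y"))
print(parse_grade_distance(
    "The Bet with the Tote (Starter race) Some more info (A6) 525y"))
```

Because the pattern is anchored with `$`, an earlier bracketed phrase such as “(Starter race)” is skipped: it is neither 2 to 3 characters long nor followed by a distance at the end of the string. The pattern uses no Python-specific syntax, so it should carry over largely unchanged to other PCRE-style regex engines.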
If the data pattern is fixed, why not use explode()?
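A split-based approach along those lines might look like the sketch below. It is written in Python, where `str.split()` plays a role similar to explode(), and it assumes the grade and distance are always the last two whitespace-separated tokens, as the question states. The function name `split_grade_distance` is hypothetical.

```python
def split_grade_distance(description):
    """Take the last two whitespace-separated tokens as grade and distance."""
    tokens = description.split()
    if len(tokens) < 2:
        raise ValueError("expected at least a grade and a distance")

    grade, distance = tokens[-2], tokens[-1]

    # Drop the surrounding brackets from the grade, e.g. "(A6)" -> "A6".
    grade = grade.strip("()")

    # Drop the optional trailing "H" from the distance, e.g. "525yH" -> "525y".
    if distance.endswith("H"):
        distance = distance[:-1]

    return grade, distance


print(split_grade_distance(
    "The Bet with the Tote (Starter race) Some more info (A6) 525y"))
```

Unlike a regular expression, this does no validation: if a description lacks the grade or distance, the last two words are returned regardless, so it is only suitable when the format really is fixed.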