I’m trying to parse product names that contain multiple abbreviations for sizes. For example, medium can be
m, medium, med
I tried a simple
preg_match('/m|medium|med/i',$prod_name,$matches);
which works fine for ‘product m xyz’. However, when I try ‘product s/m abc’ I’m getting a false-positive match.
I also tried
preg_match('/\bm\b|\bmedium\b|\bmed\b/i',$prod_name,$matches);
to force it to match only as a whole word, but the m in s/m is still being matched. I’m assuming this is because the engine treats ‘/’ in the name as a word delimiter?
So to sum up, I need to match ‘m’ in a string, but not ‘s/m’ or ‘small’, etc. Any help is appreciated.
You can use negative lookbehind and lookahead to exclude the offending separators. The idea is to match "m"/"med"/"medium" when it is its own word, but not preceded or followed by a slash or a dash. This also works at the beginning and end of the string, since negative lookahead/lookbehind do not force a matching character to be present.
If you only want to delimit on whitespace, you can use the positive version: match "m"/"med"/"medium" when it is preceded by whitespace or the start of the string, and followed by whitespace or the end of the string.
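The patterns themselves are not shown above, so here is a minimal sketch of what the two descriptions could look like in PHP. The constant and function names are hypothetical, and the exact patterns are an assumption reconstructed from the prose, not necessarily the ones originally posted.

```php
<?php
// Assumed reconstruction of the two patterns described above.
// "~" is used as the regex delimiter so "/" needs no escaping.

// Negative version: a whole word that is not directly preceded
// or followed by a slash or a dash.
const MEDIUM_NOT_JOINED = '~(?<![/-])\b(?:medium|med|m)\b(?![/-])~i';

// Positive version: preceded by whitespace or the start of the string,
// and followed by whitespace or the end of the string.
const MEDIUM_WS_DELIMITED = '~(?<=^|\s)(?:medium|med|m)(?=\s|$)~i';

// Hypothetical helper: true if the name contains a stand-alone medium
// marker that is not part of a slash- or dash-joined size like "s/m".
function isMediumNotJoined(string $name): bool
{
    return preg_match(MEDIUM_NOT_JOINED, $name) === 1;
}

// Hypothetical helper: true only if the marker is delimited by
// whitespace or the edges of the string.
function isMediumWhitespaceDelimited(string $name): bool
{
    return preg_match(MEDIUM_WS_DELIMITED, $name) === 1;
}
```

The two versions differ on punctuation other than slashes and dashes: for a name like 'product (m) abc' the negative version matches, because parentheses are not excluded, while the positive version does not, because the m is not surrounded by whitespace.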