I’m using PHP 5’s preg functions, if it makes a difference.
Consider the regular language matched by the following regular expression.
([^{}] | {[0-9a-zA-Z_]+})*
The language consists of strings of any number of characters, with special embedded tags marked off by left and right curly brackets, which contain a string of one or more alphanumeric or underscore characters. For example, the following is a valid string in the language:
asdfasdf 1243#$*#{A_123}asdf?{432U}
However, while validating a string with this regex, I would like to get a list of these curly-bracket-delimited tags and their positions in the string. Considering the previous example string, I’d like to have an array that tells me:
A_123: 20; 432U: 32
Is this possible with regular expressions? Or should I just write a function “by hand” without regexp that goes through every character of the string and parses out the data I need?
Forgive me if this is an elementary question; I’m just learning!
To capture the offsets, you can set the PREG_OFFSET_CAPTURE flag: http://php.net/manual/en/function.preg-match.php
You can run the following script yourself and see the results:
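A minimal sketch of such a script, assuming the pattern /\{(\w+)\}/ and preg_match_all with PREG_OFFSET_CAPTURE. The helper name find_tags is hypothetical and only there for illustration. It only extracts the tags; it does not validate the rest of the string.

```php
<?php
// Hypothetical helper: returns array(tag name, byte offset) pairs
// in order of appearance.
function find_tags($subject)
{
    // \{ and \} match literal braces; (\w+) captures the tag name.
    preg_match_all('/\{(\w+)\}/', $subject, $matches, PREG_OFFSET_CAPTURE);

    // With PREG_OFFSET_CAPTURE, every match is array(text, byte offset).
    // $matches[1] holds the captures of group 1: the names without braces.
    return $matches[1];
}

$subject = 'asdfasdf 1243#$*#{A_123}asdf?{432U}';
foreach (find_tags($subject) as $tag) {
    list($name, $offset) = $tag;
    echo $name, ': ', $offset, "\n";
}
```

The offsets are zero-based byte offsets of the tag name itself. If you want the position of the opening brace instead, read $matches[0] rather than $matches[1].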
On the pattern:
- \w is an escape sequence equivalent to the following character class: [a-zA-Z0-9_]
- (...) is used for grouping; it also creates a capturing group that can be referred to later as a backreference
- + is a quantifier that means “one or more” of the previous pattern

A good resource on regex: http://www.regular-expressions.info
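Since the question asks for validation and extraction together, here is one possible way to combine them, as a sketch only. The function name parse_tagged_string and the shape of the returned array are assumptions. It first checks the whole string against an anchored version of the question's regex, then collects the tags with their offsets.

```php
<?php
// Hypothetical helper: returns a list of array('name' => ..., 'offset' => ...)
// for a valid string, or false if the string is not in the language.
function parse_tagged_string($subject)
{
    // \A and \z anchor the match to the whole string. The possessive
    // quantifiers (++ and *+) keep the engine from backtracking on bad input.
    if (preg_match('/\A(?:[^{}]++|\{\w+\})*+\z/', $subject) !== 1) {
        return false;
    }

    // The string is valid, so every {...} it contains is a well-formed tag.
    preg_match_all('/\{(\w+)\}/', $subject, $matches, PREG_OFFSET_CAPTURE);

    $tags = array();
    foreach ($matches[1] as $m) {
        // $m[0] is the tag name, $m[1] its zero-based byte offset.
        $tags[] = array('name' => $m[0], 'offset' => $m[1]);
    }
    return $tags;
}

var_dump(parse_tagged_string('asdf{A_123}'));  // list with one tag
var_dump(parse_tagged_string('asdf{A_123'));   // false: unclosed brace
```

A list of name/offset pairs is used rather than a name-keyed array so that a tag appearing twice is reported twice.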