Using PHP, I am looking to extract an array from a string that contains a numbered list.
Example string:
The main points are: 1. This is point one. 2. This is point two. 3. This is point three.
would result in the following array:
[0] => 1. This is point one.
[1] => 2. This is point two.
[2] => 3. This is point three.
The format of the string can vary – e.g.:
1. This is point one, 2. This is point two, 3. This is point three.
1) This is point one 2) This is point two 3) This is point three
1 This is point one. 2 This is point two. 3 This is point three.
I have started using preg_match_all with the following pattern:
!((\d+)(\s+)?(\.?)(\)?)(-?)(\s+?)(\w+))!
but I am unsure how to match the rest of the string, up to the next match.
Example available at RegExr
If your input follows your example, in that no “point” contains a digit itself, you could use a regex that matches a leading number followed by everything up to the next digit.
In PHP, you could use preg_match_all() to capture every such match, which gives you one array element per point.
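A minimal sketch of this first approach follows. The pattern `\d+\D*` and the helper name `extractPointsNoDigits` are assumptions made for illustration, not necessarily the exact regex originally posted; the sketch only relies on the stated rule that no point contains a digit.

```php
<?php
// Hypothetical helper; the pattern is an illustrative assumption.
function extractPointsNoDigits(string $input): array
{
    // \d+  the leading number of a point
    // \D*  everything after it, up to (but not including) the next digit
    preg_match_all('/\d+\D*/', $input, $matches);

    // $matches[0] holds the full matches; trim the whitespace left between points.
    return array_map('trim', $matches[0]);
}

$input = 'The main points are: 1. This is point one. 2. This is point two. 3. This is point three.';
print_r(extractPointsNoDigits($input));
```

For the example string from the question, this should give the three entries listed there. With the comma-separated variant, each entry keeps its trailing comma, so you may want to strip that separately.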
Again though, if there are any numbers/digits in the actual points themselves – this won’t work.
If you want to allow digits inside each point, you’ll need to define an “anchor” or “end” for each point, such as a period. If you can guarantee that a period (.) appears only at the end of a point (ignoring the optional one that follows the leading number), you could use a regex that matches the leading number, its optional delimiter, and then everything up to and including the next period. It can be used with preg_match_all() just as easily.
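A sketch of this second, period-anchored approach is shown here. As before, the pattern `\d+[.)]?[^.]*\.` and the helper name `extractPointsWithPeriodAnchor` are assumptions for illustration rather than the regex originally posted; the comments explain each part of the pattern.

```php
<?php
// Hypothetical helper; the pattern is an illustrative assumption.
function extractPointsWithPeriodAnchor(string $input): array
{
    // \d+    the leading number of a point
    // [.)]?  an optional delimiter right after the number: "." or ")"
    // [^.]*  the text of the point: anything except a period (digits allowed)
    // \.     the period that closes the point
    preg_match_all('/\d+[.)]?[^.]*\./', $input, $matches);

    // Each match starts at the leading number and ends at the closing period.
    return $matches[0];
}

$input = '1. Buy 2 apples. 2. Walk 3 miles. 3. Sleep 8 hours.';
print_r(extractPointsWithPeriodAnchor($input));
```

Because every match must end at a period, inputs whose points end with a comma or with nothing at all (such as the `1) ...` variant from the question) will not split correctly with this pattern.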
The caveat with the second regex is that a period (.) may only appear at the end of each point (and after the leading number). However, I think this rule may be easier to follow than the “no numbers in the point” rule. It all depends on your actual input, though.