I want to extract a string from another using JavaScript / RegExp.
Here is what I got:
var string = "wp-button wp-image-45 wp-label";
string.match(/(?:(?:.*)?\s+)?(wp-image-([0-9]+))(:?\s(?:.*)?)?/);
// returns: ["wp-button ", "wp-image-45", "45", undefined]
I just want to have “wp-image-45”, so:
- (Optional) any character
- (Optional) followed by whitespace
- (Required) followed by “wp-image-”
- (Required) followed by any number
- (Optional) followed by whitespace
- (Optional) followed by any character
What is missing here? Is it just some kind of bracketing or more?
I also tried
string.match(/(?:(?:.*)?\s+)?(?=(wp-image-([0-9]+)))(?=(:?\s(?:.*)?)?)/)
Edit: In the end I just want to have the number, but I’d also like to get this intermediate step working.
Regexps are not required to start matching at the beginning of the string, so your attempts to match whitespace and any character aren’t necessary. Also, “any character” includes whitespace (except newlines in certain modes).
This should be all you need:
This will capture, for example, “wp-image-123” into matching group 0, and “123” into matching group 1.
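A minimal sketch of a pattern that fits this description, assuming the form /\bwp-image-(\d+)\b/. The helper name extractImageId is made up for illustration:

```javascript
// Assumed pattern: word boundary, the literal "wp-image-", one or more
// digits (captured as group 1), then another word boundary.
const wpImagePattern = /\bwp-image-(\d+)\b/;

// Hypothetical helper: returns the matched class name and the numeric id,
// or null when the input contains no "wp-image-<number>" token.
function extractImageId(classNames) {
  const match = classNames.match(wpImagePattern);
  if (match === null) {
    return null;
  }
  // match[0] is the whole match, match[1] is the first capture group.
  return { className: match[0], id: Number(match[1]) };
}

const result = extractImageId("wp-button wp-image-45 wp-label");
console.log(result); // { className: 'wp-image-45', id: 45 }
```

Because the pattern is not anchored, nothing has to be written for the text before or after the token; the engine simply finds the first place where it matches.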
\b means “word boundary”, which ensures that you won’t match “abcwp-image-123def”. A word boundary is any position where a non-word character is followed by a word character, or vice versa (the start and end of the string count as non-word positions). A word character is a letter, a digit, or an underscore.
Also, I used \d instead of [0-9] simply out of convenience. In some regex flavors the two differ slightly (\d can also match digits from other scripts), but in JavaScript they are equivalent, so it makes no difference in your case.
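To see what the word boundaries do in practice, here is a small sketch that runs a few sample strings against a pattern of the assumed form /\bwp-image-(\d+)\b/. The sample strings are invented for illustration:

```javascript
// Assumed pattern, using \b on both sides of the token.
const boundedPattern = /\bwp-image-(\d+)\b/;

const samples = [
  "wp-image-123",        // the token on its own
  "abcwp-image-123def",  // letters glued on both sides
  "wp-image-123def",     // letters glued on after the digits
  "x wp-image-7 y",      // the token surrounded by spaces
];

const outcomes = samples.map((s) => boundedPattern.test(s));
console.log(outcomes); // [ true, false, false, true ]

// Caveat: "-" is a non-word character, so a boundary exists between "-" and "w".
console.log(boundedPattern.test("my-wp-image-5")); // true

// In JavaScript, \d matches only the ASCII digits 0-9.
console.log(/^\d$/.test("\u0663")); // false (ARABIC-INDIC DIGIT THREE)
```

If a hyphenated prefix such as "my-wp-image-5" must be rejected as well, \b alone is not enough; matching on whitespace or string edges instead would be one way to tighten it.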