I have a string with some HTML code in, for example: This is <strong

Question

Asked: May 25, 2026 (2026-05-25T18:43:23+00:00)

I have a string with some HTML code in, for example:

This is <strong id="c1-id-8">some</strong> <em id="c1-id-9">text</em>

I need to strip out the id attribute from every HTML tag, but I have zero experience with regular expressions, so I searched around the internet and wrote this pattern: [\s]+id=\".*\"

Unfortunately, it’s not working as I expected. I was hoping the regular expression would match id=" followed by any characters, repeated any number of times, and terminated by the nearest double quote. In this example I expected it to match id="c1-id-8" and id="c1-id-9".
Instead, the pattern returned the substring id="c1-id-8">some</strong> <em id="c1-id-9": it finds the first occurrence of id=" and the last occurrence of a double quote character.

Could you tell me what is wrong in my pattern and how to fix it, please?
Thank you very much

1 Answer

Editorial Team · Answer 1 · 2026-05-25T18:43:24+00:00

The quantifier .* in your regex is greedy, meaning it matches as much as it can, so it runs on to the last double quote it can reach. To match only up to the nearest closing quote, you could use something like /\s+id=\"[^\"]*\"/. The brackets [] indicate a character class, which matches any single character listed inside the brackets. The caret ^ at the beginning of the character class is a negation, meaning the class matches any character except those specified in the brackets. Here [^\"]* matches a run of characters that are not double quotes, so the match has to stop at the first closing quote.
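The question does not say which language is in use, so the following is only a sketch in Python (an assumption) using the standard `re` module. It compares the greedy pattern from the question with the negated character class on the sample string; the variable names are illustrative.

```python
import re

# Sample string taken from the question.
html = 'This is <strong id="c1-id-8">some</strong> <em id="c1-id-9">text</em>'

# Greedy: .* extends to the last double quote it can reach,
# so both id attributes end up inside a single match.
greedy = re.findall(r'\s+id=".*"', html)

# Negated character class: [^"]* cannot cross a double quote,
# so each match ends at the nearest closing quote.
negated = re.findall(r'\s+id="[^"]*"', html)

# Remove every id attribute, together with the whitespace before it.
stripped = re.sub(r'\s+id="[^"]*"', '', html)

print(greedy)    # one long match spanning both tags
print(negated)   # two short matches, one per id attribute
print(stripped)  # This is <strong>some</strong> <em>text</em>
```

Inside a Python raw string delimited by single quotes, the double quotes need no backslash escaping; the \" in the pattern above is only needed when the host language requires it.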

An alternative is to make the quantifier lazy by changing .* to .*?, which matches as little as it can.
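As a rough illustration of the lazy form, here is a minimal sketch in Python (the language is an assumption, since the question does not name one) applied to the sample string from the question.

```python
import re

# Sample string taken from the question.
html = 'This is <strong id="c1-id-8">some</strong> <em id="c1-id-9">text</em>'

# Lazy: .*? expands one character at a time and stops as soon as
# the following double quote can match.
lazy = re.findall(r'\s+id=".*?"', html)

# The same lazy pattern used to strip the attributes.
stripped_lazy = re.sub(r'\s+id=".*?"', '', html)

print(lazy)           # [' id="c1-id-8"', ' id="c1-id-9"']
print(stripped_lazy)  # This is <strong>some</strong> <em>text</em>
```

One difference worth keeping in mind: in Python, . does not match a newline unless re.DOTALL is set, while [^"]* does, so the two forms can behave differently on attribute values that span lines.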
