I need a regular expression that can be used with replaceAll to replace all HTML tags with an empty string, except any variation of br, so that line breaks are preserved.
I found the following to replace all HTML tags:
<\s*br\s*\[^>]
You might get some answers that claim to work.
Those answers might even work for the particular cases you try them against.
But know that regular expressions (which I’m fond of in general) are the wrong tool for the job in this case.
And as your project evolves and needs to cover more complex HTML inputs, the regular expression will get more and more convoluted, and there may well come a time when it simply cannot solve your problem anymore, period.
Do it the right way from the beginning. Use an HTML parser, not a regex.
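To make that concrete, here is a minimal sketch of the parser approach, assuming Python and its standard-library `html.parser` module (the question does not name a language, so that choice is mine). The names `KeepBrStripper` and `strip_tags_keep_br` are hypothetical, made up for this illustration. It keeps the text content, normalizes every variation of br to `<br>`, and drops every other tag.

```python
from html.parser import HTMLParser


class KeepBrStripper(HTMLParser):
    """Collects text content and drops every tag except <br>."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; into plain text.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag names, so <BR> and <Br class="x"> arrive as "br".
        if tag == "br":
            self.parts.append("<br>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing forms such as <br/> and <br /> are routed here.
        if tag == "br":
            self.parts.append("<br>")

    def handle_data(self, data):
        # Text between tags is kept as-is.
        self.parts.append(data)


def strip_tags_keep_br(html):
    """Return the text of `html` with all tags removed except br."""
    parser = KeepBrStripper()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


if __name__ == "__main__":
    print(strip_tags_keep_br("<p>Hello <b>world</b><br/>next<BR >line</p>"))
```

Two caveats with this sketch: because entities are decoded, the output should be escaped again if it is going back into an HTML page, and the contents of script and style elements are passed through as text unless you add handling for them.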
For reference, here are some related SO posts: