How would I define a regex that matches a repeating pattern like
text HTMLTag text HTMLTag text HTMLTag ...
Basically, a unit pattern is 'text HTMLTag', whose two parts can be fetched using $1 and $2.
An example input would be
abarelixx is a sample data for spellchecking<img src="Randomz" alt="Randomz Image">Randomz is the name of the image</img>Bigboss<img src="Randomz" alt="Randomz Image">Randomz is the name of the image</img>this is another text string
This needs to be broken down into text HTMLTag ... units, and if the text or the HTMLTag part of a unit is missing, it should return "" for that part.
I found a decent solution for this problem. Prepend a '>' to the string and append a '<' to the end, then use a pattern like re = /([>])([^<]+)([<])/g. $2 then holds each run of text content.
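As a rough sketch of that padding trick in JavaScript (the function name extractText is made up for illustration, and this assumes no attribute value contains a literal '<' or '>'):

```javascript
// Hypothetical helper, not part of any library.
function extractText(input) {
  // Pad the string so that leading and trailing text is also
  // enclosed between a '>' and a '<'.
  const padded = ">" + input + "<";
  // Same pattern as in the post; group 2 is the text between tags.
  const re = /([>])([^<]+)([<])/g;
  const texts = [];
  let match;
  while ((match = re.exec(padded)) !== null) {
    texts.push(match[2]);
  }
  return texts;
}

console.log(extractText('abc<img src="x">def</img>ghi'));
// → [ 'abc', 'def', 'ghi' ]
```

Note that text sitting between an opening and a closing tag (like "def" above) is returned along with the text outside the tags, since the pattern only looks at what lies between a '>' and the next '<'.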
And you can of course use a normal HTML tag pattern to get the HTML tags.
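If you want the 'text HTMLTag' units directly, with "" for a missing part, something along these lines might work. This is only a sketch: splitUnits is a made-up name, the tag pattern <[^>]*> is a simplification that assumes well-formed tags without '>' inside attribute values, and it needs a runtime that supports String.prototype.matchAll.

```javascript
// Hypothetical helper, not part of any library.
function splitUnits(input) {
  // Group 1: a run of text (possibly empty).
  // Group 2: an optional tag following that text.
  const re = /([^<]*)(<[^>]*>)?/g;
  const units = [];
  for (const m of input.matchAll(re)) {
    // The pattern can match the empty string (e.g. at the very end);
    // skip those matches, they carry no unit.
    if (m[0] === "") continue;
    // A missing tag shows up as undefined; turn it into "".
    units.push([m[1], m[2] || ""]);
  }
  return units;
}

console.log(splitUnits('abc<img src="x">def</img>ghi'));
// → [ [ 'abc', '<img src="x">' ], [ 'def', '</img>' ], [ 'ghi', '' ] ]
```

Each unit is a [text, tag] pair, so an input that starts with a tag gives "" as the first text, and trailing text with no tag after it gives "" as the last tag.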