Team:
I need some help with some regular expressions. The goal is to be able to identify three different ways that users might express links in a note, and those are as follows.
<a href="http://www.msn.com">MSN</a>
possibilities
http://www.msn.com OR
https://www.msn.com OR
www.msn.com
Once I can find them, I can change each one into a real A tag as necessary. I realize the first example is already an A tag, but I need to add some attributes to it specific to our application, such as TARGET and ONCLICK.
Now, I have regular expressions that can find each one of those individually, and those are as follows, respective to the examples above.
<a?\w+((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*)/?>
(http|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?
[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?
But the problem is that I can’t run all of them on the string because the second one will match a part of the first one and the third one will match a part of both the first and second. At any rate — I need to be able to find the three permutations distinctly so I can replace each one of them individually — because the third expression for example will need http:// added to it.
I look forward to everybody's assistance!
Assuming that the link starts or ends either with a space or at the beginning/end of a line (or inside an existing A tag), I came up with the following code, which also includes some sample texts:
As this code uses numbered groups, it should be possible to use the regular expression in JavaScript too.
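As a rough illustration of the numbered-group idea, here is a minimal Python sketch. It is not the regex from this answer: the pattern `LINK_RE`, the helper name `linkify`, and the choice of `target="_blank"` are all assumptions made for the example. It relies on the same assumption stated above, namely that a bare link is delimited by whitespace or the start/end of the text, so trailing punctuation directly attached to a URL would be swallowed.

```python
import re

# Hypothetical combined pattern (an assumption, not the original answer's regex).
# Alternatives are tried left to right at each position, so a whole A tag is
# consumed before the URL inside it can be seen by the other two branches.
LINK_RE = re.compile(
    r"""(<a\b[^>]*>.*?</a>)            # group 1: an existing A tag, matched whole
      | (?<!\S)(https?://[^\s<>"]+)    # group 2: URL with scheme, after whitespace or start
      | (?<!\S)(www\.[^\s<>"]+)        # group 3: bare www. host, same boundary rule
    """,
    re.IGNORECASE | re.DOTALL | re.VERBOSE,
)


def linkify(note):
    """Wrap bare URLs in A tags; existing A tags are passed through unchanged."""

    def repl(match):
        if match.group(1):
            # Existing A tag: returned as-is in this sketch.
            return match.group(1)
        if match.group(2):
            url = text = match.group(2)
        else:
            # Bare www. link: the scheme has to be added for the href.
            text = match.group(3)
            url = "http://" + text
        return '<a href="{0}" target="_blank">{1}</a>'.format(url, text)

    return LINK_RE.sub(repl, note)


sample = ('See <a href="http://www.msn.com">MSN</a> or http://www.msn.com '
          'or https://www.msn.com or www.msn.com')
print(linkify(sample))
```

Because exactly one of the three groups is set for any match, the replacement callback can tell the cases apart by checking which group is non-empty, which is what lets each form be rewritten differently in a single pass.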
Depending on what you need to do with the existing A tag, you will need to parse that first group as well.
Update:
Modified the regex as requested so that the link text becomes group no. 4.
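For the existing A tag case, a small Python sketch of a pattern in which the link text lands in group 4 might look like the following. The pattern `ANCHOR_RE`, the helper `decorate_anchor`, and the `trackLink` handler name are hypothetical and chosen only for illustration; this is not the modified regex mentioned above. Note that this sketch rebuilds the tag from the URL and text alone, so any other attributes on the original tag are dropped.

```python
import re

# Hypothetical group layout (an assumption for this example):
#   1 = attributes before href, 2 = quote character, 3 = URL, 4 = link text
ANCHOR_RE = re.compile(
    r'<a\b([^>]*?)\bhref\s*=\s*(["\'])(.*?)\2[^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def decorate_anchor(html):
    """Rebuild each A tag with target and onclick attributes added."""

    def repl(match):
        url = match.group(3)
        text = match.group(4)
        # trackLink is a placeholder name for an application-specific handler.
        return ('<a href="{0}" target="_blank" '
                'onclick="trackLink(this)">{1}</a>').format(url, text)

    return ANCHOR_RE.sub(repl, html)


print(decorate_anchor('<a href="http://www.msn.com">MSN</a>'))
```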
Update 2:
To better catch malformed links you might try this modified version: