I have a JS function which is passed a string that a RegEx is run against, and returns any matches:
searchText= // some string which may or may not contain URLs
Rxp= new RegExp("([a-zA-Z\d]+://)?(\w+:\w+@)?([a-zA-Z\d.-]+\.[A-Za-z]{2,4})(:\d+)?(/.*)?/ig")
return searchText.match(Rxp);
The RegExp should return matches for any of the following (and similar derivations):
- google.com
- www.google.com
- http://www.google.com
- http://google.com
- google.com?querystring=value
- www.google.com?querystring=value
- http://www.google.com?querystring=value
- http://google.com?querystring=value
However, no such luck. Any suggestions?
In a string, `\` has to be escaped: `\\`.
First, the string is interpreted. `\w` turns into `w`, because that escape has no special meaning in a string literal. Then, the parsed string is turned into a RegEx. But the `\` was lost during string parsing, so your RegEx breaks.
Instead of using the `RegExp` constructor, use a RegEx literal. In a literal, every `/` inside the pattern has to be escaped as `\/`, and the flags (`ig`) go after the closing slash rather than inside the pattern.
If you're not 100% sure that the input is a string, it's better to use the `exec` method, which coerces the argument to a string.
The pattern can also be extended to include the query string and URL fragment.
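The points above can be sketched roughly as follows. This is only an illustration: `urlPattern` is the pattern from the question rewritten as a literal, while `firstUrl` and `urlWithQueryPattern` are hypothetical names, and the extra groups for the query string and fragment are one possible way to extend the pattern, not necessarily the best one.

```javascript
// 1. String parsing drops the backslash of unrecognized escapes.
const fromString = "\w+\d";   // the string now holds: w+d
const escaped = "\\w+\\d";    // the string now holds: \w+\d

// 2. A regex literal is not string-parsed, so \w and \d survive.
//    Each "/" in the pattern is escaped, and the flags follow the closing slash.
const urlPattern =
  /([a-zA-Z\d]+:\/\/)?(\w+:\w+@)?([a-zA-Z\d.-]+\.[A-Za-z]{2,4})(:\d+)?(\/.*)?/ig;

// 3. exec coerces its argument to a string, so non-string input does not throw.
//    (Hypothetical helper.) With the g flag, exec resumes from lastIndex,
//    so reset it before each independent search.
function firstUrl(input) {
  urlPattern.lastIndex = 0;
  return urlPattern.exec(input);
}

// 4. One possible extension (an assumption, not a definitive pattern):
//    separate optional groups for the path, the query string, and the fragment.
const urlWithQueryPattern =
  /([a-zA-Z\d]+:\/\/)?(\w+:\w+@)?([a-zA-Z\d.-]+\.[A-Za-z]{2,4})(:\d+)?(\/[^?#\s]*)?(\?[^#\s]*)?(#\S*)?/ig;

const text = "see google.com?querystring=value and http://www.google.com/path#frag now";
console.log(text.match(urlWithQueryPattern));
```

Note that `exec` and `match` are not interchangeable when the `g` flag is set: `match` returns an array of all full matches, while `exec` returns one match at a time together with its capture groups.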