I’m learning and practicing regex. I’ve written a pattern to test for URLs, and I want it to catch URLs in these formats:
www.site.com
www.site.co.uk etc
site.com
play.site.com
So I’ve written this:
(http:\/\/)*(www)*\.*(\w{2,})(\.{1})(\w{2,3})(\.*)(\w{2,3})*
(match http:// 0 or more times, followed by www 0 or more times and any number of periods, followed by a domain name of at least 2 characters, followed by a period, followed by some more characters (at least 2, max 3), then followed by an optional period and some more chars (for co.uk etc).)
I’m very new to regex, so I’m not sure whether there are problems with what I’ve done, but it seems to work well in testing here: http://regexpal.com/ . Feel free to rip it apart!
The one thing I’ve noticed is that it does match .site.com which I don’t want. How can I match just site.com and still allow for http:// and www and subdomains?
Put the “.” inside the “www” group, for example (www\.)? in place of (www)*\.* ; the dot can then only match when it directly follows www, which resolves the problem of a leading dot matching in your URL.
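Here is a rough sketch of that idea in Python, in case it helps to see it run. Several things in it are assumptions rather than part of the suggestion above: the names ORIGINAL, REVISED and is_url are made up for illustration, the whole string is checked with re.fullmatch, the * quantifiers on http:// and www are swapped for ?, and the fixed middle groups are folded into one repeated "label plus dot" group so that play.site.com and site.co.uk both pass.

```python
import re

# The pattern from the question, unchanged. The free-standing \.* is what
# lets a leading dot slip through.
ORIGINAL = re.compile(r"(http:\/\/)*(www)*\.*(\w{2,})(\.{1})(\w{2,3})(\.*)(\w{2,3})*")

# Hypothetical revision: the dot now lives inside the optional "www." group,
# and every further label must be followed by its own dot.
REVISED = re.compile(r"(http://)?(www\.)?(\w{2,}\.)+\w{2,3}")


def is_url(text):
    """Return True if the whole string fits the revised pattern."""
    return REVISED.fullmatch(text) is not None


# The original pattern happily matches the unwanted leading dot.
print(ORIGINAL.search(".site.com").group(0))

# The revised pattern accepts the wanted formats and rejects the leading dot.
for candidate in ["www.site.com", "www.site.co.uk", "site.com",
                  "play.site.com", "http://www.site.com", ".site.com"]:
    print(candidate, is_url(candidate))
```

Only .site.com should come back False. Note that \w also accepts digits and underscores, so this is still a loose check rather than a strict hostname validator.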