I was wondering how I would best validate URLs in Rails. I was thinking of using a regular expression, but am not sure if this is the best practice.
And, if I were to use a regex, could someone suggest one to me? I am still new to Regex.
Validating a URL is a tricky job. It’s also a very broad request.
What do you want to do, exactly? Do you want to validate the format of the URL, the existence, or what? There are several possibilities, depending on what you want to do.
A regular expression can validate the format of the URL. But even a complex regular expression cannot ensure you are dealing with a valid URL.
For instance, if you take a simple regular expression, it will probably reject the following host
but it will allow
that is a valid host, but not a valid domain if you consider the existing TLDs. Indeed, the solution would work if you want to validate the hostname, not the domain because the following one is a valid hostname
as well as the following one
Now, let me give you some solutions.
If you want to validate a domain, then you need to forget about regular expressions. The best solution available at the moment is the Public Suffix List, a list maintained by Mozilla. I created a Ruby library to parse and validate domains against the Public Suffix List, and it’s called PublicSuffix.
If you want to validate the format of a URI/URL, then you might want to use regular expressions. Instead of searching for one, use the built-in Ruby URI.parse method.
You can even decide to make it more restrictive. For instance, if you want the URL to be an HTTP/HTTPS URL, then you can make the validation more accurate.
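The two checks described above could look roughly like the following. This is a minimal sketch rather than the original code: the method names valid_url? and valid_http_url? are made up for illustration, and the host test is written in plain Ruby so that it works without Rails.

```ruby
require "uri"

# Format check only: true when +url+ parses and has a non-empty host.
def valid_url?(url)
  uri = URI.parse(url)
  !uri.host.nil? && !uri.host.empty?
rescue URI::InvalidURIError
  false
end

# Stricter check: additionally requires the http or https scheme.
# URI::HTTPS is a subclass of URI::HTTP, so is_a?(URI::HTTP) covers both.
def valid_http_url?(url)
  uri = URI.parse(url)
  uri.is_a?(URI::HTTP) && !uri.host.nil? && !uri.host.empty?
rescue URI::InvalidURIError
  false
end
```

With these definitions, valid_http_url?("https://example.com") is true, while valid_http_url?("ftp://example.com") is false because the parsed object is a URI::FTP, not a URI::HTTP.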
Of course, there are tons of improvements you can apply to this method, including checking for a path or a scheme.
Last but not least, you can also package this code into a validator:
Note for newer URI versions (i.e. 0.12.1):
.present? / .blank? would be a more accurate way to validate hosts, instead of using uri.host.nil? or just if uri.host as was done previously (i.e. URI v0.11).
Example for URI.parse("https:///394"):
host will return an empty string, and /394 becomes a path. #<URI::HTTPS https:///394>
host will return an empty string, and /394 becomes a path too. #<URI::HTTPS https:/394>
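Outside Rails, where the ActiveSupport helpers present? and blank? are not available, roughly the same host check can be written in plain Ruby. The sketch below uses a hypothetical helper name, host_missing?, and treats both nil and the empty string as a missing host, so it does not depend on which value a given URI version returns.

```ruby
require "uri"

# Hypothetical helper: true when the parsed URL has no usable host.
# Covers both a nil host and an empty-string host.
def host_missing?(url)
  uri = URI.parse(url)
  uri.host.nil? || uri.host.empty?
rescue URI::InvalidURIError
  # An unparsable string has no host either.
  true
end
```

For the example above, host_missing?("https:///394") is true, while host_missing?("https://example.com/394") is false.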