I’ve written a regex to help validate a String for game character names. It’s somehow passing seemingly invalid strings and not passing seemingly valid strings.
Requirements:
- Starts with a capital letter
- Has any number of alphanumeric characters after that (this includes spaces)
This is the Rails code that does the validation in the Character model:
validates :name, format: { with: %r{[A-Z][a-zA-Z0-9\s]*} }
Here’s the unit test I’m using:
test "character name should be properly formatted and does not contain any special characters" do
  character = get_valid_character
  assert character.valid?

  character.name = "aBcd"
  assert character.invalid?, "#{character.name} should be invalid"

  character.name = "Number 1"
  assert character.valid?, "#{character.name} should be valid"

  character.name = "McDonalds"
  assert character.valid?, "#{character.name} should be valid"

  character.name = "Abcd."
  assert character.invalid?, "#{character.name} should be invalid"

  character.name = "Abcd%"
  assert character.invalid?, "#{character.name} should be invalid"
end
The problems:
The regex passes “aBcd”, “Abcd.”, and “Abcd%” when it shouldn’t. I believe the pattern itself is right, because I tested it in Python and it behaves just as you would expect.
What gives?
Thank you for your help!
Regular expressions look for matches anywhere in the given string unless told otherwise.
So the test string 'aBcd' is invalid, but it contains a valid substring: 'Bcd'. Same with 'Abcd%', where the valid substring is 'Abcd'.
If you want to match the entire string, use this as your regex:
/\A[A-Z][a-zA-Z0-9\s]*\z/
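To see the substring behaviour directly, here is a small plain-Ruby sketch (no Rails needed). The constant names are made up for illustration, and the anchored pattern is assumed to be the question's pattern wrapped in \A and \z:

```ruby
# The pattern from the question, with and without string anchors.
UNANCHORED = /[A-Z][a-zA-Z0-9\s]*/
ANCHORED   = /\A[A-Z][a-zA-Z0-9\s]*\z/

names = ["aBcd", "Number 1", "McDonalds", "Abcd.", "Abcd%"]

names.each do |name|
  loose  = name.match?(UNANCHORED) # true if ANY substring matches
  strict = name.match?(ANCHORED)   # true only if the WHOLE string matches
  puts "#{name.inspect}: unanchored=#{loose} anchored=#{strict}"
end

# The substring the unanchored pattern actually latched onto:
puts "aBcd"[UNANCHORED]  # prints Bcd
puts "Abcd%"[UNANCHORED] # prints Abcd
```

Every name passes the unanchored check, while only "Number 1" and "McDonalds" pass the anchored one, which is what the unit test expects.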
PS: Some people will say to match the beginning of a string with ^ and the end with $. In Ruby, those symbols match the beginning and end of a line, not a string. So "ABCD\n%" would still match if you used ^ and $, but won’t match if you use \A and \z. See the Rails security guide for more on this.
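The line-versus-string difference can be checked with a few lines of plain Ruby. This is only a sketch, and the constant names are hypothetical:

```ruby
# Same character rules, two kinds of anchors.
LINE_ANCHORED   = /^[A-Z][a-zA-Z0-9\s]*$/   # ^ and $ anchor to a line
STRING_ANCHORED = /\A[A-Z][a-zA-Z0-9\s]*\z/ # \A and \z anchor to the whole string

input = "ABCD\n%"

# The first line, "ABCD", satisfies the line-anchored pattern on its own.
puts input.match?(LINE_ANCHORED)   # prints true

# The whole string, including "\n%", has to match here, and "%" is not allowed.
puts input.match?(STRING_ANCHORED) # prints false
```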