I’m having trouble figuring out a Python regex for Django URLs. I have a set of criteria, but can’t seem to come up with the magic formula. The goal is to identify which pages are CMS pages and pass the alias URL to the Django view function that should load it.
Here are some examples of valid strings which would match:
- about-us
- contact-us
- terms-and-conditions
- info/learn-more-pg2
- info/my-example-url
Criteria:
- Must be all lowercase
- Must contain a dash “-”
- Can contain numbers, letters and a slash “/”
- Must be at least 4 characters long and a max of 30 characters
- Cannot contain special characters
- Cannot contain the words:
  - .jpg
  - .gif
  - .png
  - .css
  - .js
Examples which should not match:
- About-Us (has upper case)
- contactus (doesn’t have a dash)
- pg (less than 4 characters)
- img/bg.gif (contains “.gif”)
- files/my-styles.css (contains “.css”)
- my-page@ (has a character other than letters, numbers, dash or slash)
I know this isn’t even close yet, but this is as far as I’ve gotten:
(?P<alias>([a-z/-]{4,30}))
I apologize for having large requirements, but I just can’t get my head wrapped around this regex stuff.
Thanks!
I’m puzzled as to why several of the commenters find this hard to do in a regex. This is exactly what regular expressions are good at.
It is true, however, that since the “.” isn’t among the allowed characters, the check for the forbidden file extensions is not really necessary.
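As a minimal sketch of how the criteria could be expressed, assuming the allowed characters are exactly lowercase letters, digits, dash and slash: a lookahead requires at least one dash, and a single character class with a length bound covers the rest. The names ALIAS_RE and is_cms_alias are hypothetical and only used for illustration.

```python
import re

# Hypothetical names; this is one possible reading of the criteria above.
ALIAS_RE = re.compile(
    r"(?P<alias>"
    r"(?=[^-]*-)"        # lookahead: a dash must appear somewhere ahead
    r"[a-z0-9/-]{4,30}"  # 4 to 30 lowercase letters, digits, slashes or dashes
    r")"
)


def is_cms_alias(path):
    """Return True when the whole path satisfies the criteria."""
    # fullmatch requires the entire string to match, so no ^ or $ is needed here.
    return ALIAS_RE.fullmatch(path) is not None


if __name__ == "__main__":
    samples = [
        "about-us", "info/learn-more-pg2",            # from the valid examples
        "About-Us", "contactus", "pg", "img/bg.gif",  # from the invalid examples
        "files/my-styles.css", "my-page@",
    ]
    for sample in samples:
        print(sample, is_cms_alias(sample))
```

Because the character class has no “.”, names such as img/bg.gif are rejected without a separate extension check. In a URLconf the same pattern would probably need explicit ^ and $ anchors around it, since it would not be applied through fullmatch there.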