The Django documentation gives this example pattern for an article URL:
(r'^articles/(\d{4})/(\d{2})/(\d+)/$', 'news.views.article_detail'),
So only 2011/05/23/ will match, but not 2011/5/23/.
In another part of the docs, where the permalink decorator is explained, the pattern is:
(r'/archive/(?P<year>\d{4})/(?P<month>\d{1,2})/(?P<day>\d{1,2})/$', archive_view)
And the code for creating a permalink is:
@models.permalink
def get_absolute_url(self):
    return ('archive_view', (), {
        'year': self.created.year,
        'month': self.created.month,
        'day': self.created.day})
In particular, month has changed from \d{2} to \d{1,2}, so 2011/05/23/ and 2011/5/23/ will now both match; the get_absolute_url method will create the second link, without a leading zero.
To create a permalink for the first regex (\d{2}), I could write str(self.created.month).zfill(2) in the method, but this seems cumbersome and redundant to me: if I change the URLconf, I will need to change the get_absolute_url method, too.
Additionally, we now have multiple URLs which all display the same content (2011/05/03/, 2011/5/03/, 2011/05/3/, etc.). Could that be a problem, e.g. for search engines? At the very least it can result in inconsistent URLs.
Is there a (simple) way to redirect all URLs to the zero-filled ones (2011/5/3/ -> 2011/05/03/) and also always build them zero-filled automatically, so I don’t need to mess around with str() and zfill in methods like get_absolute_url and can just pass the number?
There’s no way that I know of to get Django to automatically zero fill numbers passed in as parameters to a URL, other than the way you are already doing it.
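If you stay with that approach, the padding can at least live in one place. Here is a minimal sketch, assuming a hypothetical helper named date_url_kwargs (not part of Django) that formats the date parts as zero-padded strings:

```python
from datetime import date


def date_url_kwargs(d):
    """Return zero-padded URL keyword arguments for a date.

    Hypothetical helper, not a Django API: it only formats the parts
    of the date as strings so that \d{4}/\d{2}/\d{2} patterns match.
    """
    return {
        'year': '{:04d}'.format(d.year),
        'month': '{:02d}'.format(d.month),
        'day': '{:02d}'.format(d.day),
    }


# Example: the 3rd of May 2011 becomes year '2011', month '05', day '03'.
kwargs = date_url_kwargs(date(2011, 5, 3))
```

get_absolute_url could then return ('archive_view', (), date_url_kwargs(self.created)), so a change to the padding rules only touches this one function.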
You could relax the regex to not require the zero, as you described, which would create a duplicate content issue. However, @Matt fails to consider that the content has to actually be exposed at both URLs for search engines to consider it duplicate. More likely than not, all URLs on your site would be composed from either reverse (or the models.permalink decorator around get_absolute_url) or the {% url %} template tag. Therefore, all the URLs would be in the same format, i.e. without zeros, and the zero-filled version would never even be seen by search engines.
Additionally, you can make use of the canonical tag to let search engines know that the content is not duplicate, but merely available through multiple URLs.
So search engines are nothing to be concerned about.
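If you would still like the unpadded forms to redirect to the zero-filled ones, as the question asks, one option is to compute the canonical path yourself. This is only a sketch: the function name canonical_archive_path is hypothetical, and the /archive/ prefix is assumed from the pattern in the question.

```python
import re

# Relaxed pattern from the question: month and day may have one or two digits.
ARCHIVE_RE = re.compile(r'^/archive/(\d{4})/(\d{1,2})/(\d{1,2})/$')


def canonical_archive_path(path):
    """Return the zero-padded form of path, or None if no redirect is needed.

    None is returned both when the path is not an archive URL and when
    it is already in the zero-padded form.
    """
    match = ARCHIVE_RE.match(path)
    if match is None:
        return None
    year, month, day = match.groups()
    canonical = '/archive/%s/%s/%s/' % (year, month.zfill(2), day.zfill(2))
    return canonical if canonical != path else None
```

Inside the view, you could call canonical_archive_path(request.path) and, when it returns a value, respond with django.http.HttpResponsePermanentRedirect pointing at that value instead of rendering the page.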