I’m about to setup permalinks on a website and I’m wondering what sort of conventions I should follow in regards to generating them from the existing table of news articles.
So far here are the rules/steps I’ve come up with:
- Create a dump file of my live database, set it up locally
- Add a new ‘permalink’ column to my news_articles table with
ALTER table ADD column slug VARCHAR(100)( Not sure on the data type or length, suggestions? ) - Update every row in my news_articles table, such that
the permalink column by default is replaced by the title, after the title goes through some formatting:
Disclaimer: the titles are Russian
Cyrillic and not English so I don’t
think I can get away with an easy
regex replacement, unless perhaps I
rely on unicode code ranges except I’m
not familiar with the range of the
Cyrillic characters so I’ll do manual
replacement and the rules I’ve defined
so far are:
- replace all spaces with an underscore
- strip off leading/trailing commas and periods
- remove all single/double quotes
- remove all question marks
- & becomes the russian word for ‘and’
I’d appreciate any advice on my current strategy before I actually execute it, as there are thousands of articles.
Some comments/suggestions
Update:
* convert all characters to lower case (kinda subjective but it’s a common practice and more readable then all upper case)
One more suggestion, this is a trick which I used to check if my url path or slug do not contain any urlencoded string. After formulated the slug string, I will perform an urlencode and compare with the pre-encoded one. They should be the same.
Example for this post, if the slug happened to be “
permalink-slug:best-practices“, the urlencoded one will be “permalink-slug%3Abest-practices“, not really readable and defeats the purpose of slug which you will know/agree. So it’s really easy to know the orginal string don’t work because it is different after urlencoding.