I’ve heard/read the suggestion that joins on integer columns are more efficient than joins on varchar columns. Sometimes I’ve heard this qualified to: “joins on integer columns are more efficient than joins on long varchar columns.”
Can anyone comment on whether either statement is true and, if so, on some of the underlying reasons?
Any articles or references are welcome and appreciated. Thanks!
I’m not familiar with PostgreSQL, but I would expect this to be true on any database, for the simple reason that comparing integers is cheaper than comparing strings.
To do a join, the database has to compare key values, typically by searching the index on the key field. Searching an integer index should be quicker than searching a string index. Not only is there less data involved, but an integer comparison can be performed in a single CPU operation, whereas a string comparison has to walk the characters and may also apply case-sensitivity and localisation rules.
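To make the size and comparison argument concrete, here is a minimal Python sketch. It is not how PostgreSQL or any other database implements comparisons. The keys and the `bytes_examined` helper are made up for illustration, and the helper only models a naive byte-by-byte comparison with no locale rules at all.

```python
import struct

# Hypothetical keys: two neighbouring rows, identified once by an integer
# and once by a long string with a shared prefix.
int_key_a, int_key_b = 1_000_001, 1_000_002
str_key_a = "customer-account-europe-west-0001000001"
str_key_b = "customer-account-europe-west-0001000002"

def bytes_examined(a: bytes, b: bytes) -> int:
    """Count the bytes a naive left-to-right comparison reads before it can decide."""
    examined = 0
    for x, y in zip(a, b):
        examined += 1
        if x != y:
            return examined
    # No differing byte found in the shared length: every shared byte was read.
    return examined

# Storage: a 32-bit integer is 4 bytes; the string is one byte per ASCII character.
int_size = len(struct.pack("<i", int_key_a))
str_size = len(str_key_a.encode("utf-8"))

# Work: the two integers fit in one machine word each, so one compare settles it.
# The two strings only differ in their last character, so all of it is read.
str_work = bytes_examined(str_key_a.encode("utf-8"), str_key_b.encode("utf-8"))

if __name__ == "__main__":
    print(f"integer key: {int_size} bytes, string key: {str_size} bytes")
    print(f"bytes read to compare the two string keys: {str_work}")
```

Keys that share a long prefix are the worst case for the string side. A locale-aware collation would add further work on top of the byte walk shown here.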
This assumes that by ‘more efficient’ you mean a speed advantage on the order of microseconds. Of course, there may be architectural considerations that make joining on a string the better choice for the overall database design. But in general I steer clear of joining on anything other than integers.
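If you want to see how large the difference is for your own data, measuring is more reliable than guessing. The sketch below uses SQLite from the Python standard library purely as a stand-in, so its timings say nothing definite about PostgreSQL. The table names, key format and row count are all invented for the example.

```python
import sqlite3
import time

INT_JOIN = ("SELECT SUM(c.amount) FROM child_int c "
            "JOIN parent_int p ON p.id = c.parent_id")
STR_JOIN = ("SELECT SUM(c.amount) FROM child_str c "
            "JOIN parent_str p ON p.code = c.parent_code")

def build_tables(conn: sqlite3.Connection, rows: int = 5000) -> None:
    """Create two parent/child pairs: one keyed by integer, one by a long string."""
    conn.executescript(
        """
        CREATE TABLE parent_int (id INTEGER PRIMARY KEY, label TEXT);
        CREATE TABLE child_int  (parent_id INTEGER, amount INTEGER);
        CREATE TABLE parent_str (code TEXT PRIMARY KEY, label TEXT);
        CREATE TABLE child_str  (parent_code TEXT, amount INTEGER);
        """
    )
    for i in range(rows):
        # Hypothetical long string key with a shared prefix.
        code = f"customer-account-europe-west-{i:010d}"
        conn.execute("INSERT INTO parent_int VALUES (?, ?)", (i, "x"))
        conn.execute("INSERT INTO child_int VALUES (?, ?)", (i, i))
        conn.execute("INSERT INTO parent_str VALUES (?, ?)", (code, "x"))
        conn.execute("INSERT INTO child_str VALUES (?, ?)", (code, i))
    conn.commit()

def timed_join(conn: sqlite3.Connection, sql: str):
    """Run one join query and return (result, elapsed seconds)."""
    start = time.perf_counter()
    total = conn.execute(sql).fetchone()[0]
    return total, time.perf_counter() - start

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_tables(conn)
    for name, sql in (("integer", INT_JOIN), ("string", STR_JOIN)):
        total, seconds = timed_join(conn, sql)
        print(f"{name} join: sum={total}, {seconds * 1000:.2f} ms")
```

Both queries return the same sum, so any gap in the timings comes from the key type alone. Run it several times and with realistic row counts before drawing conclusions, since a single run on a small in-memory table is noisy.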