for example a website where users can vote on questions (up and down) with this model:
users(id)
questions(id)
votes(id,userId,questionid,vote)
-- vote can be +1 or -1, or probably better bit 0 and 1
I’m thinking that when there are going to be lots of questions with lots of votes, there are going to be performance issues, especially when showing the list of questions, so
would it make sense to add a column to the questions like this:
questions(id, votessum)
and each time somebody votes besides doing an insert into the votes questions also to update the questions and set it’s votessum column
I would say that no, out of the gate I would not have such a column on the question table. I would determine via actual or estiamted usage plus performance testing whether the queries required to aggregate this data would become a performance bottleneck.
If the query did create a bottleneck, I would first look at other optimizations, like improving indexes and materialized views before doing this kind of denormalization.
Denormalization is sometime useful (and sometimes necessary), but it has its own problems and pitfalls that make fully exploring other options worthwhile.