I have these tables:
Projects(projectID, CreatedByID)
Employees(empID, depID)
Departments(depID, OfficeID)
Offices(officeID)
CreatedByID is a foreign key for Employees. I have a query that runs for almost every page load.
Is it bad practice to just add a redundant OfficeID column to Projects to eliminate the three joins? Or should I do the following:
SELECT *
FROM Projects P
JOIN Employees E ON P.CreatedByID = E.EmpID
JOIN Departments D ON E.DepID = D.DepID
JOIN Offices O ON D.officeID = O.officeID
WHERE O.officeID = @SomeOfficeID
In application programming I follow ‘write with best practices first and optimize afterwards’, but database administrators are always warning about the cost of joins.
Denormalization has the advantage of fast SELECTs on large queries.
Disadvantages are:
It takes more coding and time to ensure integrity (which is most important in your case)
It’s slower on DML (INSERT/UPDATE/DELETE)
It takes more space
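To make the integrity cost concrete, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The schema follows the question's tables with the redundant OfficeID column added to Projects; the sample rows and the helper name stale_projects are assumptions made up for illustration, and SQLite stands in for whatever RDBMS is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Offices     (officeID INTEGER PRIMARY KEY);
    CREATE TABLE Departments (depID INTEGER PRIMARY KEY,
                              officeID INTEGER REFERENCES Offices(officeID));
    CREATE TABLE Employees   (empID INTEGER PRIMARY KEY,
                              depID INTEGER REFERENCES Departments(depID));
    -- OfficeID here is the redundant (denormalized) copy
    CREATE TABLE Projects    (projectID INTEGER PRIMARY KEY,
                              CreatedByID INTEGER REFERENCES Employees(empID),
                              OfficeID INTEGER REFERENCES Offices(officeID));
    -- Illustrative sample data (not from the question)
    INSERT INTO Offices VALUES (1), (2);
    INSERT INTO Departments VALUES (10, 1);
    INSERT INTO Employees VALUES (100, 10);
    INSERT INTO Projects VALUES (1000, 100, 1);
""")

def stale_projects(conn):
    """Projects whose redundant OfficeID disagrees with the normalized path."""
    return conn.execute("""
        SELECT P.projectID
        FROM Projects P
        JOIN Employees E   ON P.CreatedByID = E.empID
        JOIN Departments D ON E.depID = D.depID
        WHERE P.OfficeID <> D.officeID
    """).fetchall()

# Department 10 moves to office 2; only the normalized table is updated.
conn.execute("UPDATE Departments SET officeID = 2 WHERE depID = 10")
stale_after_move = stale_projects(conn)  # the copy in Projects is now wrong

# The extra DML that denormalization obliges you to write and maintain.
conn.execute("""
    UPDATE Projects
    SET OfficeID = (SELECT D.officeID
                    FROM Employees E
                    JOIN Departments D ON E.depID = D.depID
                    WHERE E.empID = Projects.CreatedByID)
""")
stale_after_fix = stale_projects(conn)

print(stale_after_move, stale_after_fix)
```

In a real system the second UPDATE would have to live in a trigger or in every code path that moves a department or reassigns an employee, which is where the extra coding comes from.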
As for optimization, you may optimize either for faster querying or for faster DML (as a rule, these two are antagonists).
Optimizing for faster querying often implies duplicating data, be it denormalization, indices, extra tables or whatever.
In the case of indices, the RDBMS does it for you, but in the case of denormalization, you’ll need to code it yourself. What if a Department moves to another Office? You’ll need to fix it in three tables instead of one.
So, as far as I can see from the names of your tables, there won’t be millions of records there. You’d better normalize your data; it will be simpler to manage.
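As a rough illustration of the normalized route, the sketch below keeps the schema from the question and adds indexes on the join columns instead of a redundant column. It uses Python's sqlite3 module with an in-memory database; the index names, the sample rows and the helper projects_for_office are hypothetical, and the parameter placeholder is SQLite's ? rather than @SomeOfficeID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Offices     (officeID INTEGER PRIMARY KEY);
    CREATE TABLE Departments (depID INTEGER PRIMARY KEY, officeID INTEGER);
    CREATE TABLE Employees   (empID INTEGER PRIMARY KEY, depID INTEGER);
    CREATE TABLE Projects    (projectID INTEGER PRIMARY KEY, CreatedByID INTEGER);

    -- Indexes on the join columns; the engine keeps them in sync on every write.
    CREATE INDEX ix_departments_office ON Departments(officeID);
    CREATE INDEX ix_employees_dep      ON Employees(depID);
    CREATE INDEX ix_projects_createdby ON Projects(CreatedByID);

    -- Illustrative sample data (not from the question)
    INSERT INTO Offices VALUES (1), (2);
    INSERT INTO Departments VALUES (10, 1), (20, 2);
    INSERT INTO Employees VALUES (100, 10), (200, 20);
    INSERT INTO Projects VALUES (1000, 100), (1001, 200), (1002, 100);
""")

QUERY = """
    SELECT P.projectID
    FROM Projects P
    JOIN Employees E   ON P.CreatedByID = E.empID
    JOIN Departments D ON E.depID = D.depID
    JOIN Offices O     ON D.officeID = O.officeID
    WHERE O.officeID = ?
    ORDER BY P.projectID
"""

def projects_for_office(conn, office_id):
    """Return the IDs of projects created by employees of the given office."""
    return [row[0] for row in conn.execute(QUERY, (office_id,))]

office1 = projects_for_office(conn, 1)

# A department move is a single-row update; no other table needs fixing.
conn.execute("UPDATE Departments SET officeID = 2 WHERE depID = 10")
office1_after = projects_for_office(conn, 1)
office2_after = projects_for_office(conn, 2)

# Print the plan to check whether the indexes are used;
# the exact wording varies between SQLite versions.
for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY, (1,)):
    print(row)
```

Whether the joins are actually a bottleneck is something to measure on the real data and engine; the sketch only shows that the normalized form stays correct after a move with one UPDATE.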