I have a MySQL database with 4 tables:
job
job_application
client
candidate
Each table has its own primary key, i.e. job_id, job_application_id, client_id, candidate_id.
Employers in the client table can post jobs in the job table. The job table contains a client_id field which identifies the client.
Candidates in the candidate table can apply for a job, which inserts a row into the job_application table. The job_application table contains a job_id field and a candidate_id field to identify what the job is and who applied for it.
I’ve run into a bit of a problem writing the queries that let Employers manage the job applications they receive. As an example, here is a function I wrote that deletes rows from job_application:
public function deleteJobApplications($job_application_ids) {
    // Delete the given applications, but only those attached to jobs posted by the current client.
    $this->db->query("DELETE ja.* FROM " . DB_PREFIX . "job_application ja LEFT JOIN " . DB_PREFIX . "job j ON (j.job_id = ja.job_id) WHERE ja.job_application_id IN ('" . implode("','", array_map('intval', $job_application_ids)) . "') AND j.client_id = '" . (int)$this->client->getClientId() . "'");
}
Because the client_id is only referenced in the job table, I need to LEFT JOIN the job table every time I want to UPDATE rows in or DELETE rows from the job_application table.
Should I add another client_id field to the job_application table, essentially duplicating data already held in the database, or continue with the LEFT JOIN for every UPDATE and DELETE?
Your problem isn’t that you need to denormalize “job_applications” by introducing “client_id” as a redundant column. (The currently accepted answer is factually incorrect in that regard.) Your problem is that you didn’t normalize correctly to begin with. If you had, the column “client_id” would already be in that table, and your problem would never have arisen.
Let’s pretend that candidate names, client names, and job names are globally unique.
A table with the three columns “client_name”, “job_name”, and “candidate_name”, in that order, will satisfy the predicate: Person named “candidate_name” applies for “job_name” at company “client_name”.
Three columns, no id numbers, no nulls, no nonprime attributes, all key. This relation is in 6NF.
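A minimal sketch of such an all-key table in MySQL might look like the following. The column sizes and the sample rows are assumptions made up for illustration; only the three column names and the "all key" property come from the description above.

```sql
-- Sketch only: VARCHAR sizes and all sample values are hypothetical.
CREATE TABLE job_applications (
    client_name    VARCHAR(100) NOT NULL,
    job_name       VARCHAR(100) NOT NULL,
    candidate_name VARCHAR(100) NOT NULL,
    -- Every column is part of the key: no nonprime attributes, no nulls.
    PRIMARY KEY (client_name, job_name, candidate_name)
);

-- Made-up rows; each one reads as
-- "Person named candidate_name applies for job_name at company client_name".
INSERT INTO job_applications (client_name, job_name, candidate_name) VALUES
    ('Acme Ltd',   'Welder',     'Alice'),
    ('Acme Ltd',   'Welder',     'Bob'),
    ('Globex Inc', 'Accountant', 'Alice');
```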
It should be obvious that you could create a table for jobs (or job offers) by selecting distinct values from the first two columns. The foreign key reference is obvious.
In a similar way, you can select distinct values from the first column alone for a set of companies, and from the last column alone for a set of applicants. Again, the foreign key references should be obvious.
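The projections described above could be sketched in MySQL roughly as follows. This assumes the three-column table is named job_applications with columns client_name, job_name, and candidate_name; the names clients, jobs, and candidates for the derived tables are hypothetical.

```sql
-- Assumed starting point (skipped if the table already exists).
CREATE TABLE IF NOT EXISTS job_applications (
    client_name    VARCHAR(100) NOT NULL,
    job_name       VARCHAR(100) NOT NULL,
    candidate_name VARCHAR(100) NOT NULL,
    PRIMARY KEY (client_name, job_name, candidate_name)
);

-- Companies: distinct values of the first column alone.
CREATE TABLE clients (
    client_name VARCHAR(100) NOT NULL,
    PRIMARY KEY (client_name)
);
INSERT INTO clients (client_name)
SELECT DISTINCT client_name FROM job_applications;

-- Jobs: distinct values of the first two columns.
CREATE TABLE jobs (
    client_name VARCHAR(100) NOT NULL,
    job_name    VARCHAR(100) NOT NULL,
    PRIMARY KEY (client_name, job_name),
    FOREIGN KEY (client_name) REFERENCES clients (client_name)
);
INSERT INTO jobs (client_name, job_name)
SELECT DISTINCT client_name, job_name FROM job_applications;

-- Applicants: distinct values of the last column alone.
CREATE TABLE candidates (
    candidate_name VARCHAR(100) NOT NULL,
    PRIMARY KEY (candidate_name)
);
INSERT INTO candidates (candidate_name)
SELECT DISTINCT candidate_name FROM job_applications;

-- The foreign key references from the applications table.
ALTER TABLE job_applications
    ADD FOREIGN KEY (client_name, job_name) REFERENCES jobs (client_name, job_name),
    ADD FOREIGN KEY (candidate_name) REFERENCES candidates (candidate_name);
```

Note that the reference to jobs uses both columns, so the client is part of what an application points at rather than something looked up through the job.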
All those tables are in 6NF.
Augmenting a table with a surrogate key in addition to its natural keys doesn’t change the normal form when you do it correctly. Let’s replace the natural keys in “job_applications” with your surrogate ID numbers. Making that replacement leaves the table with the columns “client_id”, “job_id”, and “candidate_id”. (In practice, you’d do the same thing in the other tables, too.)
Note that client_id is already in there. If there are no other columns, you’re still in at least 5NF.
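As a rough sketch, the surrogate-key version could be declared in MySQL like this, using the table and column names from the question (without the DB_PREFIX) on a scratch database. The extra UNIQUE key on job and the composite foreign key are one possible way to guarantee that an application's client_id always matches its job's client; they are an assumption of this sketch, not something spelled out above. The IDs in the final DELETE are made up.

```sql
CREATE TABLE client (
    client_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    PRIMARY KEY (client_id)
);

CREATE TABLE candidate (
    candidate_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    PRIMARY KEY (candidate_id)
);

CREATE TABLE job (
    job_id    INT UNSIGNED NOT NULL AUTO_INCREMENT,
    client_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (job_id),
    -- Gives the composite foreign key below something to reference.
    UNIQUE KEY job_client (job_id, client_id),
    FOREIGN KEY (client_id) REFERENCES client (client_id)
);

CREATE TABLE job_application (
    job_application_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    client_id          INT UNSIGNED NOT NULL,
    job_id             INT UNSIGNED NOT NULL,
    candidate_id       INT UNSIGNED NOT NULL,
    PRIMARY KEY (job_application_id),
    -- The surrogate-key counterpart of the all-key natural table.
    UNIQUE KEY application (client_id, job_id, candidate_id),
    -- An application can only name the client that actually owns the job.
    FOREIGN KEY (job_id, client_id) REFERENCES job (job_id, client_id),
    FOREIGN KEY (candidate_id) REFERENCES candidate (candidate_id)
);

-- With client_id in job_application, the delete from the question no longer
-- needs a join (101, 102 and 7 are hypothetical IDs).
DELETE FROM job_application
WHERE job_application_id IN (101, 102)
  AND client_id = 7;
```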