I’ve got a database that stores hash values and a few pieces of data about the hash, all in one table. One of the fields is ‘job_id’, which is the ID for the job that the hash came from.
The problem I’m trying to solve is that with this design, a hash can only belong to one job – in reality a hash can occur in many jobs, and I’d like to know each job in which a hash occurs.
The way I’m thinking of doing this is to create a new table called ‘Jobs’, with fields ‘job_id’, ‘job_name’ and ‘hash_value’. When a new batch of data is inserted into the DB, the job ID and name would be created here, and each hash would go into the original hash table as before, but it would also be stored against the job in the Jobs table.
I don’t like this, because I’d be duplicating the hash column across tables. Is there a better way? I can add to the hash table but can’t take away any columns because closed-source software depends on it. The hash value is the primary key. It’s MySQL and the database stores many millions of records. Thanks in advance!
Adding the new ‘job’ table is the way to go. It’s the normative practice for representing a one-to-many relationship.
It’s good to avoid unnecessary duplication of values. But in this case, you aren’t really “duplicating” the ‘hash_value’ column; rather, you are really defining a relationship between ‘job’ and the table that has ‘hash_value’ as the primary key.
The relationship is implemented by adding a column to the child table; that column holds the primary key value from the parent table. Typically, we add a FOREIGN KEY constraint on the column as well.
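To make that concrete, here is a minimal sketch of the child table and its foreign key. It runs against an in-memory SQLite database through Python's sqlite3 module purely so it can be executed as-is; SQLite is a stand-in for MySQL here, and the parent table name ‘hashes’, the composite primary key, the column types, and the helper jobs_for_hash are all assumptions for illustration, not details from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- Parent table: stands in for the existing hash table (name is hypothetical).
CREATE TABLE hashes (
    hash_value TEXT PRIMARY KEY
);

-- Child table: one row per (job, hash) pair.
CREATE TABLE job (
    job_id     INTEGER NOT NULL,
    job_name   TEXT    NOT NULL,
    hash_value TEXT    NOT NULL,
    PRIMARY KEY (job_id, hash_value),  -- assumed: a hash is listed once per job
    FOREIGN KEY (hash_value) REFERENCES hashes (hash_value)
);
""")

# Made-up sample data: hash 'aaa' occurs in two jobs, 'bbb' in one.
conn.executemany("INSERT INTO hashes (hash_value) VALUES (?)",
                 [("aaa",), ("bbb",)])
conn.executemany(
    "INSERT INTO job (job_id, job_name, hash_value) VALUES (?, ?, ?)",
    [(1, "job one", "aaa"), (1, "job one", "bbb"), (2, "job two", "aaa")],
)


def jobs_for_hash(conn, hash_value):
    """Return the IDs of every job in which the given hash occurs."""
    rows = conn.execute(
        "SELECT job_id FROM job WHERE hash_value = ? ORDER BY job_id",
        (hash_value,),
    )
    return [row[0] for row in rows]


print(jobs_for_hash(conn, "aaa"))
```

The existing hash table keeps its columns untouched; only the new table carries the link to the job. The same table shape should carry over to MySQL, though the column types and index choices would need adjusting for the real data.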