I have searched for a solution for this problem, but haven’t found it (yet), probably because I don’t quite know how to explain it properly myself. If it is posted somewhere already, please let me know.
What I have is three databases that are related to each other; main, pieces & groups. Basically, the main database contains the most elementary/ most used information from a post and the pieces database contains data that is associated with that post. The groups database contains all of the (long) names of the groups a post in the main database can be ‘posted in’. A post can be posted in multiple groups simultaneously. When a new post is added to my site, I check the pieces too see if there are any duplicates (check if the post has been posted already). In order to make the search for duplicates more effective, I only check the pieces that are posted in the same group(s).
Hopefully you’re still with me, cause here’s where it starts to get really confusing I think (let me know if I need to specify things more clearly): right now, both the main and the pieces database contain the full name of the group(s) (basically I’m not using the groups database at all). What I want to do is replace the names of those groups with their associated IDs from the groups database. For example, I want to change this:
from:
MAIN_table:
id | group_posted_in
——–|—————————
1 | group_1, group_5
2 | group_15, group_75
3 | group_1, group_215
GROUPS_table:
id | group_name
——–|—————————
1 | group_1
2 | group_2
3 | group_3
etc…
into:
MAIN_table:
id | group_posted_in
——–|—————————
1 | 1,5
2 | 15,75
3 | 1,215
Or something similar to this. However, This format specifically causes issues as the following query will return all of the rows (from the example), instead of just the one I need:
SELECT * FROM main_table WHERE group = '5'
I either have to change the query to something like this:
...WHERE group = '5' OR group = '5,%' OR group = '%,5,%' OR group = '%,5'
Or I have to change the database structure from Comma Separated Values to something like this: [15][75]. The accompanying query would be simpler, but it somehow seems like a cumbersome solution to me. Additionally, (simple) joins will not be easy/ possible at all. It will always require me to run a separate query to fetch the names of the groups–whether a user searches for posts in a specific group (in which case, I first have to run a query to fetch the id’s, then to search for the associated posts), or whether it is to display them (first the posts, then another query to match the groups).
So, in conclusion: I suppose I know there is a solution to this problem, but my gut tells me that it is not the right/ best way to do it. So, I suppose the question that ties this post together is:
What is the correct method to connect the group database to the others?
For a many-to-many relationship, you need to create a joining table. Rather than storing a list of groups in a single column, you should split that column out into multiple rows in a separate table. This will allow you to perform set based functions on them and will significantly speed up the database, as well as making it more robust and error proof.
So, for MainID 1, you would have GroupsInMain records:
This associates groups 1 and 5 with MainID 1
FK in this case means a Foreign Key (i.e. a reference to a primary key in another table). You’d probably also want to add a unique constraint to GroupsInMain on MainID and GroupID, since you’d never want the same values for the pairing to show up more than once.
Your query would then be: