This might be deleted, since involves idea sharing which is not quite allowed in stack overflow, but still before that if I could get any ideas from solid programmers, it will be a win situation for me
Assume that you have a class Student, stored in the database, and this class has a list property called favoriteTeachers. This list constantly gets updated by the system and involves the id of teachers.
You also have a class Teacher, also stored in database and likewise has a list property favouriteStudents. It is again updated constantly and involves the id’s of students.
In our system, when a student calls a function (let’s say notMyFavoriteTeacher), our system has to apply the changes below;
- Delete the given teacher’s id from favouriteTeacher list
- Delete the student’s id from given teacher’s favouriteStudent list
I’ve tried to consider the number of rows updated could exhaust the database so instead of mapping the students with their favorite teachers in a separate table as user_id, teacher_id, instead I created a column and stored a string which contains the teachers id’s separated by comma. (Ex: “1,2,14,4,25”). Same applied for the teacher as well.
However when we call this function, we also face another problem. In order for this operation to be done, you need to convert the string to list, find the element by linear search and later on delete, and later on convert list to string and push back to db. And you have to do the other operation for the teacher class as well. If we did not apply the string method, deletion would be easier but since we would be handling deletion and addition operations for like 2k times a day, i did not think it would be feasible to use separate tables.
I wanted to ask in order to decrease the number of operations, could a data structure be chosen such that it would increase the efficiency?
Storing an relation as an array in a single column is a violation of first normal form, and should not be done without good reason. Although various forms of denormalization may result in increased efficiency in some cases, I don’t see this case being one of those. What’s worse, you’ll get no help from the database in enforcing referential integrity. And some operations will result in guaranteed row scans: When deleting a teacher, you will have to examine every row of every student to remove the teacher from each student’s favorite list. Same goes for deleting a student.
Relational Databases are designed and built to link rows to other rows. You need a very good reason to keep them from doing what they’re design to do. You should go ahead and design a proper relational schema, and only if actual measurement shows that it is too slow should you worry about its performance.