I’m trying to design a data model for my current project, and I’m having trouble. The product owner wants to have Users, Searches, Documents, and Tags. Here’s what is known so far:
- Users have Searches – (these are not database searches, they are full-text searches that return a bunch of Documents)
- Searches have Documents <- However this need not be modeled. Search results will change over time.
- Users have Documents – any document that they have seen before from any search
- Tags are text strings
- A User will be able to mark a Document with any number of Tags, but other Users will not be able to see those Tags
There are two issues I’m trying to manage here:
- Whenever a User makes a Search, I need to mark the returned Documents differently according to whether or not the User has seen them before. How? Is it reasonable to just collect the Documents from the Search (which will be returned 10 at a time) and then issue another SQL query for all Documents belonging to the User which have one of the doc_ids from the Search? Seems clumsy.
- Since a User can give a Document several Tags, and since other Users can’t see them, how do I represent this in a database? How do I query it reasonably? I’m thinking that the Documents that belong to a User are represented by a join table – UserDocuments. And then several Tags belong to each UserDocuments row. But then how do I present the Tags in-line with the Search results? It seems to me that this involves issuing a Search, finding which of the returned Documents I have seen before, finding which of those have Tags, and then hitting the database again for each of these Documents to gather their Tags. This seems extreme, no?
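To make both issues concrete, here is a minimal sketch of the join-table idea. It assumes the seen-before check and the Tag lookup can be combined into one query per page of results, using a LEFT JOIN from the join table to Tags. The table names (`user_documents`, `tags`), the column names, and the `annotate_results` helper are all hypothetical, and it uses Python's built-in `sqlite3` instead of Rails only so that it runs on its own; the SQL is the part that matters.

```python
import sqlite3

# In-memory database standing in for the real one (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_documents (          -- join table: one row per (User, seen Document)
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    doc_id  INTEGER NOT NULL,
    UNIQUE (user_id, doc_id)
);
CREATE TABLE tags (                    -- Tags hang off the join row, so they stay private
    id               INTEGER PRIMARY KEY,
    user_document_id INTEGER NOT NULL REFERENCES user_documents(id),
    name             TEXT NOT NULL
);
""")

# Made-up sample data: user 1 has seen docs 101 and 102, user 2 has seen doc 101.
conn.executemany(
    "INSERT INTO user_documents (id, user_id, doc_id) VALUES (?, ?, ?)",
    [(1, 1, 101), (2, 1, 102), (3, 2, 101)],
)
conn.executemany(
    "INSERT INTO tags (user_document_id, name) VALUES (?, ?)",
    [(1, "ruby"), (1, "orm"), (3, "private")],
)


def annotate_results(conn, user_id, doc_ids):
    """Return {doc_id: {"seen": bool, "tags": [...]}} for one page of search results."""
    if not doc_ids:
        return {}
    placeholders = ",".join("?" * len(doc_ids))
    rows = conn.execute(
        f"""
        SELECT ud.doc_id, t.name
        FROM user_documents ud
        LEFT JOIN tags t ON t.user_document_id = ud.id
        WHERE ud.user_id = ? AND ud.doc_id IN ({placeholders})
        ORDER BY ud.doc_id, t.name
        """,
        [user_id, *doc_ids],
    ).fetchall()

    seen = {}
    for doc_id, tag_name in rows:
        tag_list = seen.setdefault(doc_id, [])
        if tag_name is not None:  # NULL means the doc was seen but has no Tags
            tag_list.append(tag_name)

    return {d: {"seen": d in seen, "tags": seen.get(d, [])} for d in doc_ids}


# One page of doc_ids as they might come back from the full-text search.
page = [101, 102, 103]
result = annotate_results(conn, 1, page)
```

With this shape, a page of 10 results costs one extra query regardless of how many Documents are tagged, and another User's Tags never appear because the join goes through that User's own `user_documents` rows.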
One thing that’s possibly confounding my understanding here is that I’m thinking in terms of the Ruby on Rails ActiveRecord ORM. I will eventually have to implement whatever solution I find within Rails.
One possible design: