I need to design a database for something like a downloads site. I want to keep track of users and the programs each user downloaded, and also allow users to rate and comment on those programs. The things I need from this database: get the average rating for a program, get all comments for a program, and know exactly which program was downloaded by whom (I don't care how many times each program was downloaded, but I want to know, for each user, which programs he has downloaded). Maybe also count the number of comments for each program, and that's about it (it's a very small project for personal use that I want to keep simple).
I came up with these entities:
User(uid,uname etc)
Program(pid,pname)
And the following relationships:
UserDownloadedProgram(uid,pid,timestamp)
UserCommentedOnProgram(uid,pid,commentText,timestamp)
UserRatedProgram(uid,pid,rating)
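To make the proposal concrete, here is a minimal sketch of those five tables in SQLite, driven from Python's built-in sqlite3 module. The column types, the composite primary keys, and the 1 to 5 rating range are assumptions for illustration; none of them are stated above.

```python
import sqlite3

# Hypothetical schema sketch: types, keys and the rating range are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE User (
    uid   INTEGER PRIMARY KEY,
    uname TEXT NOT NULL
);
CREATE TABLE Program (
    pid   INTEGER PRIMARY KEY,
    pname TEXT NOT NULL
);
-- One row per (user, program): repeat downloads are not counted.
CREATE TABLE UserDownloadedProgram (
    uid       INTEGER NOT NULL REFERENCES User(uid),
    pid       INTEGER NOT NULL REFERENCES Program(pid),
    timestamp TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (uid, pid)
);
-- No composite key here, so a user may comment on a program more than once.
CREATE TABLE UserCommentedOnProgram (
    uid         INTEGER NOT NULL REFERENCES User(uid),
    pid         INTEGER NOT NULL REFERENCES Program(pid),
    commentText TEXT NOT NULL,
    timestamp   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- One rating per user per program (assumed 1 to 5).
CREATE TABLE UserRatedProgram (
    uid    INTEGER NOT NULL REFERENCES User(uid),
    pid    INTEGER NOT NULL REFERENCES Program(pid),
    rating INTEGER NOT NULL CHECK (rating BETWEEN 1 AND 5),
    PRIMARY KEY (uid, pid)
);
""")
```

With the composite key on UserDownloadedProgram, a second download of the same program by the same user is rejected, which matches the "I don't care how many times" requirement; dropping that key would keep every download event instead.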
Why I chose it this way: the relationships (user downloads, user comments and rates) are many-to-many. A user downloads many programs and a program is downloaded by many users. The same goes for comments and ratings (a user comments on or rates many programs, and a program is commented on or rated by many users). The best practice, as far as I know, is to create a third table (a relationship table) that sits on the "many" side of a one-to-many relationship with each of the two entities.
I suppose that in this design the average rating and comment retrieval are done by join queries or something similar.
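As a rough illustration of the kind of join query meant here, the sketch below fetches all comments for one program together with the commenter's name. It uses Python's built-in sqlite3 module; the sample rows and the helper name comments_for_program are made up for the example.

```python
import sqlite3

# Minimal tables and made-up sample rows, just enough to run the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User (uid INTEGER PRIMARY KEY, uname TEXT);
CREATE TABLE UserCommentedOnProgram (
    uid INTEGER, pid INTEGER, commentText TEXT, timestamp TEXT
);
INSERT INTO User VALUES (1, 'alice'), (2, 'bob');
INSERT INTO UserCommentedOnProgram VALUES
    (1, 10, 'Works well',       '2024-01-01 10:00:00'),
    (2, 10, 'Crashes on start', '2024-01-02 09:30:00'),
    (1, 11, 'Too slow',         '2024-01-03 12:00:00');
""")

def comments_for_program(conn, pid):
    """Return (uname, commentText, timestamp) rows for one program, oldest first."""
    return conn.execute(
        """
        SELECT u.uname, c.commentText, c.timestamp
        FROM UserCommentedOnProgram AS c
        JOIN User AS u ON u.uid = c.uid   -- join only to resolve the user's name
        WHERE c.pid = ?
        ORDER BY c.timestamp
        """,
        (pid,),
    ).fetchall()

comments = comments_for_program(conn, 10)
```

The join is only needed to show the user's name next to each comment; the average rating for a program can be read from UserRatedProgram alone, without any join.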
I’m a total noob in database design, but I try to adhere to best practices. Is this design more or less OK, or am I overlooking something?
I can definitely think of other possibilities: maybe comment and/or rating could be an entity (table) by itself, with the relationships then being between three entities. I’m not really sure what the benefits or drawbacks of that are. I know that I don’t really care about the comments or the ratings as such; I only want to display them where appropriate and maintain them (delete when needed). So how do I know whether they should become entities themselves?
Any thoughts?
You would create new entities as dictated by the rules of normalization. There is no particular reason to make an additional (separate) table for comments because you already have one. Who made the comment and which program the comment applied to are full-fledged attributes of a comment. The foreign keys representing these relationships (which are many-to-one, from the perspective of the comment table) belong right where you’ve put them.
The tables you’ve proposed are in third normal form, which is acceptable according to best practices. I would add that you seem to be tracking data on a transactional basis (i.e. recording events as and when they occur). That is a good practice too, because you can always derive whatever you need from the detailed information.
Calculating the number of downloads or the number of comments is a simple matter of using SQL aggregate functions with filters on the foreign key(s) that apply to your query, e.g. WHERE pid = 1234, etc.
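A minimal sketch of such aggregate queries, run through Python's built-in sqlite3 module against the tables from the question. The sample rows and the program id 1234 are made up for illustration.

```python
import sqlite3

# Minimal tables and made-up sample rows, just enough to run the aggregates.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UserRatedProgram (
    uid INTEGER, pid INTEGER, rating INTEGER, PRIMARY KEY (uid, pid)
);
CREATE TABLE UserCommentedOnProgram (
    uid INTEGER, pid INTEGER, commentText TEXT, timestamp TEXT
);
INSERT INTO UserRatedProgram VALUES (1, 1234, 4), (2, 1234, 5), (1, 5678, 2);
INSERT INTO UserCommentedOnProgram VALUES
    (1, 1234, 'Nice',  '2024-01-01 10:00:00'),
    (2, 1234, 'Great', '2024-01-02 11:00:00');
""")

# Average rating for one program: AVG filtered on the foreign key.
avg_rating = conn.execute(
    "SELECT AVG(rating) FROM UserRatedProgram WHERE pid = ?", (1234,)
).fetchone()[0]

# Number of comments for one program: COUNT filtered the same way.
comment_count = conn.execute(
    "SELECT COUNT(*) FROM UserCommentedOnProgram WHERE pid = ?", (1234,)
).fetchone()[0]

# Comment counts for every program that has comments, in one query.
counts_per_program = conn.execute(
    "SELECT pid, COUNT(*) FROM UserCommentedOnProgram GROUP BY pid ORDER BY pid"
).fetchall()
```

Note that AVG returns NULL (None in Python) for a program with no ratings, and the GROUP BY query omits programs that have no comments at all.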