There are microblogging posts, and votes/emoticons associated with them, both in MySQL innoDB tables.
There is a requirement for two types of pages:
(A) Listing page containing many microblogs along with their votes counts/emoticons counts on single page ( say 25 ).
E.g.
THE GREAT FUNNY POST
Not so funny content in a meant to be funny post. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus euismod consequat pellentesque. …..READ MORE….
(3) likes, (5) bored, (7) smiled
. + 24 More posts on same page.
(B) Permalink page containing a single microblog with detailed vote+vote counts/ emoticons.
THE GREAT FUNNY POST
Not so funny content in a meant to be funny post. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus euismod consequat pellentesque. Quisque viverra adipiscing auctor. Mauris ut diam risus, in fermentum elit. Aliquam urna lectus, egestas sit amet cursus et, auctor ut elit. Nulla tempus suscipit nisi, nec condimentum dui fermentum non. In eget lacus mi, ut placerat nisi.
(You, Derp and 1 more like this), (5) bored, (7) smiled
1st approach:
Table#1:
post_id | post_content | post_title | creation_time
Table#2 for storing votes, likes, emoticons:
action_id | post_id | action_type | action_creator | creation_time
To display a page of posts, or a single post. First table is queried to get the posts,
second is queried to get all the actions related to the posts. Whenever a vote etc is done, an insert is made into the post_actions table.
2nd approach:
Table#1:
post_id | post_content | post_title | creation_time | action_data
Where action_data can be something like { "likes" : 3,"smiles":4 ...}
Table#2:
action_id | post_id | action_type | action_creator | creation_time
To display a page of posts, only first table is queried to get the posts & action data,
to display individual post with detailed actions, second table is queried to get all the actions related to the posts. Whenever a vote etc is done, an insert is made into the post_actions table, and action_data
field of table#1 is updated to store updated count.
Assuming there are 100K posts, and 10x actions I.e. 1 million or more actions created.
Does approach#2 provide a benefit? Any downsides of it apart from having to read, modify and update JSON information?
Is there anyway in which approach#2 can be followed and further improved?
Adding more information based on feedback:
- Python scripts will be reading, writing data.
- MySQL DB servers will be different from web servers.
- Writes due to post creation are low I.e. 10000 per day. But those due to actions can be higher, assume maximum 50 writes per second due to actions like voting, liking, emoticon.
- My concern is about read/write performance comparison of both and gotchas of the second approach, and where it may fall short in future.
I would recommend either storing all likes/votes data (aggregated and atomic) inside of the table 1 and discard table 2 completely OR to use 2 tables without aggregated data while relying on a
JOINsyntax, clever queries and good indexes.Why? Because else you will be querying and writing into both tables all the time when a comment/vote/like is made. Assuming 10 actions per post that are merely for displaying interaction, I’d really store it all into 1 table, maybe making an extra column for each kind of action. You can use JSON or simply
serialize()on the arrays, which should be a bit faster.Which solution you pick in the end will be highly dependant on how many actions you get and how you want to use them. Getting all actions for 1 post is easy with solution 1 and very fast but searching inside would be a mess. On the other hand, solution 2 takes more space, careful query-writing and indexes.