I have a MySQL table that stores user statistics, with more than 2 million rows, 8 columns, and an index on the userID. When a user visits his profile, a lot of that information is retrieved from the database, resulting in, in the worst case, a couple of dozen SELECT queries, some of them joined with other tables. It is similar to the profile on SO, which has to pull a lot of data as well.
Some queries, such as the one that gets the user score, require a COUNT and other expensive MySQL functions. So the queries for the profile page alone can take 10-20 seconds.
Now my questions:
- How does a website like SO pull so much information that quickly?
- Do I need a caching layer?
- Should I precalculate the count of the score, etc. that consume MySQL performance?
- Should I use one write-optimized table and one read-optimized table? If so, how could I retrieve live data like on SO?
- Should I move away from MySQL?
They give you an illusion of pulling “so much” data, but in reality they fetch one chunk at a time. Basically, you need one query to get the COUNT of the results and another query to get the rows for the requested page within that result set. The result set itself is not cached. The query parameters may be cached, but it is best to supply them with every page request.
Maybe, but it is not necessary for this. Caching can be useful for storing previously fetched pages when the same pages are viewed by the entire community rather than by a single user.
Not relevant for this use case.
Not required.
Not required.
For more details on this concept, look up paginated Ajax grids.
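As a rough illustration of the count-plus-page approach from the first answer, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for MySQL so that it runs on its own; the table name `user_stats`, its columns, the page size, and the helper names are all hypothetical. The `LIMIT ... OFFSET ...` form also works in MySQL, though the placeholder style depends on the driver you use.

```python
import sqlite3

PAGE_SIZE = 10  # hypothetical page size


def make_demo_db():
    """Build a small in-memory table that stands in for the real stats table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_stats ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, score INTEGER)"
    )
    conn.execute("CREATE INDEX idx_user_stats_user ON user_stats (user_id)")
    rows = [(42, i) for i in range(25)] + [(7, i) for i in range(5)]
    conn.executemany(
        "INSERT INTO user_stats (user_id, score) VALUES (?, ?)", rows
    )
    return conn


def count_rows(conn, user_id):
    """Query 1: total number of rows, used to render the pager."""
    cur = conn.execute(
        "SELECT COUNT(*) FROM user_stats WHERE user_id = ?", (user_id,)
    )
    return cur.fetchone()[0]


def fetch_page(conn, user_id, page, page_size=PAGE_SIZE):
    """Query 2: only the rows for the requested page (pages start at 1)."""
    offset = (page - 1) * page_size
    cur = conn.execute(
        "SELECT id, score FROM user_stats "
        "WHERE user_id = ? ORDER BY id LIMIT ? OFFSET ?",
        (user_id, page_size, offset),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    total = count_rows(conn, 42)
    pages = (total + PAGE_SIZE - 1) // PAGE_SIZE  # ceiling division
    print(f"{total} rows, {pages} pages")
    print(fetch_page(conn, 42, page=3))
```

The user ID and page number are passed in with every call, which matches the advice to supply the query parameters with each page request instead of keeping a result set around on the server.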