I need to migrate a game score table from (don’t laugh, please…) an *.ini “database” to SQL tables, as I want to move the whole game to a MySQL database.
The need is to store user scores in the database so that I can retrieve result tables for overall/year/month/week/day time spans. Each year, that takes {year score} + {month score}*12 + {week score}*52 + {day score}*360 = ~425 rows per user, plus 1 {overall score} row per user. That does not feel optimized, which is why I am here with this question.
What should the table design be based on? The approach mentioned above is based on time spans, using a structure like this:
{timespan type} {timespan} {user ID} {score for type 1} {score for type 1} {score for type 1} {score for type 2}
Another thing I need to note: different score types are sorted in different ways. The first type is a plain score, where I take the highest (more is better), while the second type is speed, where I sort by the fastest (smaller is better).
If your question is “Why separate rows for weeks/months/years/overall?”, the answer is that I want a fast way to get a results table: for example, the first 3 places for score type 2 for last week.
I was thinking that maybe I could store only {day score}, getting rid of those 426-360=66 rows per user in the current structure, which leads to a new structure:
{user ID} {day number} {score for type 1} {score for type 1} {score for type 1} {score for type 2}
But how would I then get “first 3 places for best speed score for the previous week”? That would take some multi-level calculations…
- Query for the day rows that fall into the time span of the previous week, for ALL users
- Sum all scores for each {user ID}, or get the lowest score (depending on score type), and put the result in a new table
- Query the new table, sorting by the score column ASC or DESC (depending on score type), and retrieve the first 3 rows
- Repeat steps 2 & 3 for each score column
If I want to do that more than 1-3 times a minute (after each new input in the table, I need to evaluate how many points a specific user needs to reach a higher standing), I imagine that would take some serious server resources. With the structure given at the top of the question (time span based), it only takes step 3 for each score column.
Thanks for answers & suggestions!
The current format of data follows:
Overall file: overall.ini
(in folder of yearnumber) Year file: 2011.ini
(in folder of yearnumber) Month files: m1.ini ... m12.ini
(in folder of yearnumber) Week files: n1.ini ... n52.ini
(in folder of yearnumber) Day files: m1d1.ini ... m12d1.ini
Data stored inside:
[~REZ~]
User13245325=1145 203.433 3 1.735
User3425435=1412 173.871 8 2
User32487854=18 76.253 1 11.016
User345645=2153 155.139 8 2.344
User65875=100 67.767 2 10.016
User453325=26 138.568 1 3.031
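For the migration itself, a file in this format can be read into per-user rows before loading them into SQL. The following is only a minimal Python sketch, not the game's actual loader: the function name `parse_scores` is hypothetical, and it assumes every key is `User` followed by a numeric ID and every value is a space-separated list of numbers, as in the sample above.

```python
import configparser

# Two lines copied from the sample above.
SAMPLE = """\
[~REZ~]
User13245325=1145 203.433 3 1.735
User65875=100 67.767 2 10.016
"""

def parse_scores(text, section="~REZ~"):
    """Return a list of (user_id, [score values]) from one score file."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the original key case ("User...")
    parser.read_string(text)
    rows = []
    for key, value in parser[section].items():
        if not key.startswith("User"):
            continue  # skip anything that is not a user entry
        user_id = int(key[len("User"):])
        rows.append((user_id, [float(v) for v in value.split()]))
    return rows

rows = parse_scores(SAMPLE)
```

Each resulting row still needs the time span it belongs to, which in the current layout comes from the file name and folder rather than from the file contents.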
PS: This is a general question, not directly related to game development, so please don’t toss it over to the game development SO section.
To be honest, I am really struggling to understand what you’re asking here – but if I understand correctly, there’s some other system in which the actual game outcomes are recorded, and you want to turn that game data into a leaderboard-style data structure, and you want your queries to run really, really fast.
So, first off – you seem to worry about the size of your data. Unless you’re dealing with absolutely astronomical scale (Google, Facebook, Twitter), you probably don’t need to. Disk space is cheap, and a well-indexed database is just as quick with millions of rows as it is with dozens.
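As a rough illustration of what “well-indexed” means here, the sketch below creates a day-level table with an index on the date column and checks that a date-range query uses it. It runs against SQLite through Python's `sqlite3` module purely so that it is self-contained; the table and column names (`daily_scores`, `day`, `points`) are assumptions, and MySQL offers its own `CREATE INDEX` and `EXPLAIN` for the same purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE daily_scores (
    user_id INTEGER NOT NULL,
    day     TEXT    NOT NULL,  -- ISO date such as '2011-06-13'
    points  INTEGER NOT NULL,
    PRIMARY KEY (user_id, day)
);
-- Leaderboard queries filter by date range, so index the date column.
CREATE INDEX idx_daily_scores_day ON daily_scores (day);
""")

# Ask the database how it would run a typical "one week" query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT user_id, points FROM daily_scores WHERE day BETWEEN ? AND ?",
    ("2011-06-13", "2011-06-19"),
).fetchall()
plan_text = " ".join(row[-1] for row in plan)  # last column is the description
uses_index = "idx_daily_scores_day" in plan_text
```

If `uses_index` is true, the range lookup goes through the index instead of scanning every row, which is what keeps such queries quick as the table grows.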
Broadly speaking, you need to decide whether you’re going to trade space (pre-computing the results) for speed, and when you’re going to do the calculations.
In general, working with “raw” data and computing the results at run time is the easiest to understand and maintain, and has the lowest risk of bugs – but is also likely to be the slowest. Nevertheless, that’s where I’d start. You don’t give an indication of how your “raw” data is stored, or how frequently it updates, but I’d start by writing a query that produces your intended data. If it’s tricky, I’d factor some of the intermediate steps into views to simplify the query.
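To make that concrete, here is a hedged sketch of such a query over day-level rows, covering both sort directions from the question (highest total for points, lowest value for speed). It uses SQLite via Python's `sqlite3` so it can run standalone; the schema, the column names `points` and `best_time`, and the sample rows are all made up for illustration, and the SQL would need only minor adjustments for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE daily_scores (
    user_id   INTEGER NOT NULL,
    day       TEXT    NOT NULL,  -- ISO date such as '2011-06-13'
    points    INTEGER NOT NULL,  -- higher is better
    best_time REAL    NOT NULL,  -- lower is better
    PRIMARY KEY (user_id, day)
);
""")
conn.executemany("INSERT INTO daily_scores VALUES (?, ?, ?, ?)", [
    (1, "2011-06-13", 100, 10.0),
    (1, "2011-06-15", 50, 8.5),
    (2, "2011-06-14", 120, 9.0),
    (3, "2011-06-16", 200, 12.0),
    (4, "2011-06-17", 10, 7.25),
    (2, "2011-06-20", 999, 1.0),   # falls outside the week queried below
])

def top_points(conn, start, end, limit=3):
    """Users with the highest summed points between two dates (inclusive)."""
    return conn.execute("""
        SELECT user_id, SUM(points) AS total
        FROM daily_scores
        WHERE day BETWEEN ? AND ?
        GROUP BY user_id
        ORDER BY total DESC, user_id
        LIMIT ?
    """, (start, end, limit)).fetchall()

def top_speed(conn, start, end, limit=3):
    """Users with the lowest best_time between two dates (inclusive)."""
    return conn.execute("""
        SELECT user_id, MIN(best_time) AS fastest
        FROM daily_scores
        WHERE day BETWEEN ? AND ?
        GROUP BY user_id
        ORDER BY fastest ASC, user_id
        LIMIT ?
    """, (start, end, limit)).fetchall()

week_points = top_points(conn, "2011-06-13", "2011-06-19")
week_speed = top_speed(conn, "2011-06-13", "2011-06-19")
```

The filtering, aggregation and sorting listed as separate steps in the question collapse into one statement per score column, with no intermediate table.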
Then, I’d measure performance, using a load testing tool (JMeter or similar is perfect).
If – but only if – it’s really too slow, I’d start by gradually turning the views into pre-computed tables, by introducing regular batch jobs to populate those tables. How often the jobs need to run depends entirely on your data and how “stale” it’s allowed to be.
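A batch job of that kind might look roughly like the following. This is a sketch under assumptions: the `weekly_scores` summary table, the `rebuild_week` function and the sample rows are hypothetical, and SQLite via Python's `sqlite3` stands in for MySQL so the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE daily_scores (
    user_id INTEGER NOT NULL, day TEXT NOT NULL,
    points INTEGER NOT NULL, best_time REAL NOT NULL,
    PRIMARY KEY (user_id, day)
);
CREATE TABLE weekly_scores (
    week_start TEXT NOT NULL, user_id INTEGER NOT NULL,
    total_points INTEGER NOT NULL, best_time REAL NOT NULL,
    PRIMARY KEY (week_start, user_id)
);
""")
conn.executemany("INSERT INTO daily_scores VALUES (?, ?, ?, ?)", [
    (1, "2011-06-13", 100, 10.0),
    (1, "2011-06-15", 50, 8.5),
    (2, "2011-06-14", 120, 9.0),
    (3, "2011-06-16", 200, 12.0),
])
conn.commit()

def rebuild_week(conn, week_start, week_end):
    """Recompute one week's summary rows from the raw day rows."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM weekly_scores WHERE week_start = ?",
                     (week_start,))
        conn.execute("""
            INSERT INTO weekly_scores
                (week_start, user_id, total_points, best_time)
            SELECT ?, user_id, SUM(points), MIN(best_time)
            FROM daily_scores
            WHERE day BETWEEN ? AND ?
            GROUP BY user_id
        """, (week_start, week_start, week_end))

rebuild_week(conn, "2011-06-13", "2011-06-19")
rebuild_week(conn, "2011-06-13", "2011-06-19")  # re-running is harmless

# The leaderboard read is now a plain sorted lookup on the summary table.
top3_speed = conn.execute("""
    SELECT user_id, best_time FROM weekly_scores
    WHERE week_start = ? ORDER BY best_time ASC, user_id LIMIT 3
""", ("2011-06-13",)).fetchall()
summary_rows = conn.execute("SELECT COUNT(*) FROM weekly_scores").fetchone()[0]
```

Because the job deletes and rebuilds a week's rows inside one transaction, it can be scheduled as often as the acceptable staleness requires without leaving duplicates behind.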
You can usually get very good performance gains out of this approach – and the solution remains fairly simple and resistant to bugs, as long as the batch jobs run…
Only when that approach runs into performance bottlenecks would I consider pre-computing the whole data set. In that case, you may as well create a table that is exactly like the one you want to output to the screen.