I have come across a few questions on the same lines but none exactly the same.
Which one is better, performing some basic calculations in PHP after getting results from MySql (by joining a few tables) or physically having another column in MySql which stores the total while inserting a new row and then retrieving it.
E.g: A product sold:
Item Price Quantity Discount
Item 1 55 100 10%
The above is a sales table, the price column is joint from the items table. Based on the above question either we can use PHP to fetch results, perform Price X Quantity X 0.10 or we our table can look like the following:
Item Price Quantity Discount Amount
Item 1 55 100 10% 4950
Now which is the better way of doing such a simple task?
Similar Question:
Doing Calculations in MySql vs PHP
There is no blanket rule for all situations. Many factors affect the performance and efficiency of websites. So there’s no single ‘Best’.
If you look at something like Magento, it does it both ways. On the one hand it has a full EAV structure with every piece of data abstracted out and normalised to the nth degree. On the other hand, it also aggregates pre-calculated values in flat tables for performance reasons. This includes discount amounts, base prices, tax quantities (in base and chosen currency), etc. The former situation is best in terms of flexibility and robustness, the flat table is better in terms of performance.
A flat table obviously makes it faster when dealing with bulk calculations, as everything has already been worked out. But it does, as kernelpanic pointed out, mean that any changes to settings may require a bulk recalculation of every value. In the case of historical data such as order history, you probably won’t want to recalculate the actual amounts people ended up paying, but the possibility of having to do so does need to be taken into consideration when determining the best solution.
If performance is paramount and the calculations are expensive to run, then knowing that you may have to refresh the values in bulk from time-to-time allows you to make an informed decision to cache it or not.
But if it’s not a performance critical aspect, or the calculations are expensive but not run often, it’s cleaner to leave them out of the database as they really belong in the business logic processing part of an application i.e. the code.
Again there is more than one way of defining “best”, so it depends on the circumstances. It is really just a matter of balancing requirements – speed, cleanliness, memory usage, processor requirements, disk space usage, the need to fit into some arbitrary data structure defined by development managers – your decision will need to account for these factors.
Without a real-world problem to address, speculation is really all that can be given. If you do have a more complex situation, I’d be happy to take a look and offer my thoughts.
edit: From my own observations, a Magento catalog page with flat data and over 200k products loads in about 10 – 20 seconds with no page caching enabled. When flat data was disabled and the EAV structure was used, it would take minutes. I’m not at work right now so I don’t have my profiling data handy, but it’s a testament to the fact that in real world applications there is no single best solution.