I know that MAPE and WMAPE have some benefits as forecast error metrics. But what are their weaknesses? Someone says:
For MAPE: "Combinations with very small or zero volumes can cause large skew in results." And for WMAPE: "Combinations with large weights can skew the results in their favor."
I don't understand these two statements. Can anyone explain these weaknesses of the two metrics? Thanks.
For MAPE, the mean absolute percentage error [1], denote the actual value by A and the predicted value by P. If you have a series of data at times 1 through n, then

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{A(t) - P(t)}{A(t)} \right|$$
Since A(t) is in the denominator, a very small A(t) makes the corresponding absolute percentage error |A(t) - P(t)| / |A(t)| very large, and A(t) = 0 makes it undefined. A few such terms are enough to dominate the mean, which is how small or zero volumes skew the result.
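To make this concrete, here is a minimal sketch in plain Python. The function name `mape` and all the numbers are made up for illustration; every forecast is off by about 5 units in both series, and the only difference is that one actual value is close to zero.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Raises ZeroDivisionError if any actual value is exactly zero.
    """
    terms = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(terms) / len(terms)


# Hypothetical data: each forecast misses by about 5 units.
actual_normal = [100.0, 120.0, 80.0, 100.0]
predicted_normal = [105.0, 115.0, 85.0, 95.0]

# Same size of miss, but the last actual value is near zero.
actual_small = [100.0, 120.0, 80.0, 0.1]
predicted_small = [105.0, 115.0, 85.0, 5.1]

mape_normal = mape(actual_normal, predicted_normal)
mape_small = mape(actual_small, predicted_small)

print(f"MAPE, normal volumes:   {mape_normal:.2f}%")
print(f"MAPE, one tiny volume:  {mape_small:.2f}%")
```

The first value comes out at roughly 5%, while the second is above 1000%, even though the absolute errors are the same size. The single near-zero A(t) accounts for almost all of the second result.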
For WMAPE, the weighted mean absolute percentage error, each prediction's error is given a weight.
Since this is a weighted measure, it does not have the same problem as MAPE, i.e., excessive skew due to very small or zero volumes.
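However, the weighting factor reflects the subjective importance we wish to place on each prediction [2].
For example, if more recent data are given larger weights, the errors on those observations dominate the metric. This is how large weights skew the results in their favor.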
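As a rough illustration of the weighting effect, the sketch below assumes one common general form, WMAPE = sum of w(t)|A(t) - P(t)| divided by sum of w(t)|A(t)|. That form, the function name `wmape`, and the numbers are my assumptions, not something stated in the quoted sources; other definitions of WMAPE exist.

```python
def wmape(actual, predicted, weights):
    """Weighted MAPE in percent, under the assumed form
    sum(w * |A - P|) / sum(w * |A|)."""
    num = sum(w * abs(a - p) for a, p, w in zip(actual, predicted, weights))
    den = sum(w * abs(a) for a, w in zip(actual, weights))
    return 100.0 * num / den


# Hypothetical data: only the most recent forecast is wrong.
actual = [100.0, 100.0, 100.0, 100.0]
predicted = [100.0, 100.0, 100.0, 150.0]

equal_weights = [1.0, 1.0, 1.0, 1.0]
recent_weights = [1.0, 1.0, 1.0, 10.0]  # favors the latest observation

wmape_equal = wmape(actual, predicted, equal_weights)
wmape_recent = wmape(actual, predicted, recent_weights)

# A near-zero actual value no longer blows up, because the
# denominator is a sum over all observations.
wmape_small = wmape([100.0, 120.0, 80.0, 0.1],
                    [105.0, 115.0, 85.0, 5.1],
                    equal_weights)

print(f"WMAPE, equal weights:        {wmape_equal:.2f}%")
print(f"WMAPE, recent-heavy weights: {wmape_recent:.2f}%")
print(f"WMAPE, one tiny volume:      {wmape_small:.2f}%")
```

With equal weights the single bad forecast gives 12.5%; giving that observation ten times the weight pushes the same forecasts to about 38%. The forecasts did not change, only the weights, so the heavily weighted observation largely decides the result.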