I need to calculate the median value in MySQL. I saw the solution here.
But I didn’t understand part of it. The solution provided is as follows:
SELECT x.val from data x, data y
GROUP BY x.val
HAVING SUM(SIGN(1-SIGN(y.val-x.val))) = (COUNT(*)+1)/2
What are data x and data y in the context of the original question? Usually FROM is followed by a table name, so why are two tables listed when the question refers to only one? Can someone explain how this solution works? Also, I didn’t understand this part: HAVING SUM(SIGN(1-SIGN(y.val-x.val))).
In the original question, data x, data y joins the table to itself, creating a Cartesian product; x and y are simply two aliases for the same table, data. The original table had 7 rows, and by joining every row against every row (including itself), the resulting product is 49 rows.
Essentially, this query determines for every value how many values are less than or equal to the one being examined. It then compares this total to (COUNT(*)+1)/2, that is, half of the count plus one, and selects the value that matches as the median.
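To make the self-join concrete, here is a minimal Python sketch of the same idea. The seven values are made up for illustration (the original table's contents are not shown here), and the names vals, pairs, and counts are hypothetical:

```python
from itertools import product

# Hypothetical sample data: 7 distinct values standing in for the val column.
vals = [5, 1, 9, 3, 13, 7, 11]

# FROM data x, data y: pair every row (x) with every row (y), itself included.
pairs = list(product(vals, vals))
print(len(pairs))  # 49

# For each x, count how many y values are less than or equal to it.
counts = {x: sum(1 for y in vals if y <= x) for x in vals}
for x in sorted(counts):
    print(x, counts[x])
```

With these sample values, 7 is the only value whose count is 4, which is (7 + 1) / 2.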
It does this by subtracting the value being examined (x.val) from the value it is being compared with (y.val). It then uses the SIGN function to convert the result to -1, 0, or 1. It then subtracts this value from 1 and takes the SIGN again. So if the y.val value is less than or equal to the x.val value that it is being compared to, the end result is 1. For example, let’s say y is 3 and x is 5: SIGN(3-5) is -1, 1-(-1) is 2, and SIGN(2) is 1.
If y were 5 and x were 3, the end result would be 0: SIGN(5-3) is 1, 1-1 is 0, and SIGN(0) is 0.
Summing the results of these comparisons gives us a number that indicates how many values are at or before the value that we’re examining. It then compares this SUM against (COUNT(*) + 1) / 2 to find the middle of the range.
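The whole query can be mimicked in a few lines of Python. This is only a sketch of the logic, not MySQL itself: sign, flag, and median_via_self_join are hypothetical helper names, and the sample values are made up.

```python
from collections import defaultdict
from itertools import product


def sign(n):
    """Mimic SQL SIGN(): returns -1, 0, or 1."""
    return (n > 0) - (n < 0)


def flag(x, y):
    """SIGN(1 - SIGN(y - x)): 1 when y <= x, otherwise 0."""
    return sign(1 - sign(y - x))


def median_via_self_join(vals):
    """Emulate GROUP BY x.val with the HAVING filter from the query."""
    groups = defaultdict(list)
    for x, y in product(vals, vals):  # FROM data x, data y
        groups[x].append(flag(x, y))  # GROUP BY x.val
    # HAVING SUM(...) = (COUNT(*) + 1) / 2
    return [x for x, flags in groups.items()
            if sum(flags) == (len(flags) + 1) / 2]


print(flag(5, 3))  # y = 3, x = 5 -> 1
print(flag(3, 5))  # y = 5, x = 3 -> 0
print(median_via_self_join([5, 1, 9, 3, 13, 7, 11]))  # [7]
```

In this sketch an even number of values returns an empty list, because (n + 1) / 2 is then not a whole number and no sum can equal it. The SQL appears to behave the same way, since the / operator in MySQL does not truncate to an integer.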