I have some data organized by quintile labels (-1, 1, 2, 3, 4, 5). For each of these values in a Quintile column, there is a value in another column called ret. Lastly, there is a column of dates containing month-end dates as integers.
My goal is to visualize all of the Quintile returns data at the same time, each as its own column, with only the date column acting like an index.
Essentially, I want to pivot on the Quintile column and I have seen other places advising the use of IF statements in MySQL as a way to achieve this.
For example, here is a query that would show one Quintile’s worth of the data:
select yearmonth, ret
from quintile_returns
where Quintile=1
But I don’t want to repeat this for all Quintile labels, save out the data individually, and piece it together in Python Pandas or Excel or something. I want to make SQL show it as distinct columns.
Here is the query I use when I try this IF-statement-style poor man’s pivot:
select yearmonth,
IF(Quintile=1, ret, NULL) as Q1_ret,
IF(Quintile=2, ret, NULL) as Q2_ret
from quintile_returns
I basically get a diagonal of valid data back. All the rows where the Quintile is not 1 still show up, populated with NULL, and then so on for Quintile 2.
How do I avoid all of these extra NULL values? Basically, I want to tell SQL to return the column’s value only if the condition is satisfied, and do not use NULL or anything else as a default else-like placeholder.
Is there a way to do this that does not involve nested join-type statements?
You can use GROUP BY to show only one row for each yearmonth value, and then wrap each of your IF() statements in SUM() so that the ret values are only summed when that column’s IF() condition evaluates to TRUE, for example SUM(IF(Quintile=1, ret, NULL)) as Q1_ret. Otherwise, you had the right idea with the IF() statements.
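As a rough sketch of the approach described above, the full query might look something like the following. It assumes the table and column names from the question (quintile_returns, yearmonth, Quintile, ret) and that each yearmonth has at most one row per Quintile label. The aliases other than Q1_ret and Q2_ret are made up for illustration.

```sql
-- Sketch: one row per yearmonth, one column per Quintile label.
-- Assumes each (yearmonth, Quintile) pair appears at most once, so SUM()
-- just picks up the single non-NULL ret value for that pair.
SELECT yearmonth,
       SUM(IF(Quintile = -1, ret, NULL)) AS Qneg1_ret,  -- alias name is an assumption
       SUM(IF(Quintile = 1,  ret, NULL)) AS Q1_ret,
       SUM(IF(Quintile = 2,  ret, NULL)) AS Q2_ret,
       SUM(IF(Quintile = 3,  ret, NULL)) AS Q3_ret,
       SUM(IF(Quintile = 4,  ret, NULL)) AS Q4_ret,
       SUM(IF(Quintile = 5,  ret, NULL)) AS Q5_ret
FROM quintile_returns
GROUP BY yearmonth
ORDER BY yearmonth;
```

SUM() skips NULL values, so each column collapses to the one matching ret for that month. If a Quintile label has no row in a given month, that column comes back as NULL for that month. Under the same one-row-per-pair assumption, MAX() in place of SUM() should give the same result.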