I have data on oil shipments and I need to aggregate them into trade flows (origin-destination by country and area). For example, sum up all the shipments going from Saudi Arabia, Arabian Gulf to USA, US Gulf.
I can do this using PROC TABULATE, but I want to create a variable. My variables are: LoadCountry, LoadArea, DischargeCountry, DischargeArea. Also, a LoadCountry or DischargeCountry can be listed multiple times if the Area is different, so a distinct flow is defined by all four variables.
I should be able to use PROC SQL but I can’t figure out how to GROUP BY several variables to create the aggregate sum:
proc sql;
title 'LoadCountry-LoadArea-DischargeCountry-DischargeArea Trade flows';
create table data.TradeFlow as
select LoadCountry, LoadArea, DischargeCountry, DischargeArea,
sum(CargoSize) as TotalCargo
from data.allvars1
Group by LoadCountry
Order by LoadCountry, DischargeCountry;
quit;
Any help is most appreciated.
If I understand you correctly, you’re almost there… Just add the other three variables to your GROUP BY clause:
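Something like the sketch below, reusing the table and column names from your post (I have not run it against your data, so treat it as a starting point). With only LoadCountry in the GROUP BY, PROC SQL remerges the per-country total back onto every input row, which is why you don't get one row per flow. Listing all four columns gives one row, and one TotalCargo, per distinct flow.

```sas
proc sql;
   title 'LoadCountry-LoadArea-DischargeCountry-DischargeArea Trade flows';
   create table data.TradeFlow as
   select LoadCountry, LoadArea, DischargeCountry, DischargeArea,
          sum(CargoSize) as TotalCargo
   from data.allvars1
   /* group by all four columns so each distinct flow is summed separately */
   group by LoadCountry, LoadArea, DischargeCountry, DischargeArea
   /* same sort order as in the original query */
   order by LoadCountry, DischargeCountry;
quit;
```

If you want the areas sorted within each country pair as well, add LoadArea and DischargeArea to the ORDER BY.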