I can’t wrap my head around the following Stata programming problem:
I have a table listing all car purchases by customer and make:
Customer | Make | Price
-----------------------
c1 | m1 | 1
c1 | m1 | 2
c1 | m3 | 1
c2 | m2 | 2
c3 | . | .
I want to transform this into a table with one observation/row per customer, listing the maximum price paid for every make:
Customer | m1 | m2 | m3
-----------------------
c1 | 2 | 0 | 1
c2 | 0 | 2 | 0
c3 | 0 | 0 | 0
How do I achieve this? I know reshape wide, but that doesn’t work because of the duplicated c1 | m1 row. Also, the missing values for c3 are causing trouble.
Depending on what you want to do, I suggest approaching this a little differently. For example, with -bysort- you can find the maximum price paid by each customer for each make.
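A minimal sketch of the -bysort- approach, assuming the variables are named Customer, Make and Price as in the question, and that Customer and Make are strings. The variable name maxPrice is made up for illustration. The data stay in long form; every row simply gains the maximum for its customer/make group.

```stata
* Recreate the sample data from the question (c3 has no purchase).
clear
input str2 Customer str2 Make Price
"c1" "m1" 1
"c1" "m1" 2
"c1" "m3" 1
"c2" "m2" 2
"c3" ""   .
end

* maxPrice is a hypothetical name; egen's max() ignores missing values,
* so a group whose prices are all missing gets a missing maxPrice.
bysort Customer Make: egen maxPrice = max(Price)
list, sepby(Customer)
```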
Or, you can use collapse to find the max price by customer and make:
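A sketch of the -collapse- version, under the same assumptions about variable names and types. Unlike -bysort- with -egen-, this replaces the data in memory with one row per customer/make combination, which removes the duplicated c1 | m1 row.

```stata
* Recreate the sample data from the question.
clear
input str2 Customer str2 Make Price
"c1" "m1" 1
"c1" "m1" 2
"c1" "m3" 1
"c2" "m2" 2
"c3" ""   .
end

* Keep only the maximum Price within each Customer/Make group.
collapse (max) Price, by(Customer Make)
list, sepby(Customer)
```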
But if you really want the table you posted, you could use -reshape- as follows:
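The original code is not reproduced here, so this is a reconstruction under assumptions: string variables Customer and Make, numeric Price, and the observations with missing Price dropped first. After the reshape the make columns are named Pricem1, Pricem2 and Pricem3 rather than m1, m2 and m3.

```stata
* Recreate the sample data from the question.
clear
input str2 Customer str2 Make Price
"c1" "m1" 1
"c1" "m1" 2
"c1" "m3" 1
"c2" "m2" 2
"c3" ""   .
end

* Drop the observations without a purchase (this removes c3 entirely).
drop if missing(Price)

* One row per Customer/Make, holding the maximum price.
collapse (max) Price, by(Customer Make)

* Make is a string, so the string option is needed for j().
reshape wide Price, i(Customer) j(Make) string
list
```

Makes a customer never bought show up as missing values in the wide table, not as zeros.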
Note that reshape will fail if it encounters the observation with missing Make and Price (c3). I dropped these observations in the code above, but you may choose to do something different, such as replacing the missing data with zeros as in your posted target table.
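One way to keep c3 and get zeros, sketched under two loudly stated assumptions: prices are never negative, so 0 can stand for "no purchase", and it is acceptable to park a no-purchase customer under an existing make (m1 is picked arbitrarily here) with a price of 0 so that the row survives the reshape.

```stata
* Recreate the sample data from the question.
clear
input str2 Customer str2 Make Price
"c1" "m1" 1
"c1" "m1" 2
"c1" "m3" 1
"c2" "m2" 2
"c3" ""   .
end

* Assumption: 0 is a safe stand-in for "no purchase" (no negative prices).
replace Price = 0 if missing(Price)
* Assumption: "m1" is an arbitrary existing make used as a placeholder.
replace Make = "m1" if missing(Make)

collapse (max) Price, by(Customer Make)
reshape wide Price, i(Customer) j(Make) string

* Fill the makes a customer never bought with zeros.
foreach v of varlist Pricem* {
    replace `v' = 0 if missing(`v')
}
list
```

Because the placeholder price is 0 and the maximum is taken per group, it cannot override a real purchase of that make.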