I need to calculate the Gini coefficient from disposable personal income data at LIS. According to a LIS training document, the Stata code to do this is:
di "** INCOME DISTRIBUTION II – Exercise 13 **"
program define bottop
    qui sum ey [w=hweight*d4]
    replace ey = .01*r(mean) if ey<.01*r(mean)
    qui sum dpi [w=hweight*d4], de
    replace ey = (10*r(p50)/(d4^.5)) if dpi>10*r(p50)
end
foreach file in $us00h $fi00h {
    display "`file'"
    use hweight d4 dpi if (!mi(dpi) & !(dpi==0)) using "`file'", clear
    gen ey=dpi/(d4^0.5)
    bottop
    ineqdeco ey [w=hweight*d4]
}
I have simply copied and pasted this code from the training document. The snippets
qui sum ey [w=hweight*d4]
replace ey=0.01*r(mean) if ey<0.01*r(mean)
and
qui sum dpi [w=hweight*d4], de
replace ey=(10*r(p50)/(d4^0.5)) if dpi>10*r(p50)
are bottom and top coding, respectively.
When I tried to run this code, the variable hweight was not found. Does anyone know what the new name of hweight is at LIS? Or can anyone suggest how I might otherwise overcome this impasse?
I’m familiar with Stata, but the sophistication of this code is beyond my ken.
Much appreciated.
This is more of a second-best solution, but the census of population reports income by brackets. If you are willing to work with that, get the count for every bracket, with a top-coded bracket as the last one. Use the median (midpoint) income of each bracket as its representative value. Then you can apply the formula for the Gini coefficient directly. It is second best because it only approximates the individual-level data: it ignores the variation in income within each bracket, so it will typically understate the Gini somewhat.
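To make the bracket approach concrete, here is a minimal Stata sketch. The bracket incomes and counts are made-up numbers for illustration only, and the variable names (income, count, popshare, incshare and so on) are my own, not from LIS or any census file. It builds the Lorenz curve from the cumulative population and income shares, then takes the Gini as 1 minus twice the area under that curve, using trapezoids between brackets.

```stata
* Hypothetical bracket data: representative income and number of households.
* The value for the open-ended top bracket (250000 here) is an assumption.
clear
input double(income count)
 10000 200
 30000 350
 60000 300
120000 120
250000  30
end

sort income                                        // poorest bracket first
generate double bracketinc = income * count        // total income held by each bracket
generate double cumpop     = sum(count)            // running household count
generate double cuminc     = sum(bracketinc)       // running income
generate double popshare   = cumpop / cumpop[_N]   // cumulative population share
generate double incshare   = cuminc / cuminc[_N]   // cumulative income share (Lorenz curve)

* Width times summed heights of each trapezoid (this is twice its area);
* the first bracket starts from the origin (0, 0)
generate double prevpop = cond(_n == 1, 0, popshare[_n-1])
generate double previnc = cond(_n == 1, 0, incshare[_n-1])
generate double area    = (popshare - prevpop) * (incshare + previnc)

* Gini = 1 - sum of (width * summed heights) over all brackets
quietly summarize area
display "Approximate Gini = " %6.4f (1 - r(sum))
```

Everyone in a bracket is treated as having the same income here, so the result is sensitive to the value you pick for the top bracket. It is worth trying a couple of plausible values to see how much the answer moves.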