I’m having a problem with R code, rather, with missing values. Don’t know actually, how to impute those values using simple Hot Deck method. Like, example, having these data.
1 10000123 111 112820 0.24457235 NA NA NA NA 11
2 10000132 111 2502357 0.19408587 0.19373610 0.6567305 0.01454520 0.13498823 69
3 10000388 111 4472360 0.14774927 0.14918678 0.6853377 0.05233508 0.11314044 106
4 10000792 111 666909 0.10520063 NA NA NA NA 14
5 10002737 111 1139613 0.19944986 0.20114918 0.3564355 0.20135391 0.24106136 23
6 10002741 111 981574 0.11573570 NA NA NA NA 13
7 10002929 111 1417192 0.08770932 0.08387991 0.6106012 0.11078473 0.19473415 24
8 10003396 111 444966 0.19026263 0.18784110 0.5215772 0.16844381 0.12213789 24
9 10003517 111 1230589 0.16393216 0.16358568 0.4614005 0.26670712 0.10830670 19
10 10003546 111 760847 0.12384748 NA NA NA NA 10
Using 5th column, need to find the nearest value, and then fill with that similar respondent in those places, where are NA values.
Thank You.
I’ve never used hot (or cold for that matter) deck sampling. However a little Googling led me to the
rrp.imputefunction in the rrp package.Here’s a simple example using some synthetic data:
This uses the default value of k=1, which is what I think you want. The pretty picture at the end looks like this:
The red circles are the imputed values and you can see they are simply the nearest neighbor.