The Archive Base Latest Questions

Editorial Team
Asked: June 11, 2026

I have a data set “base_data” which has missing values. I have therefore used the package ‘Amelia’ to impute the missing values into an object “a.output”.

I have been able to find the mean for some variables within the imputed results using the following code:

q.out <- NULL
se.out <- NULL
for (i in 1:m) {
  dclus <- svydesign(id = ~site, data = a.output$base_data[[i]])

  q.out <- rbind(q.out, coef(svymean(~hh_expenditure, dclus)))
  se.out <- rbind(se.out, SE(svymean(~hh_expenditure, dclus)))
}

I have combined the results using:

svymean.combine <- mi.meld(q = q.out, se = se.out)

This gives me the mean and standard error of household expenditure (hh_expenditure) across the population.

However, I have a variable that splits the population into wealth quintiles (wealth_quin).

I now want to find the mean, and its standard error, of household expenditure for each value of wealth_quin (a variable which is either 1, 2, 3, 4, or 5).

I initially tried subsetting the imputed data, but this produced many errors.

Is there a way to do this without having to split up the data into the 5 wealth quintiles before imputing the data?

Cheers,

Timothy

EDIT: HERE IS A WORKABLE EXAMPLE

require(Amelia)
require(survey)
a<-as.data.frame(c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16))
b<-as.data.frame(c(1,2,2,1,2,1,1,2,1,2,2,1,1,2,1,2))
c<-as.data.frame(c(2,7,8,5,4,4,3,8,7,9,10,1,3,3,2,8))
d<-as.data.frame(c(3,9,7,4,5,5,2,10,8,10,12,2,4,4,3,7))
e<-as.data.frame(c(2500,8000,NA,4500,4500,NA,2500,NA,7400,9648,1112,1532,3487,3544,NA,7000))

impute<-cbind(a,b,c,d,e)
names(impute) <- c("X","site","var2","var3", "hh_inc") 

So now we have a data frame to work with, with missing values for hh_inc, which I want to impute.
First step: set the number of imputations.

m<-5

Now run the imputation:

a.output <- amelia(x = impute, m=m, autopri=0.5,cs="X",
               idvars=c("site","var2"),
               logs=c("hh_inc","var3"))

a.output now holds the data from the 5 imputations.

What I now want to do is find the mean (and standard error) of hh_inc for site 1 and site 2 separately, using the imputed values from Amelia.

How can I do that? I know it is possible if I just ignore the NAs, but this might introduce bias, which is why I imputed the values in the first place.

Cheers,

Timothy

EDIT:
I have placed a bounty on this. If no one knows the exact way to do it, then the results from the individual imputed data sets can be combined using Rubin's formula (http://sites.stat.psu.edu/~jls/mifaq.html#minf).
As such, I will award the bounty to someone who can transform the 5 separate imputed data sets from the Amelia object into 5 separate, complete data frames.
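
For reference, Rubin's formula is short enough to write out by hand. The sketch below is plain R and assumes a data frame with one row per imputed data set and site, holding a mean and its standard error. The helper name pool_rubin and the toy numbers are made up for illustration; they are not results from the data above.

```r
# Pool per-imputation estimates with Rubin's rules, separately for each group.
# `stats` is assumed to have one row per (imputed data set, group) with the
# columns "mean" and "stdError" plus the grouping column named in `group`.
pool_rubin <- function(stats, group = "site") {
  pooled <- lapply(split(stats, stats[[group]]), function(g) {
    m       <- nrow(g)                  # number of imputations for this group
    q_bar   <- mean(g$mean)             # pooled point estimate
    within  <- mean(g$stdError^2)       # average within-imputation variance
    between <- var(g$mean)              # between-imputation variance
    total   <- within + (1 + 1 / m) * between
    data.frame(group = g[[group]][1], mean = q_bar, stdError = sqrt(total))
  })
  out <- do.call(rbind, pooled)
  names(out)[1] <- group
  rownames(out) <- NULL
  out
}

# Made-up numbers purely to show the call (2 imputations, 2 sites).
toy <- data.frame(dataset  = rep(1:2, each = 2),
                  site     = rep(1:2, times = 2),
                  stdError = c(1, 2, 1, 2),
                  mean     = c(10, 20, 12, 20))
pooled_toy <- pool_rubin(toy)
print(pooled_toy)
```

For site 1 in the toy data the pooled mean is 11 and the total variance is 1 + (1 + 1/2) * 2 = 4, so the pooled standard error is 2. Site 2 has no between-imputation variance, so its pooled standard error is just the within-imputation value of 2.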

1 Answer

  1. Editorial Team
    Added an answer on June 11, 2026 at 5:37 pm
    require(Amelia)
    require(survey)
    require(data.table)
    require(plotrix)
    
    a<-as.data.frame(c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16))
    b<-as.data.frame(c(1,2,2,1,2,1,1,2,1,2,2,1,1,2,1,2))
    c<-as.data.frame(c(2,7,8,5,4,4,3,8,7,9,10,1,3,3,2,8))
    d<-as.data.frame(c(3,9,7,4,5,5,2,10,8,10,12,2,4,4,3,7))
    e<-as.data.frame(c(2500,8000,NA,4500,4500,NA,2500,NA,7400,9648,1112,1532,3487,3544,NA,7000))
    
    impute<-cbind(a,b,c,d,e)
    names(impute) <- c("X","site","var2","var3", "hh_inc") 
    
    summary(impute)
    
    
    m <- 5
    a.output <- amelia(x = impute, m=m, autopri=0.5,cs="X",
                   idvars=c("site","var2"),
                   logs=c("hh_inc","var3"))
    
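    # Collect the per-site mean and standard error from each imputed data set
    stats.out <- NULL
    for (i in 1:m) {
      # a.output$imputations is a list of m completed data frames
      df2 <- data.table(a.output$imputations[[i]])
      df3 <- data.frame(dataset = i,
                        df2[, list(std.error(hh_inc), mean(hh_inc)), by = "site"])
      stats.out <- rbind(stats.out, df3)
    }
    colnames(stats.out) <- c("dataset", "site", "stdError", "mean")
    stats.out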
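
The table above still has one row per imputed data set, and plotrix's std.error is a plain sd/sqrt(n) rather than a design-based standard error. If the survey package should do the per-group estimation, as in the question, one possible sketch is below. The helper name pooled_group_means and its defaults are hypothetical. It assumes a list of completed data frames such as a.output$imputations, and it defaults to ids = ~1 (rows treated as independent) because in the toy data site is the grouping variable rather than a cluster.

```r
library(survey)   # svydesign(), svyby(), svymean()
library(Amelia)   # mi.meld()

# Hypothetical helper: per-group survey means, pooled across imputations.
# `imps` is a list of completed data frames, e.g. a.output$imputations.
pooled_group_means <- function(imps, y = ~hh_inc, by = ~site, ids = ~1) {
  per_imp <- lapply(imps, function(d) {
    # No weights are supplied, so survey assumes equal probabilities
    # (and warns about it), as in the question's own svydesign() call.
    des <- svydesign(ids = ids, data = d)
    svyby(y, by, des, svymean)
  })
  q  <- do.call(rbind, lapply(per_imp, coef))  # one row per imputation
  se <- do.call(rbind, lapply(per_imp, SE))    # matching standard errors
  mi.meld(q = q, se = se)                      # list with q.mi and se.mi
}
```

With the objects above this would be called as pooled_group_means(a.output$imputations). For the original data the call would presumably look like pooled_group_means(a.output$imputations, y = ~hh_expenditure, by = ~wealth_quin, ids = ~site), which keeps site as the cluster and avoids subsetting the data before imputing.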
    