We are looking to train a model in R on records structured like:
age, data1, data2, … dataN, actions
where N depends on the amount of data we have about a person.
Our goal is to determine how likely it is that another person would generate actions, by querying with all the data we have on him/her:
age, data1, data2, … dataM
where M could be bigger or smaller than N.
With complete data sets we could have used binary logistic regression, but we need to work with partial sets.
What's the best way to calculate the likelihood that a person performs actions when we query with partial data sets?
The Hmisc package provides several multiple imputation functions, which let you make fuller use of the information that is present in your data.
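As a rough sketch of what that could look like, aregImpute from Hmisc draws several imputations for each missing value. Everything about the data here is an assumption made for illustration: the simulated data frame d, the use of only two data columns, the missingness pattern, and n.impute = 5 are not from the question.

```r
library(Hmisc)

set.seed(1)
n <- 200

# Simulated stand-in for the question's records (illustrative only)
d <- data.frame(
  age   = round(runif(n, 18, 80)),
  data1 = rnorm(n),
  data2 = rnorm(n)
)
d$actions <- rbinom(n, 1, plogis(-1 + 0.03 * (d$age - 50) + 0.8 * d$data1))

# Knock out some values to mimic people with partial data
d$data1[sample(n, 40)] <- NA
d$data2[sample(n, 60)] <- NA

# Draw 5 imputations per missing value, using all variables
# (including the outcome) as predictors of the missing ones
imp <- aregImpute(~ age + data1 + data2 + actions,
                  data = d, n.impute = 5, pr = FALSE)

# One matrix per incomplete variable:
# rows = observations with a missing value, columns = imputations
dim(imp$imputed$data1)
dim(imp$imputed$data2)
```

Each column of imp$imputed$data1 is one plausible set of fill-in values, so any model fitted afterwards should be fitted across all of the imputations rather than on a single completed copy.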
The accompanying package, rms, has a binary logistic regression function: