Guided Multiple Imputation
A detailed discussion of the guided multiple imputation technique used in AHS-2 research is available in:
Fraser, G.; Yan, R.
Guided multiple imputation of missing data: using a subsample to strengthen the missing-at-random assumption.
Epidemiology. 2007 Mar; 18:246-52
DOI: 10.1097/01.ede.0000254708.40228.8b (Free full text is available.)
Abstract
Multiple imputation can be a good solution to handling missing data if data are missing at random. However, this assumption is often difficult to verify. We describe an application of multiple imputation that makes this assumption plausible. This procedure requires contacting a random sample of subjects with incomplete data to fill in the missing information, and then adjusting the imputation model to incorporate the new data. Simulations with missing data that were decidedly not missing at random showed, as expected, that the method restored the original beta coefficients, whereas other methods of dealing with missing data failed. Using a dataset with real missing data, we found that different approaches to imputation produced moderately different results. Simulations suggest that filling in 10% of data that was initially missing is sufficient for imputation in many epidemiologic applications, and should produce approximately unbiased results, provided there is a high response on follow-up from the subsample of those with some originally missing data. This response can probably be achieved if this data collection is planned as an initial approach to dealing with the missing data, rather than at later stages, after further attempts that leave only data that is very difficult to complete.
Approach
The following article provides a good summary of the practical aspects of working with multiply imputed data:
Marshall A, Altman DG, Holder RL, Royston P. **Combining estimates of interest
in prognostic modelling studies after multiple imputation: current practice and
guidelines.** BMC Med Res Methodol. 2009 Jul 28;9:57. doi: 10.1186/1471-2288-9-57.
PubMed Central PMCID: PMC2727536.
Of particular use is the description of Rubin's Rules for combining multiply imputed estimates.
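As a rough illustration of Rubin's Rules, the sketch below pools one regression coefficient across several imputed datasets. The function name `pool_rubin`, the dictionary keys, and the input numbers are hypothetical and chosen only for illustration; they are not AHS-2 results or code. It assumes each imputed dataset has already been analyzed separately, yielding one point estimate and one squared standard error per dataset.

```python
import math


def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors (hypothetical helper)."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and one variance per estimate")

    # Pooled point estimate: the mean of the m per-dataset estimates.
    pooled = sum(estimates) / m
    # Within-imputation variance: the mean of the m squared standard errors.
    within = sum(variances) / m
    # Between-imputation variance: sample variance of the m estimates.
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)
    # Total variance adds a (1 + 1/m) correction for using finitely many imputations.
    total = within + (1 + 1 / m) * between

    # Rubin's (1987) degrees of freedom; infinite when all estimates coincide.
    if between > 0:
        df = (m - 1) * (1 + within / ((1 + 1 / m) * between)) ** 2
    else:
        df = math.inf

    return {
        "estimate": pooled,
        "within": within,
        "between": between,
        "total": total,
        "se": math.sqrt(total),
        "df": df,
    }


# Made-up coefficients and squared standard errors from m = 5 imputed datasets.
betas = [0.50, 0.55, 0.45, 0.52, 0.48]
squared_ses = [0.010, 0.010, 0.010, 0.010, 0.010]

result = pool_rubin(betas, squared_ses)
print(result["estimate"], result["se"], result["df"])
```

The pooled standard error is larger than any single-dataset standard error whenever the estimates differ across imputations, which is how the rules carry the uncertainty due to the missing data into the final inference.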