Review and evaluation of imputation methods for multivariate longitudinal data with mixed-type incomplete variables

Stat Med. 2022 Dec 30;41(30):5844-5876. doi: 10.1002/sim.9592. Epub 2022 Oct 11.

Abstract

Estimating relationships between multiple incomplete patient measurements requires methods to cope with missing values. Multiple imputation is one approach to address missing data by filling in plausible values for those that are missing. Multiple imputation procedures can be classified into two broad types: joint modeling (JM) and fully conditional specification (FCS). JM fits a multivariate distribution for the entire set of variables, but it may be complex to define and implement. FCS imputes missing data variable-by-variable from a set of conditional distributions. In many studies, FCS is easier to define and implement than JM, but it may be based on incompatible conditional models. Imputation methods based on multilevel modeling show improved operating characteristics when imputing longitudinal data, but they can be computationally intensive, especially when imputing multiple variables simultaneously. We review current MI methods for incomplete longitudinal data and their implementation on widely accessible software. Using simulated data from the National Health and Aging Trends Study, we compare their performance for monotone and intermittent missing data patterns. Our simulations demonstrate that in a longitudinal study with a limited number of repeated observations and time-varying variables, FCS-Standard is a computationally efficient imputation method that is accurate and precise for univariate single-level and multilevel regression models. When the analyses comprise multivariate multilevel models, FCS-LMM-latent is a statistically valid procedure with overall more accurate estimates, but it requires more intensive computations. Imputation methods based on generalized linear multilevel models can lead to biased subject-level variance estimates when the statistical analyses involve hierarchical models.

Keywords: chained equations; joint modeling; longitudinal analysis; multiple imputation.

Publication types

  • Review
  • Research Support, N.I.H., Extramural

MeSH terms

  • Biometry* / methods
  • Computer Simulation
  • Humans
  • Longitudinal Studies
  • Models, Statistical*
  • Research Design
  • Software