Method: A Monte Carlo computer simulation was used to evaluate the performance of two candidate maximum likelihood methods for estimating a mediation model. In the first approach, the Auxiliary Variable Model, the reliable measure serves as the mediator in a single-mediator model and the unreliable measure is used as an auxiliary variable to increase power and accuracy. In the second approach, the Latent Indicator Model, both the reliable and unreliable measures are included in the model as indicators of a Latent Mediator Variable. Three mediation effect sizes (small, medium, and large), four sample size conditions (Ns between 100 and 5000), two missing data rates (small and large), and two estimation models (Auxiliary Variable Model vs. Latent Indicator Model) were studied.
Results: The mediated effect is severely underestimated in the Auxiliary Variable Model, whereas the estimates from the Latent Indicator Model are much less biased. However, the Latent Indicator Model results may themselves be biased, as nearly one third of the Latent Indicator Model replications were dropped due to a non-positive definite covariance matrix. For small sample sizes and small mediated effects, both methods were severely underpowered to detect the mediated effect. For example, for small mediated effect sizes with N = 200 and a 50% missing data rate, power for both models ranged from 2.1% to 4.4%, depending on the exact parameter specification simulated. Fortunately, for medium to large effect sizes and large sample sizes, these models are sufficiently powered (power varied depending on the exact model specification).
Conclusions: More simulation conditions would allow for broader conclusions about the alternative methods. Although these models have limitations, the results are promising for the selection of planned missing data models that maximize power and accuracy. Researchers interested in designing a planned missing data study would be prudent to conduct a simulation study, based on the theorized parameters, to identify a data collection design that minimizes costs and increases the power and accuracy of estimates.
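
As a rough illustration of the kind of design-stage simulation recommended here, the following minimal Python sketch estimates power to detect a mediated effect when the reliable mediator is observed for only a random subset of cases. It is a deliberately simplified stand-in, not the procedure used in this study: it uses complete-case ordinary least squares with a joint-significance test rather than the maximum likelihood Auxiliary Variable or Latent Indicator models, and it ignores the unreliable measure entirely. The function names, the path values (0.14 and 0.59), the unit residual variances, and the normal-approximation critical value are all assumptions chosen for illustration.

```python
import numpy as np


def slope_z(design, outcome, idx):
    """Return the z statistic for coefficient `idx` in an OLS fit."""
    beta, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    resid = outcome - design @ beta
    dof = design.shape[0] - design.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return beta[idx] / np.sqrt(cov[idx, idx])


def simulate_power(n, a, b, missing_rate, reps=500, seed=0):
    """Monte Carlo power for the mediated effect a*b (joint-significance test).

    Assumed data-generating model (hypothetical, not from the study):
        M = a*X + e1,  Y = b*M + e2,  with standard normal X, e1, e2.
    The mediator is missing completely at random at `missing_rate`,
    and only complete cases are analyzed.
    """
    rng = np.random.default_rng(seed)
    crit = 1.96  # two-sided 5% critical value, normal approximation
    hits = 0
    for _ in range(reps):
        x = rng.normal(size=n)
        m = a * x + rng.normal(size=n)
        y = b * m + rng.normal(size=n)
        # Planned missingness: keep the mediator for a random subset only.
        observed = rng.random(n) >= missing_rate
        xo, mo, yo = x[observed], m[observed], y[observed]
        if xo.size < 10:
            continue  # too few complete cases; counted as a non-detection
        ones = np.ones(xo.size)
        z_a = slope_z(np.column_stack([ones, xo]), mo, 1)      # X -> M path
        z_b = slope_z(np.column_stack([ones, xo, mo]), yo, 2)  # M -> Y path
        if abs(z_a) > crit and abs(z_b) > crit:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    # Hypothetical conditions: small vs. large paths, two sample sizes.
    for n in (200, 1000):
        for path in (0.14, 0.59):
            power = simulate_power(n, path, path, missing_rate=0.5, reps=200)
            print(f"N={n}, a=b={path}: estimated power = {power:.3f}")
```

Varying `n`, `missing_rate`, and the path values over a grid gives a first impression of which designs are adequately powered before committing to data collection; a full planning study would replace the complete-case analysis with the intended maximum likelihood model.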